James L McClelland
Stanford University | SU · Department of Psychology
PhD, Cognitive Psychology, University of Pennsylvania
About
376 publications
197,764 reads
95,478 citations
Introduction
I study perception, memory, and learning, with a current focus on learning to use mathematical concepts effectively. I view the representations we construct and use as emergent consequences of interactions among simple processing units. We model these processes using a variety of tools, including artificial neural networks and mathematical models of reduced descriptions of the activity of populations of units. Visit my papers page: http://psych.stanford.edu/~jlm/papers
Additional affiliations
January 2011 - present
January 2011 - present
January 2010 - present
Publications (376)
Abstract reasoning is a key ability for an intelligent system. Large language models (LMs) achieve above-chance performance on abstract reasoning tasks but exhibit many imperfections. However, human abstract reasoning is also imperfect. Human reasoning is affected by our real-world knowledge and beliefs, and shows notable “content effects”; humans reason mo...
The Cambridge Handbook of Computational Cognitive Sciences is a comprehensive reference for this rapidly developing and highly interdisciplinary field. Written with both newcomers and experts in mind, it provides an accessible introduction of paradigms, methodologies, approaches, and models, with ample detail and illustrated by examples. It should...
Abstract reasoning is a key ability for an intelligent system. Large language models achieve above-chance performance on abstract reasoning tasks, but exhibit many imperfections. However, human abstract reasoning is also imperfect, and depends on our knowledge and beliefs about the content of the reasoning problem. For example, humans reason much more relia...
Large language models can perform new tasks by adapting to a few in-context examples. For humans, rapid learning from examples can benefit from explanations that connect examples to task principles. We therefore investigate whether explanations of few-shot examples can allow language models to adapt more effectively. We annotate a set of 40 challen...
Explanations play a considerable role in human learning, especially in areas that remain major challenges for AI -- forming abstractions, and learning about the relational and causal structure of the world. Here, we explore whether reinforcement learning agents might likewise benefit from explanations. We outline a family of relational tasks that i...
Significance
An intelligent system should be able to adapt to a novel task without any data and achieve at least moderate success. Humans can often do so, while models often require immense datasets to reach human-level performance. We propose a general computational framework by which models can adapt to new tasks based only on their relationship...
An important aspect of intelligence is the ability to adapt to a novel task without any direct experience (zero-shot), based on its relationship to previous tasks. Humans can exhibit this cognitive flexibility. By contrast, deep-learning models that achieve superhuman performance in specific tasks generally fail to adapt to even slight task alterat...
According to complementary learning systems theory, integrating new memories into the neocortex of the brain without interfering with what is already known depends on a gradual learning process, interleaving new items with previously learned items. However, empirical studies show that information consistent with prior knowledge can sometimes be int...
After learning a concept, humans are also able to continually generalize their learned concepts to new domains by observing only a few labeled instances without any interference with the past learned knowledge. In contrast, learning concepts efficiently in a continual learning setting remains an open challenge for current Artificial Intelligence al...
We present new evidence about illusory conjunctions (ICs) suggesting that their current explanation requires revision. According to Feature Integration Theory (FIT; Treisman & Gelade Cognitive Psychology, 12, 97–136, 1980), focal attention to a single stimulus is required to bind its features into an integrated percept. FIT predicts that if attenti...
Humans are sensitive to the properties of individual items, and exemplar models are useful for capturing this sensitivity. I am a proponent of an extension of exemplar-based architectures that I briefly describe. However, exemplar models are very shallow architectures in which it is necessary to stipulate a set of primitive elements that make up ea...
According to complementary learning systems theory, integrating new memories into the neocortex of the brain without interfering with what is already known depends on a gradual learning process, interleaving new items with previously learned items. However, empirical studies show that information consistent with prior knowledge can be integrated ve...
We argue that natural language can be usefully described as quasi-compositional and we suggest that deep learning-based neural language models bear long-term promise to capture how language conveys meaning. We also note that a successful account of human language processing should explain both the outcome of the comprehension process and the contin...
Prominent theories of value-based decision making have assumed that choices are made via the maximization of some objective function (e.g. expected value) and that the process of decision making is serial and unfolds across modular sub-processes (e.g. perception, valuation, and action selection). However, the influence of a large number of contextu...
How can deep learning systems flexibly reuse their knowledge? Toward this goal, we propose a new class of challenges, and a class of architectures that can solve them. The challenges are meta-mappings, which involve systematically transforming task behaviors to adapt to new tasks zero-shot. We suggest that the key to achieving these challenges is r...
Significance
Over the course of development, humans learn myriad facts about items in the world, and naturally group these items into useful categories and structures. This semantic knowledge is essential for diverse behaviors and inferences in adulthood. How is this richly structured semantic knowledge acquired, organized, deployed, and represente...
An extensive body of empirical research has revealed remarkable regularities in the acquisition, organization, deployment, and neural representation of human semantic knowledge, thereby raising a fundamental conceptual question: what are the theoretical principles governing the ability of neural networks to acquire, organize, and deploy abstract kn...
The N400 component of the event-related brain potential has aroused much interest because it is thought to provide an online measure of meaning processing in the brain. However, the underlying process remains incompletely understood and actively debated. Here we present a computationally explicit account of this process and the emerging representat...
The N400 component of the event-related brain potential has aroused much interest because it is thought to provide an online measure of meaning processing in the brain. Yet, the underlying process remains incompletely understood and actively debated. Here, we present a computationally explicit account of this process and the emerging representation...
Semantic cognition requires conceptual representations shaped by verbal and nonverbal experience and executive control processes that regulate activation of knowledge to meet current situational demands. A complete model must also account for the representation of concrete and abstract words, of taxonomic and associative relationships, and for the...
Mapping numbers onto space is foundational to mathematical cognition. These cognitive operations are often conceptualized in the context of a “mental number line” and involve multiple brain regions in or near the intraparietal sulcus (IPS) that have been implicated both in numeral and spatial cognition. Here we examine possible differentiation of f...
Previous research has found that different presentations of the same concept can result in different patterns of transfer to isomorphic instances of that concept. Much of this work has framed these effects in terms of advantages and disadvantages of concreteness or abstractness. We note that mathematics is a richly structured field, with deeply int...
We agree with the authors that putting forward specific models and examining their agreement with experimental data are the best approach for understanding the nature of decision making. Although the authors only consider the likelihood function, prior, cost function, and decision rule (LPCD) framework, other choices are available. Bayesian statist...
Lake et al. propose that people rely on “start-up software,” “causal models,” and “intuitive theories” built using compositional representations to learn new tasks more efficiently than some deep neural network models. We highlight the many drawbacks of a commitment to compositional representations and describe our continuing effort to explore how...
Standard deep learning systems require thousands or millions of examples to learn a concept, and cannot integrate new concepts easily. By contrast, humans have an incredible ability to do one-shot or few-shot learning. For instance, from just hearing a word used in a sentence, humans can infer a great deal about it, by leveraging what the syntax an...
We combine extant theories of evidence accumulation and multi-modal integration to develop an integrated framework for modeling multimodal integration as a process that unfolds in real time. Many studies have formulated sensory processing as a dynamic process where noisy samples of evidence are accumulated until a decision is made. However, these s...
How best can we understand – and visualize – the structure in multi-dimensional data? One common approach is to rely on hierarchical cluster analysis, either for theoretical or for more descriptive reasons. Here, we point out that an apparently revealing hierarchical clustering solution may well be compatible with structure that is not well charact...
The N400 component of the event-related brain potential is widely used in research on language and semantic memory, but the cognitive functions underlying N400 amplitudes are still unclear and actively debated. Recent simulations with a neural network model of word meaning suggest that N400 amplitudes might reflect implicit semantic prediction erro...
We update complementary learning systems (CLS) theory, which holds that intelligent agents must possess two learning systems, instantiated in mammalians in neocortex and hippocampus. The first gradually acquires structured knowledge representations while the second quickly learns the specifics of individual experiences. We broaden the role of repla...
Cognitive neuroscience explores the neural basis of cognition, including perception, attention, language understanding, memory, problem solving, and decision-making. The field draws on findings on how neurons process and represent information, and on ideas about how learning may occur through the modification of properties of neurons and their conn...
This article details a correction to the article: Steingroever, H. et al., (2015). Data from 617 Healthy Participants Performing the Iowa Gambling Task: A “Many Labs” Collaboration. Journal of Open Psychology Data. 3:e5. DOI: http://doi.org/10.5334/jopd.ak
We used electroencephalography (EEG) and behavior to examine the role of payoff bias in a difficult two-alternative perceptual decision under deadline pressure in humans. The findings suggest that a fast guess process, biased by payoff and triggered by stimulus onset, occurred on a subset of trials and raced with an evidence accumulati...
The resilient properties of the language-like gestural systems created by deaf children of hearing parents and the gestures of normal children that occur during language acquisition are interesting and require explanation. This commentary offers the suggestion that these gestures reflect robust properties, not so much of language, but of thought an...
This data pool (N = 617) comes from 10 studies assessing performance of healthy participants (i.e., no known neurological impairments) on the Iowa gambling task (IGT)—a task measuring decision making under uncertainty in an experimental context. Participants completed a computerized version of the IGT consisting of 95 – 150 trials. The data consist...
The field of formal linguistics was founded on the premise that language is mentally represented as a deterministic symbolic grammar. While this approach has captured many important characteristics of the world's languages, it has also led to a tendency to focus theoretical questions on the correct formalization of grammatical rules while also de-e...
McClelland will describe computational modeling research indicating how experience with both formulas and concrete problems can shape perception and intuition about mathematical expressions and their meaning. The models rely on intrinsically gradual learning in neural networks, explaining why acquisition of mathematical intuition can be slow and ad...
One vision of the nature of language holds that a language consists of a set of symbolic unit types, and a set of units of each type, together with a set of grammatical principles that constrain how these units can be used to compose other units, and a system of rules that project structured arrangements of such units onto other structured arrangem...
Recent advancements in Bayesian modeling have allowed for likelihood-free posterior estimation. Such estimation techniques are crucial to the understanding of simulation-based models, whose likelihood functions may be difficult or even impossible to derive. One particular class of simulation-based models that have not yet benefited from the progres...
Connectionism is a computational modeling framework inspired by the principles of information processing that characterize biological neural systems, which rely on collections of simple processing units linked together into networks. These units communicate in parallel via connections of varying strength that can be modified by experience. Connecti...
An influential position in lexical semantics holds that semantic representations for words can be derived through analysis of patterns of lexical co-occurrence in large language corpora. Firth (1957) famously summarised this principle as "you shall know a word by the company it keeps". We explored whether the same principle could be applied to non-...
This paper introduces a special issue of Cognitive Science initiated on the 25th anniversary of the publication of Parallel Distributed Processing (PDP), a two-volume work that introduced the use of neural network models as vehicles for understanding cognition. The collection surveys the core commitments of the PDP framework, the key issues the fra...
In a seminal 1977 article, Rumelhart argued that perception required the simultaneous use of multiple sources of information, allowing perceivers to optimally interpret sensory information at many levels of representation in real time as information arrives. Building on Rumelhart's arguments, we present the Interactive Activation hypothesis—the ide...
Despite the widespread practical success of deep learning methods, our theoretical understanding of the dynamics of learning in deep neural networks remains quite sparse. We attempt to bridge the gap between the theory and practice of deep learning by systematically analyzing learning dynamics for the restricted case of deep linear neural networks....
We present a PDP model of binary choice verbal analogy problems (A:B as C:[D1|D2], where D1 and D2 represent choice alternatives). We train a recurrent neural network in item-relation-item triples and use this network to test performance on analogy questions. Without training on analogy problems per se, the model explains the developmental shift fr...
The complementary learning systems theory of the roles of hippocampus and neocortex (McClelland, McNaughton, & O'Reilly, 1995) holds that the rapid integration of arbitrary new information into neocortical structures is avoided to prevent catastrophic interference with structured knowledge representations stored in synaptic connections among neocor...
This article seeks to establish a rapprochement between explicitly Bayesian models of contextual effects in perception and neural network models of such effects, particularly the connectionist interactive activation (IA) model of perception. The article is in part an historical review and in part a tutorial, reviewing the probabilistic Bayesian app...
Examination of certain illusory conjunction (IC) errors may provide insight into the mechanisms of object recognition when multiple stimuli are attended. An IC error occurs when a subject reports a stimulus that was not present but that combines features of target and distractor stimuli. While ICs between nearby stimuli have been frequently studied...
Human and animal lesion studies have shown that behavior can be catastrophically impaired after bilateral lesions but that unilateral damage often produces little or no effect, even controlling for lesion extent. This pattern is found across many different sensory, motor, and memory domains. Despite these findings, there has been no systematic, com...
Differentiation models of recognition memory predict a strength-based mirror effect in the distributions of subjective memory strength. Subjective memory strength should increase for targets and simultaneously decrease for foils following a strongly encoded list compared with a weakly encoded list. An alternative explanation for the strength-based...
Topics include: local and distributed representation in connectionist networks; a distributed auto-associator model of memory; connectionist vs symbolic models of semantic memory (training the network with back propagation, cognitive and developmental implications); catastrophic interference and complementary systems in memory; and current directio...
Robert Duncan Luce, a mathematician who sought to provide axiomatic formulations for the social sciences, died on 11 August 2012 in Irvine, California, at the age of 87. His passing was marked by an outpouring of sadness at the loss of a revered colleague and expressions of veneration for his many substantive, institutional, and personal contributi...
An illusory conjunction (IC) can be defined as a perceptual error in which a subject reports a stimulus that did not appear but that combines features of the stimuli that were present. Pelli, Palomares, & Majaj (2004) noted that many IC studies use stimuli whose target-flanker proximity falls within the critical spacing for crowding. For example, P...
In this article, we present a perspective on the role of the hippocampal system in generalization, instantiated in a computational model called REMERGE (recurrency and episodic memory results in generalization). We expose a fundamental, but neglected, tension between prevailing computational theories that emphasize the function of the hippocampus i...
When people make decisions, do they give equal weight to evidence arriving at different times? A recent study (Kiani et al., 2008) using brief motion pulses (superimposed on a random moving dot display) reported a primacy effect: pulses presented early in a motion observation period had a stronger impact than pulses presented later. This observatio...
Though improvement is evident, in no case is final performance native English-like. We focused our training on the third formant onset frequency, shown to be the most reliable indicator of /r-l/ category membership. We first presented listeners with instances of synthetic /r-l/ stimuli varying only in F3 onset frequency, in a forced-choice identifi...
Many attempts have been made to teach native Japanese listeners to perceptually differentiate English /r-l/ (e.g., rock-lock). Though improvement is evident, in no case is final performance native English-like. We focused our training on the third formant onset frequency, shown to be the most reliable indicator of /r-l/ category membership. We first pre...
How do humans learn contingencies between events? Both pathway-strengthening and inference-based process models have been proposed to explain contingency learning. We propose that each of these processes is used in different conditions. Participants viewed displays that contained single or paired objects and learned which displays were usually foll...
A recent article shows that a change in a single parameter in a neural-network model of brain dynamics leads to repetitive behaviors that resist termination and towards which the network tends. These findings may have implications for obsessive-compulsive disorder and are consistent with evidence of glutamatergic hyperactivity in this disorder.
Action selection is the task of doing the right thing at the right time. It requires the assessment of available alternatives, executing those most appropriate, and resolving conflicts among competing goals and possibilities. Using advanced computational modelling, this book explores cutting-edge research into action selection in nature from a wide...
This study tested the predictions of the Speech Learning Model (SLM, Flege, 1988) on the case of native Japanese (NJ) speakers' perception and production of English /ɹ/ and /l/. NJ speakers' degree of foreign accent, intelligibility of /ɹ-l/ productions, and ability to perceive natural speech /ɹ-l/ were assessed as a function of length of reside...
Many people have the subjective sense of being able to see more than one object at a time. However, given the large receptive fields of neurons in the later stages of the ventral visual pathway, it is unclear how two similar objects could be perceived without interfering with each other. It has been proposed that the concurrent perception of multip...
Book Synopsis:
The book combines introductory chapters, detailed case studies, and commentaries from leading scholars in the field.
There is a strong focus on the processes and mechanisms underlying developmental change. Thus, the book should be relevant to a broad readership in developmental science.
Commentaries are included from researchers...
Interest in the nature of conceptual knowledge extends back at least to the ancient Greek philosophers. In recent years, there has been a wide range of different approaches to understanding the nature of conceptual knowledge, its development, and its neural basis. In most other work, however, these issues are not all treated together. Instead, work...
Illusory conjunctions in normal and simultanagnosic subjects are two instances where the visual features of multiple objects are incorrectly ‘bound’ together. A connectionist model explores how multiple objects could be perceived correctly in normal subjects given sufficient time, but could give rise to illusory conjunctions with damage or time pre...
Recent research has investigated the process of integrating perceptual evidence toward a decision, converging on a number of sequential sampling choice models, such as variants of race and diffusion models and the non-linear leaky competing accumulator (LCA) model. Here we study extensions of these models to multi-alternative choice, considering ho...
Here we consider how reward might influence choice behavior in the leak-dominant regime of the leaky competing accumulator model, examining the same three hypotheses considered in the main text for the inhibition-dominant regime. Although the data from the reported experiment are treated as arising within the inhibition-dominant regime, we include...
Here, we present the Iso-Criterion analysis of the data. For each delay condition, we plot the stimulus sensitivity and the decision criterion variable representing the degree of reward bias individually for each of the three difficulty levels. The results are generally consistent with the hypothesis that the participants are adopting a common crit...
In the linear version of the leaky competing accumulator model, exactly the same pattern of choice behavior can be predicted in either leak- or inhibition-dominance with proper parameter values. Here, we demonstrate this result and show the relationship between the two parameter sets in the two regimes.
In perceptual decision-making, ideal decision-makers should bias their choices toward alternatives associated with larger rewards, and the extent of the bias should decrease as stimulus sensitivity increases. When responses must be made at different times after stimulus onset, stimulus sensitivity grows with time from zero to a final asymptotic lev...
The process of constructing concepts underpins our capacity to encode information in an efficient and competent manner and also, ultimately, our ability to think in terms of abstract ideas such as justice, love and happiness. But what are the mechanisms which correspond to psychological categorization processes? This book unites many prominent appr...
What is the underlying representation of lexical knowledge? How do we know whether a given string of letters is a word, whereas another string of letters is not? There are two competing models of lexical processing in the literature. The first proposes that we rely on mental lexicons. The second claims there are no mental lexicons; we identify cert...
The study of human intelligence was once dominated by symbolic approaches, but over the last 30 years an alternative approach has arisen. Symbols and processes that operate on them are often seen today as approximate characterizations of the emergent consequences of sub- or non-symbolic processes, and a wide range of constructs in cognitive scienc...
This chapter offers a scientific theory of the nature of human memory that fits naturally with the view of memory as a constructive process. This theory, the complementary learning systems theory, is grounded in a broad framework for understanding human cognitive processes called the parallel distributed processing (PDP) framework, a framework the...
Connectionist and dynamical systems approaches explain human thought, language and behavior in terms of the emergent consequences of a large number of simple noncognitive processes. We view the entities that serve as the basis for structured probabilistic approaches as abstractions that are occasionally useful but often misleading: they have no rea...
Single neurons in cortical area LIP are known to carry information relevant to both sensory and value-based decisions that are reported by eye movements. It is not known, however, how sensory and value information are combined in LIP when individual decisions must be based on a combination of these variables. To investigate this issue, we conducted...