Book

Semantic Cognition: A Parallel Distributed Processing Approach

Authors: Timothy T. Rogers and James L. McClelland

Abstract

This groundbreaking monograph offers a mechanistic theory of the representation and use of semantic knowledge, integrating the strengths and overcoming many of the weaknesses of hierarchical, categorization-based approaches, similarity-based approaches, and the approach often called "theory theory." Building on earlier models by Geoffrey Hinton in the 1980s and David Rumelhart in the early 1990s, the authors propose that performance in semantic tasks arises through the propagation of graded signals in a system of interconnected processing units. The representations used in performing these tasks are patterns of activation across units, governed by weighted connections among them. Semantic knowledge is acquired through the gradual adjustment of the strengths of these connections in the course of day-to-day experience. The authors show how a simple computational model proposed by Rumelhart exhibits a progressive differentiation of conceptual knowledge, paralleling aspects of cognitive development seen in the work of Frank Keil and Jean Mandler. The authors extend the model to address aspects of conceptual knowledge acquisition in infancy, disintegration of conceptual knowledge in dementia, "basic-level" effects and their interaction with expertise, and many findings introduced to support the idea that semantic cognition is guided by naive, domain-specific theories.
... An alternative approach is to learn implicit rather than explicit structural organization. Rogers and McClelland (2004) studied a neural network that learns to map animals (like canary) and relations (like can) to output attributes (what a canary can do: grow, move, fly, and sing). Like the structural forms approach, the neural network learns aggregate statistical structure from the observations. ...
... Like the structural forms approach, the neural network learns aggregate statistical structure from the observations. Rogers and McClelland (2004) analyzed their network's learned representation through dimensionality reduction, projecting each living thing into a low-dimensional representational space. ...
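The model described in these excerpts is small enough to reproduce in miniature. Below is an illustrative Python/NumPy sketch, with an invented micro-corpus of items, relations, and attributes standing in for the actual Rogers and McClelland (2004) training set: a one-hidden-layer network learns item + relation to attribute mappings, and the learned item representations are then projected into a low-dimensional space via PCA, in the spirit of the dimensionality-reduction analysis mentioned above.

```python
import numpy as np

# Illustrative micro-corpus (invented; not the model's actual training set).
rng = np.random.default_rng(0)
items = ["canary", "robin", "salmon", "sunfish", "oak", "rose"]
relations = ["isa", "can", "has"]
attrs = ["living", "animal", "plant", "bird", "fish", "tree", "flower",
         "grow", "move", "fly", "swim", "sing", "wings", "gills",
         "roots", "petals"]
facts = {
    ("canary", "isa"): ["living", "animal", "bird"],
    ("robin", "isa"): ["living", "animal", "bird"],
    ("salmon", "isa"): ["living", "animal", "fish"],
    ("sunfish", "isa"): ["living", "animal", "fish"],
    ("oak", "isa"): ["living", "plant", "tree"],
    ("rose", "isa"): ["living", "plant", "flower"],
    ("canary", "can"): ["grow", "move", "fly", "sing"],
    ("robin", "can"): ["grow", "move", "fly"],
    ("salmon", "can"): ["grow", "move", "swim"],
    ("sunfish", "can"): ["grow", "move", "swim"],
    ("oak", "can"): ["grow"],
    ("rose", "can"): ["grow"],
    ("canary", "has"): ["wings"],
    ("robin", "has"): ["wings"],
    ("salmon", "has"): ["gills"],
    ("sunfish", "has"): ["gills"],
    ("oak", "has"): ["roots"],
    ("rose", "has"): ["roots", "petals"],
}

def one_hot(i, n):
    v = np.zeros(n)
    v[i] = 1.0
    return v

# Each training example: one-hot item + one-hot relation -> attribute targets.
X = np.array([np.concatenate([one_hot(items.index(it), len(items)),
                              one_hot(relations.index(r), len(relations))])
              for it, r in facts])
Y = np.zeros((len(facts), len(attrs)))
for row, (_, out) in enumerate(facts.items()):
    for a in out:
        Y[row, attrs.index(a)] = 1.0

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
W1 = rng.normal(0, 0.1, (X.shape[1], 8))   # input -> hidden representation
W2 = rng.normal(0, 0.1, (8, len(attrs)))   # hidden -> attribute outputs

for _ in range(8000):                      # plain gradient descent, squared error
    h = sigmoid(X @ W1)
    p = sigmoid(h @ W2)
    dp = (p - Y) * p * (1 - p)
    dh = (dp @ W2.T) * h * (1 - h)
    W2 -= 0.05 * h.T @ dp
    W1 -= 0.05 * X.T @ dh

# Item representation = hidden pattern averaged over relations; project via PCA.
reps = np.array([np.mean([sigmoid(np.concatenate(
           [one_hot(i, len(items)), one_hot(r, len(relations))]) @ W1)
           for r in range(len(relations))], axis=0)
       for i in range(len(items))])
reps -= reps.mean(axis=0)
_, _, Vt = np.linalg.svd(reps, full_matrices=False)
for name, (a, b) in zip(items, reps @ Vt[:2].T):
    print(f"{name:8s} PC1={a:+.2f} PC2={b:+.2f}")
```

Running this, the plants typically separate from the animals along the first principal component, with birds and fish splitting later dimensions, a miniature of the implicit tree structure the excerpts describe.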
... The structural forms model has been criticized by the emergent camp for lacking the necessary flexibility for many real domains, which often stray from pristine forms. The importance of flexibility has motivated emergent alternatives, such as a connectionist network that maps animals and relations on the input side to attributes on the output side (Rogers & McClelland, 2004). As this model learns, an implicit tree structure emerges in its distributed representations. ...
Preprint
Both scientists and children make important structural discoveries, yet their computational underpinnings are not well understood. Structure discovery has previously been formalized as probabilistic inference about the right structural form --- where form could be a tree, ring, chain, grid, etc. [Kemp & Tenenbaum (2008). The discovery of structural form. PNAS, 105(31), 10687-10692]. While this approach can learn intuitive organizations, including a tree for animals and a ring for the color circle, it assumes a strong inductive bias that considers only these particular forms, and each form is explicitly provided as initial knowledge. Here we introduce a new computational model of how organizing structure can be discovered, utilizing a broad hypothesis space with a preference for sparse connectivity. Because the inductive bias is more general, the model's initial knowledge shows little qualitative resemblance to some of the discoveries it supports. As a consequence, the model can also learn complex structures for domains that lack an intuitive description, as well as predict human property-induction judgments without explicit structural forms. By allowing form to emerge from sparsity, our approach clarifies how both the richness and flexibility of human conceptual organization can coexist.
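The abstract describes the model only at a high level, so the sketch below is not its implementation; it merely illustrates the underlying intuition that a generic preference for sparse connectivity, rather than a menu of explicit forms, can recover organizing structure. It uses scikit-learn's graphical lasso (an l1-penalized precision-matrix estimate, a stand-in technique) on synthetic data generated from a hidden chain.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Synthetic features with hidden chain structure: entity k's features are
# entity k-1's plus noise, so only neighboring entities are directly coupled.
rng = np.random.default_rng(0)
n_feats, n_entities = 500, 6
F = np.zeros((n_feats, n_entities))
F[:, 0] = rng.normal(size=n_feats)
for k in range(1, n_entities):
    F[:, k] = F[:, k - 1] + 0.5 * rng.normal(size=n_feats)

# l1-penalized inverse covariance: the sparsity preference prunes all but the
# direct (neighbor) dependencies, so chain structure emerges from the data.
model = GraphicalLasso(alpha=0.2).fit(F)
adjacency = (np.abs(model.precision_) > 1e-2).astype(int)
np.fill_diagonal(adjacency, 0)
print(adjacency)   # nonzeros hug the diagonal: a chain, never named in advance
```

The key contrast with the structural forms approach is visible here: nothing in the sparsity penalty mentions "chain", yet the recovered connectivity is one.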
... Structure in the data distribution has long been recognized as central to the development of internal structure in artificial and biological neural networks (Rumelhart et al., 1986;Olshausen & Field, 1996;Rogers & McClelland, 2004). Recent observations have renewed interest in this topic: language models progress through distinct stages of development during training, acquiring increasingly sophisticated linguistic and reasoning abilities in ways that seem to reflect the structure of the data distribution (Olsson et al., 2022;Chen et al., 2024;Belrose et al., 2024;Tigges et al., 2024;Edelman et al., 2024;Hoogland et al., 2024). ...
... Data distributional structure. It is clear that structure in the data distribution plays a significant role in the kinds of structures learned in neural networks and how they are learned (Rumelhart et al., 1986;Olshausen & Field, 1996;Rogers & McClelland, 2004). For instance, properties of the data distribution have been linked to the emergence of in-context learning by Chan et al. (2022b), and Belrose et al. (2024) note that networks learn lower-order moments before higher-order ones. ...
... In this paper: the learning coefficient (or real log canonical threshold, as it is known in geometry). • Learning process structure: developmental stages, critical periods, and the sequence in which different capabilities or internal structures emerge (Rogers & McClelland, 2004). In this paper: the overall developmental stages of Hoogland et al. (2024) and the staggered development of individual attention heads. ...
Preprint
Full-text available
We introduce refined variants of the Local Learning Coefficient (LLC), a measure of model complexity grounded in singular learning theory, to study the development of internal structure in transformer language models during training. By applying these refined LLCs (rLLCs) to individual components of a two-layer attention-only transformer, we gain novel insights into the progressive differentiation and specialization of attention heads. Our methodology reveals how attention heads differentiate into distinct functional roles over the course of training, analyzes the types of data these heads specialize to process, and discovers a previously unidentified multigram circuit. These findings demonstrate that rLLCs provide a principled, quantitative toolkit for developmental interpretability, which aims to understand models through their evolution across the learning process. More broadly, this work takes a step towards establishing the correspondence between data distributional structure, geometric properties of the loss landscape, learning dynamics, and emergent computational structures in neural networks.
... A key finding is that directions in the network function are learned in order of importance [9,10]. This phenomenon, known as progressive differentiation, connects modern deep learning theory both to human child development and to the earliest connectionist models of semantic cognition [11,12]. ...
... Despite their linearity, these models display complex non-linear learning dynamics that reflect behaviours seen in non-linear models [10]. Moreover, learning dynamics in such simple models have been argued to qualitatively resemble phenomena observed in the cognitive development of humans [10,11]. ...
... Setup. The hierarchical learning task has previously been used extensively in the study of semantic cognition [11] and requires learners to develop a hierarchical one-to-many mapping as seen in Fig. 1B. We adapted the task for human learners while maintaining the underlying structure: input stimuli were represented as different classes of planets, and output labels as a set of plant images (see Fig. 1E and Fig. 8). ...
Preprint
Deep neural networks learn increasingly complex functions over the course of training. Here, we show both empirically and theoretically that learning of the target function is preceded by an early phase in which networks learn the optimal constant solution (OCS) - that is, initial model responses mirror the distribution of target labels, while entirely ignoring information provided in the input. Using a hierarchical category learning task, we derive exact solutions for learning dynamics in deep linear networks trained with bias terms. Even when initialized to zero, this simple architectural feature induces substantial changes in early dynamics. We identify hallmarks of this early OCS phase and illustrate how these signatures are observed in deep linear networks and in larger, more complex (and nonlinear) convolutional neural networks solving a hierarchical learning task based on MNIST and CIFAR10. We explain these observations by proving that deep linear networks necessarily learn the OCS during early learning. To further probe the generality of our results, we train human learners over the course of three days on the category learning task. We then identify qualitative signatures of this early OCS phase in terms of the dynamics of true negative (correct-rejection) rates. Surprisingly, we find the same early reliance on the OCS in the behaviour of human learners. Finally, we show that learning of the OCS can emerge even in the absence of bias terms and is equivalently driven by generic correlations in the input data. Overall, our work suggests the OCS as a universal learning principle in supervised, error-corrective learning, and identifies mechanistic reasons for its prevalence.
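A minimal sketch of the OCS phenomenon follows, under assumptions chosen purely for illustration (random Gaussian inputs, imbalanced binary targets, a two-layer linear network with an output bias, plain gradient descent; not the paper's exact hierarchical task or analytical setup): early in training the predictions collapse onto the label mean, the optimal constant solution, before the input-dependent mapping is learned.

```python
import numpy as np

# Illustrative setup (not the paper's exact task): random inputs, imbalanced
# binary targets, two-layer linear network with an output bias, trained by GD.
rng = np.random.default_rng(0)
n, d, k = 256, 16, 8
X = rng.normal(size=(n, d))
Y = (X @ rng.normal(size=(d, k)) > 0.5).astype(float)   # label means != 0.5
ocs = Y.mean(axis=0)                                    # optimal constant solution

W1 = rng.normal(0, 1e-2, (d, 32))
W2 = rng.normal(0, 1e-2, (32, k))
b = np.zeros(k)

for step in range(4001):
    P = X @ W1 @ W2 + b
    G = (P - Y) / n                      # gradient of (mean squared error) / 2
    gW2 = (X @ W1).T @ G
    gW1 = X.T @ (G @ W2.T)
    gb = G.sum(axis=0)
    W1 -= 0.2 * gW1
    W2 -= 0.2 * gW2
    b -= 0.2 * gb
    if step % 500 == 0:
        print(f"step {step:4d}  dist-to-OCS {np.mean((P - ocs) ** 2):.4f}  "
              f"dist-to-target {np.mean((P - Y) ** 2):.4f}")
# Early on, dist-to-OCS collapses (the bias soaks up the label mean) while
# dist-to-target barely moves; only later does the input-dependent map appear.
```

The separation of timescales is the point: with small weight initialization, the bias gradient is large and input-independent, so the constant solution is learned first.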
... From an early age, we learn to navigate our surroundings, identify objects, and anticipate their behaviors 12,45. With experience and learning, we understand physical reality through abstract and conceptual thinking, that is, tool-related semantic knowledge [46][47][48]. Moving more ventrally from the parietal and frontal regions described above, several studies have indicated how tool-related semantic knowledge involves extensive regions of the temporal cortex, mainly the left middle and inferior temporal lobe 17,18,32,[49][50][51]. ...
... At the first level of analysis, namely, irrespective of the experimental condition, semantically consistent object-tool pairs co-activated regions localized in the left inferior- and middle-temporal regions (i.e., left aITG, pITG, pMTG, toMTG), while semantically inconsistent pairs co-activated left frontal regions (i.e., IFG) in addition to left medial and inferior temporal areas (i.e., aITG, pITG, aMTG, pMTG, toMTG). Such a temporal-centered neural circuitry seems to reflect the predictions of the most recent models of semantic cognition, which highlight how the temporal lobe may serve as a hub for the cross-modal representation of semantic knowledge by bringing together various sources of information 46,48. Notably, the encoding of stimuli in the temporal cortex is not consistent across its different subregions. ...
... Globally, the overlap in the left temporal and frontal regions we reported is particularly interesting when viewed through the lens of the so-called Hub-and-Spoke hypothesis of semantic knowledge 46,48. This hypothesis attempts to bridge the gap between modal and amodal theories of conceptual knowledge, proposing how semantic knowledge may be formed by ...
Article
Full-text available
Tool-use skills represent a significant cognitive leap in human evolution, playing a crucial role in the emergence of complex technologies. Yet, the neural mechanisms underlying such capabilities are still debated. Here we explore with fMRI the functional brain networks involved in tool-related action understanding. Participants viewed images depicting action-consistent (e.g., nail-hammer) and action-inconsistent (e.g., scarf-hammer) object-tool pairs, under three conditions: semantic (recognizing the tools previously seen in the pairs), mechanical (assessing the usability of the pairs), and control (looking at the pairs without explicit tasks). During the observation of the pairs, task-based left-brain functional connectivity differed within conditions. Compared to the control, both the semantic and mechanical conditions exhibited co-activations in dorsal (precuneus) and ventro-dorsal (inferior frontal gyrus) regions. However, the semantic condition recruited medial and posterior temporal areas, whereas the mechanical condition engaged inferior parietal and posterior temporal regions. Also, when distinguishing action-consistent from action-inconsistent pairs, an extensive frontotemporal neural circuit was activated. These findings support recent accounts that view tool-related action understanding as the combined product of semantic and mechanical knowledge. Furthermore, they emphasize how the left inferior parietal and anterior temporal lobes might be considered as hubs for the cross-modal integration of physical and conceptual knowledge, respectively.
... The more frequently semantic items are processed, the stronger the connections become, meaning that less activity is required in the network to obtain the information (Lambon Ralph et al., 2016; Rogers et al., 2004; Rogers and McClelland, 2004). Therefore, a more strongly interconnected ATL could also explain the decreased spontaneous brain activity in the right sATL in anxiety disorders (Wang et al., 2022). ...
Preprint
Full-text available
Maladaptive forms of guilt, such as excessive self-blame, are common characteristics of anxiety disorders. The associated network includes the superior anterior temporal lobe (sATL), underlying the conceptual representations of social meaning, and fronto-subcortical areas involved in the affective dimension of guilt. Nevertheless, despite understanding the anatomy of the guilt-processing circuitry, network-level changes related to subclinical anxiety and self-blaming behaviour have not been depicted. To fill this gap, we used graph theory analyses on a resting-state functional and diffusion-weighted magnetic resonance imaging dataset of 78 healthy adults. Within the guilt network, we found increased functional contributions (higher clustering coefficient, local efficiency and strength) of the left sATL for individuals with higher self-blame and trait anxiety, while functional isolation (lower clustering coefficient and local efficiency) of the left pars opercularis and insula was related to higher trait anxiety. Trait anxiety was also linked to the structural network’s global parameters (mean clustering coefficient), with the circuitry’s architecture favouring increased local information processing in individuals with increased anxiety levels. Previous research suggests that aberrant interactions between conceptual (sATL) and affective (fronto-limbic) regions underlie maladaptive guilt, and the current results align with and expand on this theory by detailing network changes associated with self-blame and trait anxiety.
... Standard training objectives do not explicitly constrain the global structure of representations; nevertheless, these objectives yield representations that capture some aspects of the higher-order category structure [e.g., 40] and neural unit activity [e.g., 92,80] of human and animal representations of the same images. Some models progressively differentiate hierarchical structure over the course of learning [5] in a similar way to how humans learn semantic features [73,18,78,5,79]. Even so, learned representations still fail to capture important aspects of the structure that humans learn [6]. ...
... Furthermore, food and drink form one of the few pairs of superordinate categories between which distances actually decrease after the transform, presumably reflecting the strong semantic ties between these categories. Similarly, animals move less far from plants than from any other category, perhaps reflecting the fact that the animate/inanimate distinction is one of the strongest features in human semantic representations [73]. ...
... The ATL has been described as a hub region that interconnects modality-specific regions to obtain semantic (including social) information. The more frequently semantic items are processed, the stronger the connections become, meaning that less activity is required in the network to obtain the information (Lambon Ralph et al. 2016; Rogers and McClelland 2004). As such, our findings align with the idea that the sATL 'communicates' more effectively with other semantic-related regions in individuals with higher self-blame. ...
Article
Maladaptive forms of guilt, such as excessive self-blame, are common characteristics of anxiety and depressive disorders. The underlying network consists of multiple associative areas, including the superior anterior temporal lobe (sATL), underlying the conceptual representations of social meaning, and fronto-subcortical areas involved in the affective dimension of guilt. Nevertheless, despite understanding the circuitry’s anatomy, network-level changes related to subclinical anxiety and self-blaming behaviour have not been depicted. To fill this gap, we used graph theory analyses on a resting-state functional and diffusion-weighted magnetic resonance imaging dataset of 78 healthy adults (20 females, 20–35 years old). Within the guilt network, we found increased functional contributions of the left sATL for individuals with higher self-blame, while functional isolation of the left pars opercularis and insula was related to higher trait anxiety. Trait anxiety was also linked to the structural network’s mean clustering coefficient, with the circuitry’s architecture favouring increased local information processing in individuals with increased anxiety levels, although only when a highly specific subset of connections was considered. Previous research suggests that aberrant interactions between conceptual (sATL) and affective (fronto-limbic) regions underlie maladaptive guilt, and the current results align with and expand on this theory by detailing network changes associated with self-blame and trait anxiety.
... As co-activated neurons were present in all network areas, the resultant CAs were distributed across network areas. Our results contrast with earlier simulations using layered networks, which arguably do not build distinct and discrete sets of neurons related to symbols, word forms, or meanings, but instead "represent" cognitive entities such as word forms or meanings as fully distributed dynamic activation patterns (Farah and McClelland 1991; Elman 1996, 2004; Rogers and McClelland 2004; Westermann et al. 2006; Ralph et al. 2017). Although distributed dynamic activation patterns can be informative to some extent, the use of more biologically well-founded models of the brain that include reciprocal topographical links between areas, excitatory within-area connections, and Hebbian correlation-based learning entails discrete circuit formation, offering a more accurate model of brain-based neural circuit formation (see Garagnani et al. 2009; Pulvermüller 2023). ...
Article
Full-text available
The ability of humans to store spoken words in verbal working memory and build extensive vocabularies is believed to stem from evolutionary changes in cortical connectivity across primate species. However, the underlying neurobiological mechanisms remain unclear. Why can humans acquire vast vocabularies, while non-human primates cannot? This study addresses this question using brain-constrained neural networks that realize between-species differences in cortical connectivity. It investigates how these structural differences support the formation of neural representations for spoken words and the emergence of verbal working memory, crucial for human vocabulary building. We develop comparative models of frontotemporal and occipital cortices, reflecting human and non-human primate neuroanatomy. Using mean-field and spiking neural networks, we simulate auditory word recognition and examine verbal working memory function. The “human models”, characterized by denser inter-area connectivity in core language areas, produced larger cell assemblies than the “monkey models”, with specific topographies reflecting semantic properties of the represented words. Crucially, longer-lasting reverberant neural activity was observed in human versus monkey architectures, compatible with robust verbal working memory, a necessary condition for vocabulary building. Our findings offer insights into the structural basis of human-specific symbol learning and verbal working memory, shedding light on humans’ unique capacity for large vocabulary acquisition.
... It is worth noting that there was nothing intrinsic to the training examples used here that distinguished objects with animate features from the objects with inanimate features. As is true of simulations like this one (e.g., Rogers & McClelland, 2004), an object's "meaning" is determined by the associative relations into which it enters rather than by something abstract about the object. In terms of the present simulation, the networks' responses at test were due to learned associations between particular visible, low-level, surface features (available to any low-level parsing routine) and particular low-level, perceptual-based, kinematic depictions of different kinds of causal action. ...
Article
Full-text available
Considerable research shows that causal perception emerges between 6 and 10 months of age. Yet, because this research tends to use artificial stimuli, it is unanswered how or through what mechanisms of change human infants learn about the causal properties of real-world categories such as animate entities and inanimate objects. One answer to this question is that this knowledge is innate (i.e., unlearned, evolutionarily ancient, and possibly present at birth) and underpinned by core knowledge and core cognition. An alternative perspective that is tested here through computer simulations is that infants acquire this knowledge via domain-general associative learning. This article demonstrates that associative learning alone—as instantiated in an artificial neural network—is sufficient to explain the data presented in four classic infancy studies: Spelke et al. (1995), Saxe et al. (2005), Saxe et al. (2007), and Markson and Spelke (2006). This work not only advances theoretical perspectives within developmental psychology but also has implications for the design of artificial intelligence systems inspired by human cognitive development.
... Indeed, models have proposed the existence of a pan-modal integrative semantic region in the brain, which draws upon input from modality-specific knowledge brain regions to form deeper, transmodal representations. To simulate damage to this region, models have removed connections, showing impairments in activating associated information for concepts (Rogers et al., 2004). This is in line with multimodal deficits observed in patients with semantic dementia. ...
Article
Full-text available
Working memory is the system that supports the temporary storage and processing of information. It is generally agreed that working memory is a mental workspace, with a combination of resources operating together to maintain information in mind for potential use in thought and action. Theories typically acknowledge contributions of long-term memory to this system. One particular aspect of long-term memory, namely semantic long-term memory, can effectively supplement or ‘boost’ working memory performance. This may be a relatively automatic process via the semantic properties of the stimuli or more active via strategy development and implementation. However, the precise mechanisms require greater theoretical understanding. In this review of the literature, we critically discuss theoretical models of working memory and their proposed links with long-term memory. We also explore empirical research that contributes to our understanding of the ways in which semantics can support performance on both verbal and visuospatial working memory tasks, with a view to potential intervention development. This includes the possibility of training people with lower performance (e.g., older adults) to use semantics during working memory tasks. We conclude that semantics may offer an opportunity to maximise working memory performance. However, to realise this potential, more research is needed, particularly in the visuospatial domain.
... The rapid development of deep learning in the past decade has inspired many extraordinary applications and achieved remarkable success in various fields, from machine vision [1], speech recognition [2], natural language processing [3], and reinforcement learning [4], to modeling animals and humans in neuroscience [5,6], psychology [7,8], and education [9]. Despite the increasing prevalence of applications employing deep learning, comprehension of the underlying mechanisms driving its exceptional performance remains limited. ...
Preprint
Full-text available
In the past decade, significant strides in deep learning have led to numerous groundbreaking applications. Despite these advancements, the understanding of the high generalizability of deep learning, especially in such an over-parametrized space, remains limited. Successful applications are often considered as empirical rather than scientific achievements. For instance, deep neural networks' (DNNs) internal representations, decision-making mechanism, absence of overfitting in an over-parametrized space, high generalizability, etc., remain less understood. This paper delves into the loss landscape of DNNs through the lens of spin glasses in statistical physics, i.e., systems characterized by a complex energy landscape with numerous metastable states, to better understand how DNNs work. We investigated a single-hidden-layer Rectified Linear Unit (ReLU) neural network model, and introduced several protocols to examine the analogy between DNNs (trained with datasets including MNIST and CIFAR10) and spin glasses. Specifically, we used (1) random walks in the parameter space of DNNs to unravel the structures in their loss landscape; (2) a permutation-interpolation protocol to study the connection between copies of identical regions in the loss landscape due to the permutation symmetry in the hidden layers; (3) hierarchical clustering to reveal the hierarchy among trained solutions of DNNs, reminiscent of the so-called Replica Symmetry Breaking (RSB) phenomenon (i.e., the Parisi solution) in analogy to spin glasses; (4) finally, we examined the relationship between the degree of ruggedness of the loss landscape of the DNN and its generalizability, showing that flatter minima accompany better generalization.
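Protocol (2) rests on a simple fact worth seeing concretely: permuting the hidden units of a one-hidden-layer ReLU network (rows of the first weight matrix, columns of the second) leaves the function, and hence the loss, unchanged, so every minimum has many functionally identical copies. A minimal NumPy check:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, h, d_out = 10, 32, 3
W1, b1 = rng.normal(size=(h, d_in)), rng.normal(size=h)
W2 = rng.normal(size=(d_out, h))

def net(x, W1, b1, W2):
    return W2 @ np.maximum(W1 @ x + b1, 0.0)   # one-hidden-layer ReLU network

perm = rng.permutation(h)                       # relabel the hidden units
x = rng.normal(size=d_in)
print(np.allclose(net(x, W1, b1, W2),
                  net(x, W1[perm], b1[perm], W2[:, perm])))  # True
# Every trained solution therefore has h! functionally identical copies in
# parameter space, which is what the permutation-interpolation protocol probes.
```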
... First, existing models provide accounts of how adjective and noun information combines in a mature semantic system but are largely silent on how this ability is acquired. As neural networks learn to perform tasks incrementally through training, they provide an opportunity to investigate how representations emerge and what developmental stages are involved [17,18]. Second, unlike the Bayesian model proposed by Solomon and Thompson-Schill, our simulations included no notion of feature uncertainty. ...
Article
Full-text available
Our ability to combine simple constituents into more complex conceptual combinations is a fundamental aspect of cognition. Gradable adjectives (e.g., ‘tall’ and ‘light’) are a critical example of this process, as their meanings vary depending on the noun with which they are combined. For example, a dark diamond is less dark than dark charcoal. Here, we investigate how a neural network encodes the flexible nature of gradable adjectives in adjective–noun pairs, using the perceptual feature of brightness as a test case. We trained a neural network to predict human brightness ratings for unmodified nouns and adjective–noun pairs and assessed its ability to generalize to untrained combinations (e.g., ‘light paint’ vs. ‘dark paint’). We also explored how this information is encoded. We found that flexible learning of gradable adjectives was possible, with neural networks first making predictions based on the adjective alone, and then modulating these with information from the noun later in learning. We also found that model outputs mimicked the kind of non-additive feature modulation present in human data. Our results have implications for understanding how semantic composition occurs and generate testable predictions for future work.
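A toy version of this setup is easy to sketch. The code below is illustrative only, with invented brightness ratings and an invented generative rule standing in for the human data: a small network is trained on one-hot adjective + noun inputs to predict a rating, with one adjective-noun pair held out to probe generalization to untrained combinations.

```python
import numpy as np

rng = np.random.default_rng(0)
adjs = ["(none)", "light", "dark"]
nouns = ["paint", "charcoal", "diamond", "snow"]
base = {"paint": 0.5, "charcoal": 0.1, "diamond": 0.8, "snow": 0.9}  # invented
shift = {"(none)": 0.0, "light": 0.2, "dark": -0.2}                  # invented

def target(a, n):                    # stand-in for human brightness ratings
    return float(np.clip(base[n] + shift[a], 0.05, 0.95))

def encode(a, n):                    # one-hot adjective + one-hot noun
    x = np.zeros(len(adjs) + len(nouns))
    x[adjs.index(a)] = 1.0
    x[len(adjs) + nouns.index(n)] = 1.0
    return x

held_out = ("light", "paint")        # untrained combination
pairs = [(a, n) for a in adjs for n in nouns if (a, n) != held_out]
X = np.array([encode(a, n) for a, n in pairs])
y = np.array([target(a, n) for a, n in pairs])

sig = lambda z: 1.0 / (1.0 + np.exp(-z))
W1 = rng.normal(0, 0.1, (X.shape[1], 8))
w2 = rng.normal(0, 0.1, 8)
for _ in range(20000):               # gradient descent on squared error
    h = sig(X @ W1)
    p = sig(h @ w2)
    d = (p - y) * p * (1 - p) / len(y)
    W1 -= 0.5 * X.T @ (np.outer(d, w2) * h * (1 - h))
    w2 -= 0.5 * h.T @ d

def predict(a, n):
    return sig(sig(encode(a, n) @ W1) @ w2)

print(predict("(none)", "paint"), predict(*held_out))
# The held-out 'light paint' prediction typically lands above unmodified
# 'paint': the net composes adjective and noun it never saw together.
```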
... This process is an essential part of human cognition in tasks ranging from everyday communication to problem-solving. In this cognitive process, our mental representations serve as a substrate, aiding in the recognition of objects 1,2, formation of categories [3][4][5], organization of conceptual knowledge 6,7, and the prediction of behaviors based on experiences. Therefore, understanding the structure of these representations is a fundamental pursuit in cognitive neuroscience and psychology [8][9][10], underpinning significant research advancements in the field. ...
Preprint
Full-text available
The conceptualization and categorization of natural objects in the human mind have long intrigued cognitive scientists and neuroscientists, offering crucial insights into human perception and cognition. Recently, the rapid development of Large Language Models (LLMs) has raised the intriguing question of whether these models can also develop human-like object representations through exposure to vast amounts of linguistic and multimodal data. In this study, we combined behavioral and neuroimaging analysis methods to uncover how the object concept representations in LLMs correlate with those of humans. By collecting large-scale datasets of 4.7 million triplet judgments from LLM and Multimodal LLM (MLLM), we were able to derive low-dimensional embeddings that capture the underlying similarity structure of 1,854 natural objects. The resulting 66-dimensional embeddings were found to be highly stable and predictive, and exhibited semantic clustering akin to human mental representations. Interestingly, the interpretability of the dimensions underlying these embeddings suggests that LLM and MLLM have developed human-like conceptual representations of natural objects. Further analysis demonstrated strong alignment between the identified model embeddings and neural activity patterns in many functionally defined brain ROIs (e.g., EBA, PPA, RSC and FFA). This provides compelling evidence that the object representations in LLMs, while not identical to those of humans, share fundamental commonalities that reflect key schemas of human conceptual knowledge. This study advances our understanding of machine intelligence and informs the development of more human-like artificial cognitive systems.
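The pipeline from triplet judgments to low-dimensional embeddings can be sketched compactly. The code below is an illustration in the spirit of triplet-based embedding methods (e.g., SPoSE-style models), not the study's exact procedure; the "judgments" are simulated from a hidden ground-truth embedding, and embeddings are recovered by gradient descent on a softmax choice model over the three pairwise similarities.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obj, dim = 15, 3
truth = rng.normal(size=(n_obj, dim))       # hidden ground-truth structure

def most_similar_pair(i, j, k, E):          # the excluded item is the odd one out
    s = {(i, j): E[i] @ E[j], (i, k): E[i] @ E[k], (j, k): E[j] @ E[k]}
    return max(s, key=s.get)

# Simulated triplet judgments (the study collected millions from LLMs/humans).
triplets = []
for _ in range(3000):
    i, j, k = rng.choice(n_obj, size=3, replace=False)
    triplets.append((i, j, k, most_similar_pair(i, j, k, truth)))

E = rng.normal(0, 0.1, (n_obj, dim))        # embeddings to be learned
for _ in range(30):
    for i, j, k, chosen in triplets:
        s = np.array([E[i] @ E[j], E[i] @ E[k], E[j] @ E[k]])
        g = np.exp(s - s.max())
        g /= g.sum()                                    # softmax choice model
        g[[(i, j), (i, k), (j, k)].index(chosen)] -= 1  # grad of -log p(chosen)
        dE_i = g[0] * E[j] + g[1] * E[k]
        dE_j = g[0] * E[i] + g[2] * E[k]
        dE_k = g[1] * E[i] + g[2] * E[j]
        E[i] -= 0.05 * dE_i
        E[j] -= 0.05 * dE_j
        E[k] -= 0.05 * dE_k

agree = np.mean([most_similar_pair(i, j, k, E) == c for i, j, k, c in triplets])
print(agree)    # well above the 1/3 chance level: similarity structure recovered
```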
... Relationships among concepts are generally organized either taxonomically or thematically. Taxonomic organization relies on featural similarity and category membership (e.g., whale and dog are both animals; Rogers & McClelland, 2004), and thematic organization relies on mutual involvement in events (e.g., whale and binoculars are both involved in whale watching, or so you hope when you spend a lot of money on the tour; Estes et al., 2011). Although debate exists regarding whether these two organization schemes exist as separate semantic memory systems, the debate is outside the focus of this chapter (interested readers are directed to Mirman et al., 2017; Thompson et al., 2017). ...
Chapter
Long-term memory (LTM) encompasses a broad range of learning and memory abilities supported by diverse brain regions. According to the Multiple Memory Systems framework (Squire, 2004), declarative LTM is supported by medial temporal and frontal lobe structures, including the diencephalon. Although declarative LTM is typically divided into episodic and semantic memories, with no further divisions within those categories, both memories are themselves quite diverse, capable of producing memorial experiences ranging from strong to weak and vivid to vague. This chapter reviews evidence about the diversity of LTM phenomena and argues that this diversity is due to reciprocal connections between neural systems supporting these memory types and the locus coeruleus–norepinephrine (LC–NE) system. Specifically, because the LC–NE plays a role in neural gain modulation, it may be a key structure involved in producing the diverse memories that characterize declarative LTM. Key findings from memory studies using pupil size as an indirect index of LC–NE activity are reviewed.
... The associations between novel words and existing concepts can be built through a variety of semantic relations, mainly including thematic relations and taxonomic relations (Mirman and Graziano, 2012), which are two important semantic components stored in semantic memory (Borghi and Caramelli, 2003; Estes et al., 2011; Yee et al., 2018). Taxonomic relations organize concepts based on the similarity of features or properties among concepts (Rogers and McClelland, 2004). For example, cow and sheep are connected through a taxonomic relation. ...
Article
Full-text available
This study aims to examine the process of L2 novel word learning through the combination of episodic and semantic memory, and how the process differs between the formation of thematic and taxonomic relations. The major approach was to observe the neural effects of word learning, as manifested in the N400 component of event-related potentials (ERPs). Eighty-eight participants were recruited for the experiment. In the learning session, participants studied L2 contextual discourses related to novel words. In the testing session, discourses with congruous or incongruous novel words embedded in the final position were presented, and participants judged the congruency, which modulated N400 activity. The results showed that both recurrent and new-theme discourses elicited significant N400 effects, while taxonomic sentences did not. These results confirm the formation of episodic and semantic memory during L2 new word learning, in which semantic memory was mainly supported by thematic relations.
... Your cricket concept is malleable enough to absorb these new data without your forgetting everything you previously knew about crickets. Concepts can be substantially modified as new information is learned, as in this example, but can also undergo subtle shifts in response to a constantly changing environment (Rogers & McClelland, 2004;Musz & Thompson-Schill, 2015). ...
Preprint
Full-text available
Our representations of the world need to be stable enough to support general knowledge but flexible enough to incorporate new information as our environment changes. How does the human brain manage this stability-plasticity trade-off? We analyzed a large dataset in which participants viewed objects embedded in thousands of natural scenes across many fMRI sessions. Semantic item representations were located by jointly leveraging a voxelwise encoding model to find reliable item representations and a word-embedding model to evaluate semantic content. Within the medial temporal lobe, semantic item representations in hippocampal subfield CA1, parahippocampal cortex, and perirhinal cortex gradually drifted across a period of multiple months. However, rapid plasticity was observed only in parahippocampal cortex, such that item co-occurrence statistics warped item representations within a single session. In conjunction with whole-brain analyses, these results suggest that the brain solves the stability-plasticity trade-off by promoting plasticity in only a subset of semantic regions.
... As the junction of the ventral pathway with other processing streams, vATL is thought to act as a transmodal semantic hub that combines visual features with multimodal information sources to generate conceptual representations (for a review, see Lambon Ralph et al. 2017). The ATLs are strongly associated with integrating object features across sensory modalities (Rogers and McClelland 2004; Coutanche and Thompson-Schill 2015), and are engaged in semantic processing irrespective of input modality (e.g. words, pictures and sounds) (Vandenberghe et al. 1996; Marinkovic et al. 2003; Binney et al. 2010; Visser and Lambon Ralph 2011) and across a range of conceptual categories (Rice et al. 2018; Wang et al. 2019; Conca et al. 2021). ...
Article
Full-text available
Semantic knowledge includes understanding of objects and their features and also understanding of the characteristics of events. The hub-and-spoke theory holds that these conceptual representations rely on multiple information sources that are integrated in a central hub in the ventral anterior temporal lobes. The dual-hub theory expands this framework with the claim that the ventral anterior temporal lobe hub is specialized for object representation, while a second hub in angular gyrus is specialized for event representation. To test these ideas, we used representational similarity analysis, univariate and psychophysiological interaction analyses of fMRI data collected while participants processed object and event concepts (e.g. “an apple,” “a wedding”) presented as images and written words. Representational similarity analysis showed that angular gyrus encoded event concept similarity more than object similarity, although the left angular gyrus also encoded object similarity. Bilateral ventral anterior temporal lobes encoded both object and event concept structure, and left ventral anterior temporal lobe exhibited stronger coding for events. Psychophysiological interaction analysis revealed greater connectivity between left ventral anterior temporal lobe and right pMTG, and between right angular gyrus and bilateral ITG and middle occipital gyrus, for event concepts compared to object concepts. These findings support the specialization of angular gyrus for event semantics, though with some involvement in object coding, but do not support ventral anterior temporal lobe specialization for object concepts.
... Alternatively, if the inferred LCs are represented in a feature space, where distances reflect task dissimilarity, then LCI can be cast as a continuous optimization problem. This idea dates back to the classic parallel distributed processing models of semantic cognition, where gradient descent in activation space was used to find hidden representations of objects 85. Recent results have demonstrated that this optimization problem can be addressed by performing a gradient-based search for context patterns to support context-dependent behavior 86,87. ...
Preprint
Full-text available
It has been proposed that, when processing a stream of events, humans divide their experiences in terms of inferred latent causes (LCs) to support context-dependent learning. However, when shared structure is present across contexts, it is still unclear how the "splitting" of LCs and learning of shared structure can be simultaneously achieved. Here, we present the Latent Cause Network (LCNet), a neural network model of LC inference. Through learning, it naturally stores structure that is shared across tasks in the network weights. Additionally, it represents context-specific structure using a context module, controlled by a Bayesian nonparametric inference algorithm, which assigns a unique context vector for each inferred LC. Across three simulations, we found that LCNet could 1) extract shared structure across LCs in a function learning task while avoiding catastrophic interference, 2) capture human data on curriculum effects in schema learning, and 3) infer the underlying event structure when processing naturalistic videos of daily events. Overall, these results demonstrate a computationally feasible approach to reconciling shared structure and context-specific structure in a model of LCs that is scalable from laboratory experiment settings to naturalistic settings.
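In the latent-cause literature, the "Bayesian nonparametric inference algorithm" assigning observations to causes is commonly built on a Chinese-restaurant-process prior; the sketch below shows only that prior's rich-get-richer assignment logic, an assumption on my part about the family of algorithm meant here (LCNet couples such a prior with a likelihood derived from the network's prediction error, which this sketch omits).

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 1.0        # concentration: the propensity to posit a brand-new cause
counts = []        # number of observations assigned to each latent cause
assignments = []
for t in range(20):
    probs = np.array(counts + [alpha], dtype=float)
    probs /= probs.sum()                 # rich-get-richer over existing causes
    z = rng.choice(len(probs), p=probs)  # sample an existing cause, or a new one
    if z == len(counts):
        counts.append(1)                 # open a new latent cause
    else:
        counts[z] += 1
    assignments.append(z)
print(assignments)   # a handful of causes, dominated by the earliest ones
```

The nonparametric part is visible in the output: the number of causes is not fixed in advance but grows (slowly, at rate governed by alpha) with the data, which is what lets a model like LCNet assign a fresh context vector whenever experience no longer fits any existing cause.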
... Within the field of linguistics, the term "word association" (WA) refers to the mental connection between two or more words that are perceived to have semantic relatedness. The ability to form such associations is crucial in language acquisition (Elman et al. 1996;Rogers and McClelland 2004). Word associations serve as a direct means of understanding an individual's semantic knowledge (Nelson et al. 2004;Mollin 2009) and hold significance in comprehending human thought processes more broadly (Deese 1965). ...
Article
Full-text available
The COVID-19 pandemic has had a significant impact on various aspects of society, including language and cognitive processes. This research investigates how the pandemic has influenced associations related to health-related words among 1,454 Estonian native speakers. Data collected between January and March 2023 were compared with a pre-pandemic dataset, the Dictionary of Estonian Word Associations (DEWA), compiled from 2016 to 2018. The study focuses on fifteen health-related cue words. The results revealed that five terms experienced significant changes in their association sequences concerning the COVID-19 crisis. Notably, among these 15 words, three stand out as the most significant cases where a change occurred in their primary responses: these typically exhibit the most robust and enduring associative links, making them less susceptible to change. This unveils shifts in the mental lexicon's representations and the evolving perceptions of specific words and concepts amidst the pandemic backdrop. These findings illustrate how unforeseen external disruptions, such as the COVID-19 crisis, can reconfigure the salience of certain concepts within language and cognition. This research contributes to our comprehension of the linguistic repercussions and potential language adaptations triggered by a health crisis. It also enriches the relatively understudied field of word association research, particularly in languages beyond the dominion of English.
Article
Full-text available
On the basis of a theory about the role of semantic knowledge in the recognition and production of familiar words and objects, we predicted that patients with semantic dementia would reveal a specific pattern of impairment on six different tasks typically considered “pre-” or “non-” semantic: reading aloud, writing to dictation, inflecting verbs, lexical decision, object decision, and delayed copy drawing. The prediction was that all tasks would reveal a frequency-by-typicality interaction, with patients performing especially poorly on lower-frequency items with atypical structure (e.g., words with an atypical spelling-to-sound relationship; objects with an atypical feature for their class, such as the hump on a camel, etc.). Of 84 critical observations (14 patients performing 6 tasks), this prediction was correct in 84/84 cases; and a single component in a factor analysis accounted for 87% of the variance across seven measures: each patient's degree of impairment on atypical items in the six experimental tasks and a separate composite score reflecting his or her degree of semantic impairment. Errors also consistently conformed to the predicted pattern for both expressive and receptive tasks, with responses reflecting residual knowledge about the typical surface structure of each domain. We argue that these results cannot be explained as associated but unrelated deficits but instead are a principled consequence of a primary semantic impairment.
Article
People are generally more accurate at categorizing objects at the basic level (e.g., dog) than at more general, superordinate categories (e.g., animal). Recent research has suggested that this basic‐level advantage emerges from the linguistic‐distributional and sensorimotor relationship between a category concept and object concept, but the proposed mechanisms have not been subject to a formal computational test. In this paper, we present a computational model of category verification that allows linguistic distributional information and sensorimotor experience to interact in a grounded implementation of a full‐size adult conceptual system. In simulations across multiple datasets, we demonstrate that the model performs the task of category verification at a level comparable to human participants, and—critically—that its operation naturally gives rise to the basic‐level‐advantage phenomenon. That is, concepts are easier to categorize when there is a high degree of overlap in sensorimotor experience and/or linguistic distributional knowledge between category and member concepts, and the basic‐level advantage emerges as an overall behavioral artifact of this linguistic and sensorimotor overlap. Findings support the linguistic–sensorimotor preparation account of the basic‐level advantage and, more broadly, linguistic–sensorimotor theories of the conceptual system.
Article
When we use language to communicate, we must choose what to say, what not to say, and how to say it. That is, we must decide how to frame the message. These linguistic choices matter: Framing a discussion one way or another can influence how people think, feel, and act in many important domains, including politics, health, business, journalism, law, and even conversations with loved ones. The ubiquity of framing effects raises several important questions relevant to the public interest: What makes certain messages so potent and others so ineffectual? Do framing effects pose a threat to our autonomy, or are they a rational response to variation in linguistic content? Can we learn to use language more effectively to promote policy reforms or other causes we believe in, or is this an overly idealistic goal? In this article, we address these questions by providing an integrative review of the psychology of framing. We begin with a brief history of the concept of framing and a survey of common framing effects. We then outline the cognitive, social-pragmatic, and emotional mechanisms underlying such effects. This discussion centers on the view that framing is a natural—and unavoidable—feature of human communication. From this perspective, framing effects reflect a sensible response to messages that communicate different information. In the second half of the article, we provide a taxonomy of linguistic framing techniques, describing various ways that the structure or content of a message can be altered to shape people’s mental models of what is being described. Some framing manipulations are subtle, involving a slight shift in grammar or wording. Others are more overt, involving wholesale changes to a message. Finally, we consider factors that moderate the impact of framing, gaps in the current empirical literature, and opportunities for future research. We conclude by offering general recommendations for effective framing and reflecting on the place of framing in society. Linguistic framing is powerful, but its effects are not inevitable—we can always reframe an issue to ourselves or other people.
Article
Memory systems constantly confront the challenge of capturing both the shared features that connect experiences together and the unique features that distinguish them. Across two experiments, we leveraged a color memory distortion paradigm to investigate how we handle this representational tension when learning new information. Over a thirty-minute period, participants learned shared and unique features of categories of novel objects, where each feature was assigned a particular color. While participants did not differ in how accurately they remembered these features overall, when inaccurate, participants misremembered the color of shared (relative to unique) features as more similar to the category’s average color, suggesting more integration of shared features in memory. This same rapid representational warping manifested in a neural network model trained on the same categories. The work reveals how memories for different features are rapidly and differentially warped as a function of their roles in a category.
Article
Full-text available
Purpose The purpose of this viewpoint was to advocate for increased study of semantic memory ability in traumatic brain injury (TBI). Method We review modern conceptualizations of semantic memory and its neural correlates and discuss how common neuroanatomical and cognitive deficits in TBI place this population at an increased risk for semantic disruption. Building on discussions at the 2024 International Cognitive-Communication Disorders Conference, we offer possible explanations for how these disruptions may have been overlooked by our field and offer examples of how semantic memory has been studied in other populations as well as how this work may apply to TBI research. Result Semantic memory is critical for academic, vocational, and interpersonal outcomes. Yet, little is known about semantic memory in TBI beyond naming ability. By examining only surface forms of semantic memory, we may be missing a deeper disruption in semantic structure. Conclusion More in-depth examination of semantic memory promises to uncover underlying mechanisms of cognitive-communication disorders and new opportunities to develop more sensitive clinical measures of semantic memory impairment.
Article
How can modern neural networks like language models be useful to the field of language acquisition, and more broadly cognitive science, if they are not a priori designed to be cognitive models? As natural language understanding and generation have advanced by leaps and bounds, with models like GPT‐4, the question of how they can inform our understanding of human language acquisition has re‐emerged. As such, it is critical to examine how, in practice, linking hypotheses between models and human learners can be safely established. To address these questions, we propose a model taxonomy, including four modelling approaches, each having differing goals, from exploratory hypothesis generation to hypothesis differentiation and testing. We show how the goals of these approaches align with the overarching goals of science and linguistics by connecting our taxonomy to the realist versus instrumentalist approaches in philosophy of science. We survey recent work having adopted each of our modelling approaches and address the importance of computational modelling in language acquisition studies.
Article
We propose a simple computational model that describes potential mechanisms underlying the organization and development of the lexical‐semantic system in 18‐month‐old infants. We focus on two independent aspects: (i) on potential mechanisms underlying the development of taxonomic and associative priming, and (ii) on potential mechanisms underlying the effect of Inter Stimulus Interval on these priming effects. Our model explains taxonomic priming between words by semantic feature overlap, whereas associative priming between words is explained by Hebbian links between semantic representations derived from co‐occurrence relations between words (or their referents). From a developmental perspective, any delay in the emergence of taxonomic priming compared to associative priming during infancy seems paradoxical, since feature overlap per se need not be learned. We address this paradox in the model by showing that feature overlap itself is an emergent process. The model successfully replicates infant data related to Inter Stimulus Interval effects in priming experiments and makes testable predictions.
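The two proposed priming mechanisms are simple enough to render directly. Below is a hedged NumPy sketch with invented feature vectors and parameters (not the paper's implementation, and omitting its Inter Stimulus Interval dynamics): taxonomic priming falls out of feature overlap between prime and target, associative priming arises from a Hebbian weight matrix built from co-occurrence, and an unrelated control word gets neither boost.

```python
import numpy as np

# Invented binary semantic features (8 hypothetical feature dimensions).
feats = {
    "dog":  np.array([1, 1, 1, 0, 0, 0, 0, 0], float),
    "cat":  np.array([1, 1, 0, 1, 0, 0, 0, 0], float),  # overlaps with dog
    "bone": np.array([0, 0, 0, 0, 1, 1, 0, 0], float),  # no overlap with dog
    "lamp": np.array([0, 0, 0, 0, 0, 0, 1, 1], float),  # unrelated control
}

W = np.zeros((8, 8))
for _ in range(50):                     # dog and bone co-occur repeatedly
    W += 0.01 * np.outer(feats["bone"], feats["dog"])   # Hebbian outer product

def priming(prime, target):
    direct = feats[prime] @ feats[target]          # taxonomic: feature overlap
    spread = (W @ feats[prime]) @ feats[target]    # associative: learned links
    return direct + spread

print(priming("dog", "cat"))    # > 0 via overlap (taxonomic priming)
print(priming("dog", "bone"))   # > 0 via Hebbian links (associative priming)
print(priming("dog", "lamp"))   # 0: neither mechanism applies
```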
Article
The success of methods based on artificial neural networks in creating intelligent machines seems like it might pose a challenge to explanations of human cognition in terms of Bayesian inference. We argue that this is not the case and that these systems in fact offer new opportunities for Bayesian modeling. Specifically, we argue that artificial neural networks and Bayesian models of cognition lie at different levels of analysis and are complementary modeling approaches, together offering a way to understand human cognition that spans these levels. We also argue that the same perspective can be applied to intelligent machines, in which a Bayesian approach may be uniquely valuable in understanding the behavior of large, opaque artificial neural networks that are trained on proprietary data.
Article
Working memory is a fundamental function underlying various cognitive processes and abilities across the trajectory of development. Significant advances in multivariate analysis of human functional magnetic resonance imaging data have shifted functional segregation models toward integrated, representation-based models. However, due to the inherent limitations of the multi-voxel pattern analysis method, we are unable to determine whether the underlying neural representations are spatially similar across the brain. Our study attempts to answer this question by examining the spatial similarity of brain activity during a working memory task in children and adults. Our results reveal similar patterns of activity between the regions involved in working memory. This functional network of similar spatial patterns was observed in both normally developing children and adults. However, the between-region similarity was more pronounced in adults than in children and was associated with better performance. We propose that similar information flows are exchanged across the brain at an integrated level of working memory processing, underpinning the holistic nature of working memory representation.
Article
Full-text available
Uncovering the mechanisms of physics is driving a new paradigm in artificial intelligence (AI) discovery. Today, physics enables us to understand the AI paradigm across a wide range of matter, energy, and space-time scales through data, knowledge, priors, and laws. At the same time, the AI paradigm also draws on and introduces the knowledge and laws of physics to promote its own development. This new paradigm of using physical science to inspire AI is the physical science of artificial intelligence (PhysicsScience4AI, PS4AI). Although AI has become a driving force for development in various fields, deep learning still exhibits a “black box” phenomenon that is difficult to explain. This article briefly reviews the connections between relevant physics disciplines (classical mechanics, electromagnetism, statistical physics, quantum mechanics) and AI. It focuses on the mechanisms of these physics disciplines and how they inspire the deep learning paradigm, and briefly introduces related work on how AI solves physics problems. PS4AI is a new research field. At the end of the article, we summarize the challenges facing the new physics-inspired AI paradigm and look forward to the next generation of artificial intelligence technology. This article aims to provide a brief review of research related to physics-inspired deep learning algorithms and to stimulate future research and exploration by elucidating recent advances in physics.
Article
Full-text available
Understanding the mechanisms enabling the learning and flexible use of knowledge in context-appropriate ways has been a major focus of research in the study of both semantic cognition and cognitive control. We present a unified model of semantics and control that addresses these questions from both perspectives. The model provides a coherent view of how semantic knowledge, and the ability to flexibly access and deploy that knowledge to meet current task demands, arises from end-to-end learning of the statistics of the environment. We show that the model addresses unresolved issues from both literatures, including how control operates over features that covary with one another and how control representations themselves are structured and emerge through learning, through a series of behavioral experiments and simulations. We conclude by discussing the implications of our approach to other fundamental questions in cognitive science, machine learning, and artificial intelligence.
Article
Full-text available
Hierarchical predictive processing provides a framework outlining how prior expectations shape perception and cognition. Here, we highlight it as a framework for explaining how social context and group-based social knowledge can directly shape intergroup perception. More specifically, we argue that hierarchical predictive processing confers a uniquely valuable toolset for explaining extant findings and generating novel hypotheses in intergroup perception. We first provide an overview of hierarchical predictive processing, specifying its primary theoretical assumptions. We then review evidence showing how prior knowledge influences intergroup perception, outline how the framework accounts for findings in the intergroup perception literature, and underscore its theoretical strengths relative to other frameworks in this space. We finish by outlining future directions and laying out hypotheses that test the framework's implications for intergroup perception and intergroup cognition more broadly.
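The framework's basic computation can be illustrated with a single precision-weighted belief update; the numbers below are invented for illustration only.

```python
# Minimal sketch of the core predictive-processing update assumed by the
# framework: a prior expectation is combined with sensory evidence, weighted
# by their precisions (inverse variances). Values are illustrative.
prior_mean, prior_precision = 0.8, 4.0      # strong group-based expectation
evidence, evidence_precision = 0.2, 1.0     # weakly reliable percept

prediction_error = evidence - prior_mean
gain = evidence_precision / (prior_precision + evidence_precision)
posterior_mean = prior_mean + gain * prediction_error
print(posterior_mean)   # 0.68: perception is pulled toward the prior
```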
Article
A view that has been gaining prevalence over the past decade is that the human conceptual system is malleable, dynamic, context‐dependent, and task‐dependent, that is, flexible. Within the flexible conceptual representation framework, conceptual representations are constructed ad hoc, forming a different, idiosyncratic instantiation upon each occurrence. In this review, we scrutinize the neurocognitive literature to better understand the nature of this flexibility. First, we identify some key characteristics of these representations. Next, we consider how these flexible representations are constructed by addressing some of the open questions in this framework: We review the age‐old question of how to reconcile flexibility with the apparent need for shareable stable definitions to anchor meaning and come to mutual understanding, as well as some newer questions we find critical, namely, the nature of relations among flexible representations, the role of feature saliency in activation, and the viability of all‐or‐none feature activations. We suggest replacing the debate about the existence of a definitional stable core that is obligatorily activated with a question of the degree and probability of activation of the information constituting a conceptual representation. We rely on published works to suggest that (1) prior featural salience matters, (2) feature activation may be graded, and (3) Bayesian updating of prior information according to current demands offers a viable account of how flexible representations are constructed. This proposal provides a theoretical mechanism for incorporating a changing momentary context into a constructed representation, while still preserving some of the concept's constituent meaning.
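The proposed replacement of an all-or-none definitional core with graded, probabilistic activation can be sketched as a per-feature Bayesian update; the priors and likelihoods below are invented toy values, not estimates from the reviewed studies.

```python
# Illustrative sketch: each feature of a concept carries a prior salience, and
# the current context supplies evidence that updates each feature's activation
# probability via Bayes' rule, yielding graded rather than all-or-none values.
def updated_activation(prior, p_ctx_if_active, p_ctx_if_inactive):
    # per-feature Bayes' rule: posterior probability the feature is active
    num = p_ctx_if_active * prior
    return num / (num + p_ctx_if_inactive * (1 - prior))

# concept "piano" encountered in the context "moving furniture"
features = {
    #  feature      prior  P(ctx | active)  P(ctx | inactive)
    "is heavy":    (0.50,  0.90,            0.20),
    "is musical":  (0.95,  0.10,            0.90),
}
for name, (prior, a, b) in features.items():
    print(f"{name}: salience {prior:.2f} -> in-context activation "
          f"{updated_activation(prior, a, b):.2f}")
```

Running this shifts "is heavy" up (0.50 to about 0.82) and "is musical" down (0.95 to about 0.68), showing how a stable prior and a momentary context jointly determine a graded, ad hoc representation.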
Article
The collocation frequency of words in the language environment contributes to early vocabulary development. Vocabulary size, in turn, predicts children's reading comprehension skills later in development. Both collocation frequency and reading comprehension have been connected to inferential reasoning at different time points in development. Here, it was hypothesized that 8‐year‐old children's (N = 147; 76 female) sensitivity to collocation frequency would be related to vocabulary size and reading comprehension skills of varying complexity. Participants completed an auditory thematic judgment task to assess their sensitivity to collocation frequency (response accuracy or speed). In the task, children were presented with a short sentence containing a reference word (e.g., “John sees the cloud.”) and asked to judge which of two subsequent words best fit the sentence (e.g., “rain” or “lip”). Semantic relatedness between reference words and test words was operationalized in three levels (strong, weak, and distant) based on a corpus‐based analysis of collocation frequency. Multilevel and mediation analyses confirmed that thematic judgment responses were related to corpus‐based measures of collocation frequency and were associated with vocabulary size and reading comprehension skills at the sentence and text level. Furthermore, thematic judgment predicted vocabulary size and reading comprehension when the relation of decoding and reading comprehension was taken into account. The study highlights sensitivity to collocation frequency as a link between early language comprehension development (i.e., lexical retrieval and inferential reasoning) and reading comprehension in middle childhood. It also integrates theoretical approaches from computational network and distributional semantics research with behavioral experimental studies.
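A toy version of the kind of corpus-based collocation measure that underlies the strong/weak/distant stimulus levels might look like the following; the corpus, window size, and association score are illustrative assumptions, not the study's actual pipeline.

```python
# Toy sketch of a corpus-based collocation measure: count how often two words
# co-occur within a small window, normalized by their individual frequencies.
from collections import Counter

corpus = ("john sees the cloud and the rain falls "
          "while the cloud brings rain not lips").split()
window = 3                                    # co-occurrence window in tokens
word_freq = Counter(corpus)
pair_freq = Counter()
for i, w in enumerate(corpus):
    for v in corpus[i + 1 : i + window]:
        pair_freq[frozenset((w, v))] += 1

def association(w1, w2):
    # co-occurrence count normalized by the words' individual frequencies
    return pair_freq[frozenset((w1, w2))] / (word_freq[w1] * word_freq[w2])

print(association("cloud", "rain"))           # strong: they co-occur nearby
print(association("cloud", "lips"))           # distant: they never co-occur
```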
Article
Concrete symbols (e.g., sun, run) can be learned in the context of objects and actions, thereby grounding their meaning in the world. However, it is controversial whether a comparable avenue to semantic learning exists for abstract symbols (e.g., democracy). When we simulated the putative brain mechanisms of conceptual/semantic grounding using brain‐constrained deep neural networks, the learning of instances of concrete concepts outside of language contexts led to robust neural circuits generating substantial and prolonged activations. In contrast, the learning of instances of abstract concepts yielded much reduced and only short‐lived activity. Crucially, when conceptual instances were learned in the context of wordforms, circuit activations became robust and long‐lasting for both concrete and abstract meanings. These results indicate that, although the neural correlates of concrete conceptual representations can be built from grounding experiences alone, abstract concept formation at the neurobiological level is enabled by and requires the correlated presence of linguistic forms.
Article
Early in life and without special training, human beings discern resemblance between abstract visual stimuli, such as drawings, and the real-world objects they represent. We used this capacity for visual abstraction as a tool for evaluating deep neural networks (DNNs) as models of human visual perception. Contrasting five contemporary DNNs, we evaluated how well each explains human similarity judgments among line drawings of recognizable and novel objects. For object sketches, human judgments were dominated by semantic category information; DNN representations contributed little additional information. In contrast, DNN representations explained significant unique variance in the perceived similarity of abstract drawings. In both cases, a vision transformer trained to blend representations of images and their natural language descriptions showed the greatest ability to explain human perceptual similarity—an observation consistent with contemporary views of semantic representation and processing in the human mind and brain. Together, the results suggest that the building blocks of visual similarity may arise within systems that learn to use visual information, not for specific classification, but in service of generating semantic representations of objects.
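The evaluation logic, comparing pairwise similarity in a model's embedding space against human judgments, can be sketched as below; the embeddings and ratings are random stand-ins, and the rank-correlation measure is our assumption about the analysis rather than the paper's exact method.

```python
# Hedged sketch of the evaluation: pairwise cosine similarities in a model's
# embedding space are correlated with human similarity judgments for the same
# drawing pairs. Embeddings and ratings below are random placeholders.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
n_drawings, dim = 12, 64
emb = rng.normal(size=(n_drawings, dim))      # stand-in DNN embeddings
human = rng.uniform(size=(n_drawings * (n_drawings - 1)) // 2)  # stand-in ratings

def pairwise_cosine(x):
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    iu = np.triu_indices(len(x), k=1)
    return (x @ x.T)[iu]                      # upper-triangle pair vector

rho, p = spearmanr(pairwise_cosine(emb), human)
print(f"model-human alignment: rho = {rho:.2f} (p = {p:.2f})")
```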
Article
Human cognition is unique in its ability to perform a wide range of tasks and to learn new tasks quickly. Both abilities have long been associated with the acquisition of knowledge that can generalize across tasks and the flexible use of that knowledge to execute goal-directed behavior. We investigate how this emerges in a neural network by describing and testing the Episodic Generalization and Optimization (EGO) framework. The framework consists of an episodic memory module, which rapidly learns relationships between stimuli; a semantic pathway, which more slowly learns how stimuli map to responses; and a recurrent context module, which maintains a representation of task-relevant context information, integrates this over time, and uses it both to recall context-relevant memories (in episodic memory) and to bias processing in favor of context-relevant features and responses (in the semantic pathway). We use the framework to address empirical phenomena across reinforcement learning, event segmentation, and category learning, showing in simulations that the same set of underlying mechanisms accounts for human performance in all three domains. The results demonstrate how the components of the EGO framework can efficiently learn knowledge that can be flexibly generalized across tasks, furthering our understanding of how humans can quickly learn how to perform a wide range of tasks—a capability that is fundamental to human intelligence.
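One way to sketch the interplay of the three components is shown below; the update rules, dimensions, and blending scheme are our illustrative assumptions, not the published EGO model.

```python
# Toy sketch of the three components as we read them: a fast episodic store, a
# slowly learning semantic pathway, and a drifting context that both cues
# episodic recall and accompanies each stored episode.
import numpy as np

rng = np.random.default_rng(3)
dim = 8
context = np.zeros(dim)                      # recurrent context module
episodic = []                                # fast store of (context, stim, resp)
W_sem = rng.normal(0, 0.1, (dim, dim))       # slow semantic pathway weights

def step(stimulus, response=None, lr=0.01, drift=0.7):
    global context, W_sem
    context = drift * context + (1 - drift) * stimulus   # integrate over time
    if response is not None:                 # one-shot episodic encoding
        episodic.append((context.copy(), stimulus.copy(), response.copy()))
        W_sem += lr * np.outer(response, stimulus)       # slow Hebbian learning
    sem_out = W_sem @ stimulus               # semantic pathway output
    if not episodic:
        return sem_out
    # episodic recall: memories vote, weighted by match to current context
    match = np.array([c @ context for c, _, _ in episodic])
    weights = np.exp(match) / np.exp(match).sum()
    epi_out = sum(w * r for w, (_, _, r) in zip(weights, episodic))
    return (sem_out + epi_out) / 2           # blend both pathways

s, r = rng.normal(size=dim), rng.normal(size=dim)
step(s, r)                                   # store an episode, nudge semantics
print(step(s))                               # later recall blends both pathways
```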
Chapter
Thinking and reasoning, long the academic province of philosophy, have emerged over the past century as core topics of empirical investigation and theoretical analysis in the modern fields of cognitive psychology, cognitive science, and cognitive neuroscience. Formerly seen as too complicated and amorphous to be included in early textbooks on the science of cognition, the study of thinking and reasoning has since taken off, branching off in a distinct direction from the field from which it originated. This comprehensive publication covers all the core topics of the field of thinking and reasoning. Written by the foremost experts from cognitive psychology, cognitive science, and cognitive neuroscience, individual articles summarize basic concepts and findings for a major topic, sketch its history, and give a sense of the directions in which research is currently heading. The authors provide introductions to foundational issues and methods of study in the field, as well as treatment of specific types of thinking and reasoning and their application in a broad range of fields including business, education, law, medicine, music, and science.
Chapter
Cognitive science is a cross-disciplinary enterprise devoted to understanding the nature of the mind. In recent years, investigators in philosophy, psychology, the neurosciences, artificial intelligence, and a host of other disciplines have come to appreciate how much they can learn from one another about the various dimensions of cognition. The result has been the emergence of one of the most exciting and fruitful areas of inter-disciplinary research in the history of science. This volume of original essays surveys foundational, theoretical, and philosophical issues across the discipline, and introduces the foundations of cognitive science, the principal areas of research, and the major research programs. With a focus on broad philosophical themes rather than detailed technical issues, the volume will be valuable not only to cognitive scientists and philosophers of cognitive science, but also to those in other disciplines looking for an authoritative and up-to-date introduction to the field.
Article
Full-text available
Neurocognitive models of semantic memory have proposed that the ventral anterior temporal lobes (vATLs) encode a graded and multidimensional semantic space—yet neuroimaging studies seeking brain regions that encode semantic structure rarely identify these areas. In simulations, we show that this discrepancy may arise from a crucial mismatch between theory and analysis approach. Utilizing an analysis recently formulated to investigate graded multidimensional representations, representational similarity learning (RSL), we decoded semantic structure from ECoG data collected from the vATL cortical surface while participants named line drawings of common items. The results reveal a graded, multidimensional semantic space encoded in neural activity across the vATL, which evolves over time and simultaneously expresses both broad and finer-grained semantic structure among animate and inanimate concepts. The work resolves the apparent discrepancy within the semantic cognition literature and, more importantly, suggests a new approach to discovering representational structure in neural data more generally.
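A sketch in the spirit of the described analysis: learn a linear map from neural activity to a target semantic space, then test whether the mapped patterns reproduce the graded similarity structure. All data below are synthetic, and the least-squares mapping is a simplifying assumption, not the RSL algorithm itself.

```python
# Synthetic sketch: "neural" data are a noisy linear encoding of a semantic
# space; we learn the decoding map and check that decoded patterns recover
# the target's graded, multidimensional similarity structure.
import numpy as np

rng = np.random.default_rng(4)
n_items, n_electrodes, n_sem_dims = 40, 100, 5
semantic = rng.normal(size=(n_items, n_sem_dims))        # target semantic space
neural = semantic @ rng.normal(size=(n_sem_dims, n_electrodes)) \
         + 0.5 * rng.normal(size=(n_items, n_electrodes))  # noisy encoding

B, *_ = np.linalg.lstsq(neural, semantic, rcond=None)    # learned linear map
decoded = neural @ B

def pair_sims(x):
    iu = np.triu_indices(len(x), k=1)
    return (x @ x.T)[iu]

r = np.corrcoef(pair_sims(decoded), pair_sims(semantic))[0, 1]
print(f"decoded vs. target similarity structure: r = {r:.2f}")
```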
Article
Semantic feature production norms have several desirable characteristics that have supported models of representation and processing in adults. However, several key challenges have limited the use of semantic feature norms in studies of early language acquisition. First, existing norms provide uneven and inconsistent coverage of early-acquired concepts that are typically produced and assessed in children under the age of three, which is a time of tremendous growth of early vocabulary skills. Second, it is difficult to assess the degree to which young children may be familiar with normed features derived from these adult-generated datasets. Third, it has been difficult to adopt standard methods to generate semantic network models of early noun learning. Here, we introduce Feats, a tool designed to make headway on these challenges by providing a database, the Language Learning and Meaning Acquisition (LLaMA) lab Noun Norms, that extends a widely used set of feature norms (McRae et al., Behavior Research Methods, 37, 547–559, 2005) to include full coverage of noun concepts on a commonly used early vocabulary assessment. Feats includes several tools to facilitate exploration of features comprising early-acquired nouns, assess the developmental appropriateness of individual features using toddler-accessibility norms, and extract semantic network statistics for individual vocabulary profiles. We provide a tutorial overview of Feats. We additionally validate our approach by presenting an analysis of an overlapping set of concepts collected across prior and new data collection methods. Furthermore, using network graph analyses, we show that the extended set of norms provides novel, reliable results given their enhanced coverage.
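The network-construction step can be illustrated as follows, linking nouns that share a normed feature and computing summary statistics for one vocabulary profile; the feature lists here are toy examples, not entries from the LLaMA Noun Norms.

```python
# Toy sketch of a feature-based semantic network: nouns are linked when they
# share a normed feature, and network statistics summarize one child's
# vocabulary profile. Features are invented examples.
import networkx as nx

norms = {
    "dog":   {"has fur", "has four legs", "barks", "is a pet"},
    "cat":   {"has fur", "has four legs", "meows", "is a pet"},
    "bird":  {"has feathers", "flies", "has wings", "is a pet"},
    "duck":  {"has feathers", "flies", "swims"},
    "spoon": {"is metal", "used for eating"},
}
vocabulary = ["dog", "cat", "bird", "duck"]     # one toddler's known nouns

G = nx.Graph()
G.add_nodes_from(vocabulary)
for i, a in enumerate(vocabulary):
    for b in vocabulary[i + 1:]:
        shared = norms[a] & norms[b]
        if shared:                              # edge weight = shared features
            G.add_edge(a, b, weight=len(shared))

print("edges:", list(G.edges(data="weight")))
print("mean degree:", sum(d for _, d in G.degree()) / G.number_of_nodes())
print("clustering:", nx.average_clustering(G))
```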
Article
People have associations between colors and concepts that influence the way they interpret color meaning in information visualizations (e.g., charts, maps, diagrams). These associations are not limited to concrete objects (e.g., fruits, vegetables); even abstract concepts, like sleeping and driving, have systematic color-concept associations. However, color-concept associations and color meaning (color semantics) are not the same thing, and sometimes they conflict. This article describes an approach to understanding color semantics called the color inference framework. The framework shows how color semantics is highly flexible and context dependent, which makes color an effective medium for communication.
Article
“Dogs” are connected to “cats” in our minds, and “backyard” to “outdoors.” Does the structure of this semantic knowledge differ across people? Network-based approaches are a popular representational scheme for thinking about how relations between different concepts are organized. Recent research uses graph theoretic analyses to examine individual differences in semantic networks for simple concepts and how they relate to other higher-level cognitive processes, such as creativity. However, it remains ambiguous whether individual differences captured via network analyses reflect true differences in measures of the structure of semantic knowledge, or differences in how people strategically approach semantic relatedness tasks. To test this, we examine the reliability of local and global metrics of semantic networks for simple concepts across different semantic relatedness tasks. In four experiments, we find that both weighted and unweighted graph theoretic representations reliably capture individual differences in local measures of semantic networks (e.g., how related pot is to pan versus lion). In contrast, we find that metrics of global structural properties of semantic networks, such as the average clustering coefficient and shortest path length, are less robust across tasks and may not provide reliable individual difference measures of how people represent simple concepts. We discuss the implications of these results and offer recommendations for researchers who seek to apply graph theoretic analyses in the study of individual differences in semantic memory.
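The contrast between local and global metrics can be made concrete with a small example; the relatedness ratings and the threshold used to build the network are invented.

```python
# Sketch of the two kinds of metrics contrasted in the article: local measures
# (how related specific pairs are) versus global structure (clustering, path
# length) derived from one participant's thresholded relatedness ratings.
import networkx as nx

words = ["pot", "pan", "stove", "lion", "mane"]
ratings = {                       # invented pairwise relatedness (0-1)
    ("pot", "pan"): 0.9, ("pot", "stove"): 0.7, ("pan", "stove"): 0.8,
    ("lion", "mane"): 0.85, ("pan", "lion"): 0.1, ("pot", "lion"): 0.05,
}

G = nx.Graph()
G.add_nodes_from(words)
for (a, b), r in ratings.items():
    if r >= 0.5:                  # threshold to an unweighted network
        G.add_edge(a, b)

print("local: pot-pan edge present?", G.has_edge("pot", "pan"))
print("global: average clustering =", nx.average_clustering(G))
largest = G.subgraph(max(nx.connected_components(G), key=len))
print("global: avg shortest path =", nx.average_shortest_path_length(largest))
```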
Article
Full-text available
Spatial distributional semantic models represent word meanings in a vector space. While able to model many basic semantic tasks, they are limited in many ways, such as their inability to represent multiple kinds of relations in a single semantic space and to directly leverage indirect relations between two lexical representations. To address these limitations, we propose a distributional graphical model that encodes lexical distributional data in a graphical structure and uses spreading activation for determining the plausibility of word sequences. We compare our model to existing spatial and graphical models by systematically varying parameters that contribute to dimensions of theoretical interest in semantic modeling. In order to be certain about what the models should be able to learn, we trained each model on an artificial corpus describing events in an artificial world simulation containing experimentally controlled verb–noun selectional preferences. The task used for model evaluation requires recovering observed selectional preferences and inferring semantically plausible but never observed verb–noun pairs. We show that the distributional graphical model performed better than all other models. Further, we argue that the relative success of this model comes from its improved ability to access the different orders of spatial representations via spreading activation on the graph, enabling the model to infer the plausibility of noun–verb pairs unobserved in the training data. The model integrates classical ideas of representing semantic knowledge in a graph with spreading activation and more recent trends focused on the extraction of lexical distributional data from large natural language corpora.
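A toy rendering of the spreading-activation mechanism: activation injected at one word propagates along weighted co-occurrence edges, so a never-observed verb-noun pair can still receive a nonzero plausibility score through indirect paths. The graph, weights, decay, and step count are illustrative assumptions.

```python
# Toy spreading-activation sketch: plausibility of a word pair is read off the
# target node's activation after a few decaying propagation steps.
import numpy as np

words = ["eat", "bread", "butter", "sing"]
# symmetric edge weights from (hypothetical) distributional co-occurrence
W = np.array([
    [0.0, 0.8, 0.0, 0.0],   # eat - bread observed together
    [0.8, 0.0, 0.7, 0.0],   # bread - butter observed together
    [0.0, 0.7, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],   # sing: unconnected to the food words
])

def plausibility(source, target, steps=3, decay=0.5):
    a = np.zeros(len(words))
    a[words.index(source)] = 1.0
    for _ in range(steps):                 # spread with decay at each step
        a = a + decay * (W @ a)
    return a[words.index(target)]

print(plausibility("eat", "butter"))   # > 0 via eat -> bread -> butter
print(plausibility("eat", "sing"))     # 0: no connecting path
```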
Article
Full-text available
The neurobiological nature of semantic knowledge, i.e., the encoding and storage of conceptual information in the human brain, remains a poorly understood and hotly debated subject. Clinical data on semantic deficits and neuroimaging evidence from healthy individuals have suggested multiple cortical regions to be involved in the processing of meaning. These include semantic hubs (most notably, anterior temporal lobe, ATL) that take part in semantic processing in general as well as sensorimotor areas that process specific aspects/categories according to their modality. Biologically inspired neurocomputational models can help elucidate the exact roles of these regions in the functioning of the semantic system and, importantly, in its breakdown in neurological deficits. We used a neuroanatomically constrained computational model of frontotemporal cortices implicated in word acquisition and processing, and adapted it to simulate and explain the effects of semantic dementia (SD) on word processing abilities. SD is a devastating, yet insufficiently understood progressive neurodegenerative disease, characterised by semantic knowledge deterioration that is hypothesised to be specifically related to neural damage in the ATL. The behaviour of our brain-based model is in full accordance with clinical data—namely, word comprehension performance decreases as SD lesions in ATL progress, whereas word repetition abilities remain less affected. Furthermore, our model makes predictions about lesion- and category-specific effects of SD: our simulation results indicate that word processing should be more impaired for object- than for action-related words, and that degradation of white matter should produce more severe consequences than the same proportion of grey matter decay. In sum, the present results provide a neuromechanistic explanatory account of cortical-level language impairments observed during the onset and progress of semantic dementia.
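The lesioning methodology can be sketched on a toy hub network, removing whole units to mimic grey-matter decay and individual connections to mimic white-matter damage; the architecture and the comprehension measure are our assumptions, not the authors' brain-constrained model.

```python
# Hedged sketch of progressive lesioning (not the authors' architecture): a toy
# hub layer links word forms to meanings; "grey matter" damage removes units,
# "white matter" damage removes connections, and comprehension is the fraction
# of items whose output is still closest to the intact model's meaning.
import numpy as np

rng = np.random.default_rng(5)
n_words, n_hub, n_sem = 20, 30, 10
words = rng.normal(size=(n_words, n_hub))          # word-form -> hub patterns
W = rng.normal(0, 0.3, (n_hub, n_sem))             # hub -> semantic weights
targets = words @ W                                # intact model's meanings

def comprehension(W_lesioned):
    out = words @ W_lesioned
    d = ((out[:, None, :] - targets[None, :, :]) ** 2).sum(-1)
    return (d.argmin(axis=1) == np.arange(n_words)).mean()

for p in (0.2, 0.5, 0.8):                          # progressive damage
    grey = W.copy()
    grey[rng.random(n_hub) < p, :] = 0             # drop whole units
    white = W * (rng.random(W.shape) >= p)         # drop single connections
    print(f"damage {p:.0%}: grey {comprehension(grey):.2f}, "
          f"white {comprehension(white):.2f}")
```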
Article
Full-text available
Language depends critically on the integration of lexical information across multiple words to derive semantic concepts. Limitations of spatiotemporal resolution have previously rendered it difficult to isolate processes involved in semantic integration. We utilized intracranial recordings in epilepsy patients (n = 58) who read written word definitions. Descriptions were either referential or non-referential to a common object. Semantically referential sentences enabled high frequency broadband gamma activation (70–150 Hz) of the inferior frontal sulcus (IFS), medial parietal cortex, orbitofrontal cortex (OFC) and medial temporal lobe in the left, language-dominant hemisphere. IFS, OFC and posterior middle temporal gyrus activity was modulated by the semantic coherence of non-referential sentences, exposing semantic effects that were independent of task-based referential status. Components of this network, alongside posterior superior temporal sulcus, were engaged for referential sentences that did not clearly reduce the lexical search space by the final word. These results indicate the existence of complementary cortical mosaics for semantic integration in posterior temporal and inferior frontal cortex.