Article | Publisher preview available

Human generalization of internal representations through prototype learning with goal-directed attention

Authors: Warren Woodrich Pettine, Dhruva Venkita Raman, A. David Redish & John D. Murray

Abstract and Figures

The world is overabundant with feature-rich information obscuring the latent causes of experience. How do people approximate the complexities of the external world with simplified internal representations that generalize to novel examples or situations? Theories suggest that internal representations could be determined by decision boundaries that discriminate between alternatives, or by distance measurements against prototypes and individual exemplars. Each provides advantages and drawbacks for generalization. We therefore developed theoretical models that leverage both discriminative and distance components to form internal representations via action-reward feedback. We then developed three latent-state learning tasks to test how humans use goal-oriented discriminative attention and prototype/exemplar representations. The majority of participants attended to both goal-relevant discriminative features and the covariance of features within a prototype. A minority of participants relied only on the discriminative feature. Behaviour of all participants could be captured by parameterizing a model combining prototype representations with goal-oriented discriminative attention.
Distance and discriminative models of latent-state learning produce distinct generalization failures a, Prototype and exemplar models rely on a distance metric to determine state membership. This produces states whose weights on individual feature dimensions (shape of the cluster) do not depend on their informativeness for comparative decisions. A discriminative model (right) defines states on the basis of feature dimensions that are informative for separating one state from another. This produces state boundaries defined by those dimensions most informative for discrimination. b, Common grocery store items can be considered latent states of which one encounters only variable examples. Each item is associated with sets of actions. c, A distance model can fail to generalize when a highly regular, yet non-informative feature changes in novel examples. During initial learning, the packaging is highly regular across examples (left), and thus it is considered informative (middle). When this feature is altered in novel examples, a distance model struggles to generalize the rewarded action (right). d, A purely discriminative model can fail to generalize when previously non-informative features become informative in a novel context. During learning in the initial contexts, packaging was non-discriminative (left). Thus, the distance is zero between otherwise-identical alternatives learned in separate contexts (middle). This leads to generalization failure when these alternatives are brought together in a novel context (right).
… 
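To make the contrast in panels a and c concrete, here is a toy sketch. It is illustrative only: the feature values, prototypes and the decision threshold are invented, and neither branch is the paper's actual implementation.

```python
import numpy as np

# Toy contrast between a distance model and a discriminative model.
# Features: [shape, packaging]; prototypes learned from regular examples.
protos = {"apple": np.array([1.0, 1.0]),
          "soup":  np.array([0.0, 0.0])}

novel = np.array([1.0, -0.5])  # apple-shaped, but with altered packaging

# Distance model: the regular-but-uninformative packaging dimension drags
# the novel example toward the wrong prototype (the failure in panel c).
dists = {k: np.linalg.norm(novel - v) for k, v in protos.items()}
print(min(dists, key=dists.get))             # -> "soup" (generalization failure)

# Discriminative model: only the informative dimension (shape) defines the
# boundary, so the novel example generalizes correctly.
print("apple" if novel[0] > 0.5 else "soup")  # -> "apple"
```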
Algorithmic models of latent-state instrumental reinforcement learning using prototype or exemplar with discriminative attention a, Depending on feedback, novel examples either refine an existing state (that is, internal state representation) or create a new one. b, On a given trial, a ProDAtt or ExDAtt model agent encounters an example vector of features. (1) To recognize the latent context, the agent uses a Bayesian surprise threshold. Bayesian surprise is a well-described transformation of the posterior probability⁵⁹. If no state is less surprising than a threshold, the agent uses the vector to create a new state. If more than one state is below the threshold, the agent includes those states in the context. (2) The agent then calculates the mutual information of each feature. By comparing the entropy of a feature within each state to its entropy across states, the mutual information identifies feature dimensions that maximally discriminate between states in a context. Attention weights (from the mutual information) are modulated by the integrated reward history, such that shifts in reward statistics increase overall attention. (3) Attention weights are used to scale the feature values of each state. (4) The agent then uses attention-scaled features in state estimation, recalculating the surprise metric. If no state is below the second surprise threshold, the agent creates a new state. Otherwise, it selects the least surprising. Each state learns a set of values for the available actions in the task. (5) Once a state has been selected, that state’s values are used to choose an action. (6) The agent updates the state representation and the action value. It also tracks the reward history. c, In the ‘prototype states with discriminative attention’ (ProDAtt) model, states are defined by the mean and covariance of past examples, while in the ‘exemplar states with discriminative attention’ (ExDAtt) model, every past state exemplar is used. d, The surprise threshold determines whether a state is included in the context. e, The attention feedback step uses mutual information to scale feature dimensions by their discriminative informativeness (left). Either the most proximal state is selected, or if the distance is greater than the threshold, the agent creates a new state (right, ProDAtt model example).
… 
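The trial loop in steps (1)-(6) can be summarized in a short sketch of a ProDAtt-style agent. This is a minimal reading of the caption, not the authors' code: the diagonal-Gaussian surprise, the variance-ratio proxy for mutual information, the threshold values and the omission of reward-history modulation are all simplifying assumptions.

```python
import numpy as np

class ProtoState:
    """Prototype state: running mean/variance of past examples plus action values."""
    def __init__(self, x, n_actions):
        self.mean = x.astype(float).copy()
        self.var = np.ones_like(self.mean)  # per-dimension variance (diagonal covariance)
        self.n = 1
        self.q = np.zeros(n_actions)        # learned action values

    def surprise(self, x, attn=None):
        """Surprise as the negative log-likelihood under a diagonal Gaussian,
        optionally with attention-weighted feature dimensions."""
        w = np.ones_like(self.mean) if attn is None else attn
        return 0.5 * np.sum(w * (x - self.mean) ** 2 / self.var
                            + np.log(2 * np.pi * self.var))

    def update(self, x):
        """Incrementally refine the prototype with a new example."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.var += (delta * (x - self.mean) - self.var) / self.n

def attention_weights(context):
    """Information-gain proxy: dimensions whose across-state variance is large
    relative to their within-state variance discriminate between states."""
    means = np.stack([s.mean for s in context])
    within = np.mean([s.var for s in context], axis=0)
    across = np.var(means, axis=0)
    mi = 0.5 * np.log1p(across / within)
    return mi / mi.sum() if mi.sum() > 0 else None

def prodatt_trial(x, states, n_actions, thresh1=20.0, thresh2=15.0, beta=3.0,
                  rng=np.random.default_rng()):
    # (1) Context recognition: states less surprising than the first threshold.
    context = [s for s in states if s.surprise(x) < thresh1]
    if not context:
        context = [ProtoState(x, n_actions)]
        states.append(context[0])
    # (2)-(3) Attention from discriminative informativeness, scaling features.
    attn = attention_weights(context) if len(context) > 1 else None
    # (4) State estimation with attention-scaled surprise; new state if needed.
    best = min(context, key=lambda s: s.surprise(x, attn))
    if best.surprise(x, attn) > thresh2:
        best = ProtoState(x, n_actions)
        states.append(best)
    # (5) Softmax action selection from the selected state's action values.
    p = np.exp(beta * best.q - np.max(beta * best.q))
    action = rng.choice(n_actions, p=p / p.sum())
    return best, action

# (6) After feedback: best.update(x) plus a delta-rule value update, e.g.
#     best.q[action] += 0.1 * (reward - best.q[action])
```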
Experiment 1. Human participants use discriminative attention when generalizing to novel examples a, A model using distance to define state membership can fail to generalize if a new observation differs in a highly regular, but non-informative feature (left). A model using discriminative boundaries easily generalizes in this scenario (right). b, Participants learn the latent rules defining action-reward associations. During the tutorial, participants are instructed on stochastic rewards, that a single action can garner rewards for multiple states, and that a single state can be rewarded for multiple actions. On a given trial, participants encounter an ‘alien artefact’ activated with one of four actions. The main task is composed of two blocks. During block 1, they learn the latent states with an initial set of examples. The transition to the second block occurs without notice to the participant. During block 2, new examples are introduced that differ in a previously non-discriminative feature (left). c, Top: initially learned examples (present in blocks 1 and 2). Bottom: the novel generalization examples (introduced in block 2). Action D is never rewarded. d, Top: without discriminative attention, novel examples are separated from the initially learned examples by their texture. Bottom: with attention, novel examples are projected into the same feature space as the initially learned examples. e, Left: during the generalization block, differences in performance on the first appearance of initially learned versus novel examples measure generalization. Middle: the difference in error rates over the course of a session for discriminatively identical paired examples also quantifies generalization. Right: when first encountering a novel example, the choice of action D for a novel example indicates a weak prior over state membership. f, Left: the First-Generalization-Appearance metric reveals that most participants generalize to new examples, while a tail in the distribution indicates individual variation. Middle: Paired-Generalization difference shows similar results. Right: exploration errors are below chance for most participants. First gen, First Generalization. g, Learning curves during initial state formation are similar across participants with different First-Generalization-Appearance scores, but diverge during generalization. The shaded areas represent the s.e.m. and the dotted lines denote chance levels. Source data
… 
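For concreteness, the First-Generalization-Appearance metric described in panels e-f could be computed from a trial log roughly as follows. The column names and the sign convention are assumptions for illustration, not the paper's analysis code.

```python
import pandas as pd

# Hypothetical block-2 trial log for one participant.
trials = pd.DataFrame({
    "example_id": ["A1", "N1", "A1", "N1", "N2"],
    "is_novel":   [False, True, False, True, True],  # novel generalization examples
    "correct":    [1, 1, 1, 0, 1],
})

# Accuracy on the FIRST appearance of each example, split by novelty.
first = trials.groupby("example_id").first()
fga = (first.loc[~first.is_novel, "correct"].mean()
       - first.loc[first.is_novel, "correct"].mean())
print(fga)  # near 0 => generalization; large positive => novel examples suffer
```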
Experiment 2. Majority of participants use previously non-discriminative features to generalize in a novel context. A subpopulation attends to one discriminative feature a, A discriminative model can fail to generalize when a feature that is non-discriminative during learning becomes the key discriminative feature for later decisions. This produces a distinct ‘Discriminative Error’. b, The novel context generalization task version 1 (CG1), where participants learn distinct state-action contingencies sequentially in two explicitly signalled contexts before entering a third explicitly signalled generalization context where all latent states are active. c, A discriminative model that learns only shape (D1M) forms a single decision boundary (top left). Alternatively, a discriminative model could form a decision boundary involving colour and shape for context 1 states, and a separate decision boundary involving only shape for context 2 states (D2M, top right). A model attending to the prototype covariance results in distinct states (bottom left). If all features receive equal attention, they each occupy an independent location (bottom right). For visualization, the size dimension is not shown and stimuli are slightly offset. d, Human participants vary in the probability of initial generalization errors (top) and Discriminative Errors (bottom). Initial generalization errors quantify the probability that participants fail to select the rewarded action the first time they encounter each example in the generalization context. e, Idealized attention models produce predictions for distributions of errors, visualized as confusion matrices. f, Participant subpopulations produce distinct confusion matrices. g, Fitting participants’ observed confusion matrices by the idealized attention models reveals distinct patterns in the coefficient weights. Shown are the mean ± 97% CI for all participants (n = 53), HDE participants (n = 8), LDE-HIE participants (n = 19) and LDE-LIE participants (n = 26). h, Both the model comparison (left) and partial correlation coefficients (right) reveal that participants above the Discriminative Error threshold are better fit by the D1M than the D2M. Model comparison shows the mean ± s.e.m. for Pareto smoothed importance sampling leave-one-out cross-validation. D1M, discriminative model with one feature; D2M, discriminative model with two features; I, intercept; P, prototype covariance; A, all-feature attention; All, all participants; HDE, high Discriminative Error; LDE, low Discriminative Error; HIE, high initial error; LIE, low initial error. Source data
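One way to picture the fitting in panel g: treat each idealized model's confusion matrix as a regressor for the participant's observed matrix. The sketch below uses ordinary least squares on placeholder matrices; the paper's Bayesian fitting procedure and the actual matrices are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4                                        # placeholder number of states/actions
ideal = {m: rng.dirichlet(np.ones(n), n)     # placeholder idealized confusion matrices
         for m in ("D1M", "D2M", "P", "A")}
observed = rng.dirichlet(np.ones(n), n)      # placeholder participant matrix

# Design matrix: intercept (I) plus one flattened matrix per idealized model.
X = np.column_stack([np.ones(n * n)] + [mat.ravel() for mat in ideal.values()])
coef, *_ = np.linalg.lstsq(X, observed.ravel(), rcond=None)
for name, w in zip(["I", *ideal], coef):
    print(f"{name}: {w:+.3f}")               # coefficient weights, as in panel g
```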
… 
Article | Nature Human Behaviour 7, 442–463 (March 2023) | https://doi.org/10.1038/s41562-023-01543-7
Human generalization of internal representations through prototype learning with goal-directed attention

Warren Woodrich Pettine¹, Dhruva Venkita Raman², A. David Redish³,⁴ & John D. Murray¹,⁴
The high-dimensional sensory environment we experience is structured by underlying latent states¹,². Internal representations of these latent states must generalize to new observations or situations. For example, people must not only recognize nutritious and poisonous fruits, but must also generalize to all cases of discriminating between them. Previous models of latent-state learning focused on conditions where latent states were defined by the underlying reward probability, with environmental features being irrelevant²⁻⁶ (Supplementary Table 1). However, causal latent states in the world are often signalled by substantially fewer features than are available in the vast space of feature dimensions we experience (for example, a poisonous fruit is defined by its colour and shape, but the position of the sun is irrelevant). Moreover, latent states can exist in recursive hierarchical relationships (for example, the forest is a place where many poisonous fruits can grow, and a poisonous fruit grows in the forest). How do people use experiences caused by latent states to learn generalizable internal representations? Furthermore, what role does goal-directed attention play in generalizing internal representations to new individual latent-state examples (observations) or new latent-state contexts (situations)? (See the Glossary of terms in Supplementary Table 2 (refs. 7–19).)

The field of category learning has proposed several models of how features can be organized into internal representations of latent states²⁰ (Supplementary Table 1). With ‘prototype’ or ‘exemplar’ models, new observations are categorized by their distance from either an idealized internal-state prototype²¹,²² or individual past state examples (exemplars)²³⁻²⁶. Both of these models assign a new observation’s internal-state membership to the state with the shortest distance. Although a prototype model may use the covariance …
Received: 5 December 2021 | Accepted: 31 January 2023 | Published online: 9 March 2023
¹Department of Psychiatry, Yale School of Medicine, New Haven, CT, USA. ²Department of Informatics, University of Sussex, Brighton, UK. ³Department of Neuroscience, University of Minnesota, Minneapolis, MN, USA. ⁴These authors jointly supervised this work: A. David Redish and John D. Murray.
e-mail: john.murray@yale.edu
... While some studies have suggested the use of correlations in state spaces as a mechanism [35,36], most research focuses on representation learning [9]. This process describes the discovery of an abstracted representation and emphasizes dimensionality reduction [9,37,38] or a rescaling of dimensions [39] of the state space. Neural results in this line of research have emphasized the role of the frontoparietal network (FPN) [9,37,38,40], the salience network (SN) [40][41][42] and the default mode network (DMN) [9,43] in the discovery and encoding of abstracted representations. ...
... In a Bayesian model by Soto et al. [31], consequential regions are parameterized by a mean and the width along each dimension. This shows a close resemblance to very recent developments in RL, where generalization is assumed to rely on the mean and covariance of observed examples in some cognitive space with a rescaling of dimensions according to their perceived relevance for the task [39]. Likewise, exploiting correlations in the reward probability of tasks [35,36] is conceptually similar to the model of Lee et al. [47], where generalization is based on the assumption that similar stimuli lead to similar outcomes. ...
... The specific priors are described with the models. To avoid divergent transitions, we used a non-centered parameterization of all hierarchical parameters (39). All models were run for 4 chains until those converged. ...
Preprint
Full-text available
Generalization, the transfer of knowledge to novel situations, has been studied in distinct disciplines that focus on different aspects. Here we propose a Bayesian model that assumes an exponential mapping from psychological space to outcome probabilities. This model is applicable to probabilistic reinforcement and integrates representation learning by tracking the relevance of stimulus dimensions. Since the belief state about this mapping is dependent on prior knowledge, we designed three experiments that emphasized this aspect. In all studies, we found behavior to be influenced by prior knowledge in a way that is consistent with the model. In line with the literature on representation learning, we found the representational geometry in the middle frontal gyrus to correspond to the behavioral preference for one over the other stimulus dimension and to be updated as predicted by the model. We interpret these findings as support for a common mechanism of generalization.
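The excerpted methods sentence above mentions a non-centered parameterization to avoid divergent transitions. For readers unfamiliar with the trick, here is a generic PyMC sketch; the model structure, priors and data are placeholders, not this preprint's actual model.

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
data = rng.normal(0.5, 1.0, size=(10, 20))     # 10 groups x 20 observations

with pm.Model() as model:
    mu = pm.Normal("mu", 0.0, 1.0)             # group-level mean
    sigma = pm.HalfNormal("sigma", 1.0)        # group-level spread
    # Centered (divergence-prone): theta = pm.Normal("theta", mu, sigma, shape=10)
    # Non-centered: sample standard normals, then shift and scale.
    z = pm.Normal("z", 0.0, 1.0, shape=10)
    theta = pm.Deterministic("theta", mu + sigma * z)
    pm.Normal("obs", theta[:, None], 1.0, observed=data)
    idata = pm.sample(chains=4)                # run 4 chains to convergence
```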
... The existence of spatial "cognitive maps" which efficiently organize knowledge is well-documented in the literature [45][46][47][48][49] . These maps guide attention and learning across different domains 45,46,50 . While two-dimensional representations are often emphasized in cognitive tasks, research suggests that cognitive representations are multi-dimensional and can be compressed or unfolded based on task demands 17,51 . ...
... Hybrid Concept Learning Using Bayesian Principles. Today, the most prolific theories of concept learning are hybrids that have a duality of both rule- and similarity-based interpretations (Pettine et al. 2023). One influential example is Bayesian concept learning (Figure 1d; Tenenbaum & Griffiths 2001), which uses a distribution over hypothesized category boundaries (rectangles in Figure 1d) to categorize novel stimuli (Sidebar 2.1). ...
Article
Full-text available
Generalization, defined as applying limited experiences to novel situations, represents a cornerstone of human intelligence. Our review traces the evolution and continuity of psychological theories of generalization, from its origins in concept learning (categorizing stimuli) and function learning (learning continuous input-output relationships) to domains such as reinforcement learning and latent structure learning. Historically, there have been fierce debates between approaches based on rule-based mechanisms, which rely on explicit hypotheses about environmental structure, and approaches based on similarity-based mechanisms, which leverage comparisons to prior instances. Each approach has unique advantages: Rules support rapid knowledge transfer, while similarity is computationally simple and flexible. Today, these debates have culminated in the development of hybrid models grounded in Bayesian principles, effectively marrying the precision of rules with the flexibility of similarity. The ongoing success of hybrid models not only bridges past dichotomies but also underscores the importance of integrating both rules and similarity for a comprehensive understanding of human generalization.
... Perception in neural circuits is fundamentally belief in a description of the world -we see trees and leaves, not green and brown pixels (Gibson 1977;Rao and Ballard 1999;Friston 2005;Adams et al. 2013;Sterzer et al. 2018). Perception entails a process of categorization and generalization through parallel distributed processing that depends on experience and expertise (Grossberg 1976;Hertz et al. 1991;McClelland and Rogers 2003;Pettine et al. 2023). Motivation and evaluation are computational memory processes of their own (Balleine and Dickinson 1991;Andermann and Lowell 2017;Sharpe et al. 2021). ...
Preprint
Full-text available
Current theories of decision making suggest that the neural circuits in mammalian brains (including humans) computationally combine representations of the past (memory), present (perception), and future (agentic goals) to take actions that achieve the needs of the agent. How information is represented within those neural circuits changes what computations are available to that system which changes how agents interact with their world to take those actions. We argue that the computational neuroscience of decision making provides a new microeconomic framework (neuroeconomics) that offers new opportunities to construct policies that interact with those decision-making systems to improve outcomes. After laying out the computational processes underlying decision making in mammalian brains, we present four applications of this logic with policy consequences: (1) contingency management as a treatment for addiction, (2) precommitment and the sensitivity to sunk costs, (3) media consequences for changes in housing prices after a disaster, and (4) how social interactions underlie the success (and failure) of microfinance institutions.
Article
The brain is always intrinsically active, using energy at high rates while cycling through global functional modes. Awake brain modes are tied to corresponding behavioural states. During goal-directed behaviour, the brain enters an action-mode of function. In the action-mode, arousal is heightened, attention is focused externally and action plans are created, converted to goal-directed movements and continuously updated on the basis of relevant feedback, such as pain. Here, we synthesize classical and recent human and animal evidence that the action-mode of the brain is created and maintained by an action-mode network (AMN), which we had previously identified and named the cingulo-opercular network on the basis of its anatomy. We discuss how rather than continuing to name this network anatomically, annotating it functionally as controlling the action-mode of the brain increases its distinctiveness from spatially adjacent networks and accounts for the large variety of the associated functions of an AMN, such as increasing arousal, processing of instructional cues, task general initiation transients, sustained goal maintenance, action planning, sympathetic drive for controlling physiology and internal organs (connectivity to adrenal medulla), and action-relevant bottom-up signals such as physical pain, errors and viscerosensation. In the functional mode continuum of the awake brain, the AMN-generated action-mode sits opposite the default-mode for self-referential, emotional and memory processing, with the default-mode network and AMN counterbalancing each other as yin and yang.
Preprint
Full-text available
Most existing attention prediction research focuses on salient instances like humans and objects. However, the more complex interaction-oriented attention, arising from the comprehension of interactions between instances by human observers, remains largely unexplored. This is equally crucial for advancing human-machine interaction and human-centered artificial intelligence. To bridge this gap, we first collect a novel gaze fixation dataset named IG, comprising 530,000 fixation points across 740 diverse interaction categories, capturing visual attention during human observers' cognitive processes of interactions. Subsequently, we introduce the zero-shot interaction-oriented attention prediction task ZeroIA, which challenges models to predict visual cues for interactions not encountered during training. Thirdly, we present the Interactive Attention model IA, designed to emulate human observers' cognitive processes to tackle the ZeroIA problem. Extensive experiments demonstrate that the proposed IA outperforms other state-of-the-art approaches in both ZeroIA and fully supervised settings. Lastly, we endeavor to apply interaction-oriented attention to the interaction recognition task itself. Further experimental results demonstrate the promising potential to enhance the performance and interpretability of existing state-of-the-art HOI models by incorporating real human attention data from IG and attention labels generated by IA.
Article
Full-text available
Humans can learn several tasks in succession with minimal mutual interference but perform more poorly when trained on multiple tasks at once. The opposite is true for standard deep neural networks. Here, we propose novel computational constraints for artificial neural networks, inspired by earlier work on gating in the primate prefrontal cortex, that capture the cost of interleaved training and allow the network to learn two tasks in sequence without forgetting. We augment standard stochastic gradient descent with two algorithmic motifs, so-called “sluggish” task units and a Hebbian training step that strengthens connections between task units and hidden units that encode task-relevant information. We found that the “sluggish” units introduce a switch-cost during training, which biases representations under interleaved training towards a joint representation that ignores the contextual cue, while the Hebbian step promotes the formation of a gating scheme from task units to the hidden layer that produces orthogonal representations which are perfectly guarded against interference. Validating the model on previously published human behavioural data revealed that it matches performance of participants who had been trained on blocked or interleaved curricula, and that these performance differences were driven by misestimation of the true category boundary.
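A minimal sketch of what a "sluggish" task unit could look like, under the assumption that it low-pass filters the task cue across trials; the smoothing form and the constant are illustrative, not the paper's exact implementation.

```python
import numpy as np

tau = 0.8                                  # sluggishness (tau=0: instant switching)
sluggish = np.zeros(2)                     # activity of two task units
cues = [np.array([1.0, 0.0])] * 3 + [np.array([0.0, 1.0])] * 3
for cue in cues:
    sluggish = tau * sluggish + (1 - tau) * cue
    print(np.round(sluggish, 2))
# Right after the switch, the stale task-1 signal still outweighs the task-2
# cue, biasing interleaved training towards a joint, cue-ignoring solution.
```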
Article
Full-text available
Human cognition recruits distributed neural processes, yet the organizing computational and functional architectures remain unclear. Here, we characterized the geometry and topography of multitask representations across the human cortex using functional magnetic resonance imaging during 26 cognitive tasks in the same individuals. We measured the representational similarity across tasks within a region and the alignment of representations between regions. Representational alignment varied in a graded manner along the sensory–association–motor axis. Multitask dimensionality exhibited compression then expansion along this gradient. To investigate computational principles of multitask representations, we trained multilayer neural network models to transform empirical visual-to-motor representations. Compression-then-expansion organization in models emerged exclusively in a rich training regime, which is associated with learning optimized representations that are robust to noise. This regime produces hierarchically structured representations similar to empirical cortical patterns. Together, these results reveal computational principles that organize multitask representations across the human cortex to support multitask cognition.
Article
Full-text available
The popularity of Bayesian statistical methods has increased dramatically in recent years across many research areas and industrial applications. This is the result of a variety of methodological advances with faster and cheaper hardware as well as the development of new software tools. Here we introduce an open source Python package named Bambi (BAyesian Model Building Interface) that is built on top of the PyMC probabilistic programming framework and the ArviZ package for exploratory analysis of Bayesian models. Bambi makes it easy to specify complex generalized linear hierarchical models using a formula notation similar to those found in R. We demonstrate Bambi’s versatility and ease of use with a few examples spanning a range of common statistical models including multiple regression, logistic regression, and mixed-effects modeling with crossed group specific effects. Additionally we discuss how automatic priors are constructed. Finally, we conclude with a discussion of our plans for the future development of Bambi.
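A brief usage example of the formula interface described above: a mixed-effects logistic regression with a group-specific intercept, fit on synthetic data invented for illustration.

```python
import bambi as bmb
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "correct": rng.integers(0, 2, 200),          # binary outcome
    "difficulty": rng.normal(size=200),          # continuous predictor
    "subject": rng.integers(0, 10, 200).astype(str),
})

# R-style formula with a crossed group-specific intercept per subject.
model = bmb.Model("correct ~ difficulty + (1|subject)", df, family="bernoulli")
idata = model.fit(draws=1000, chains=4)          # returns ArviZ InferenceData
```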
Article
Full-text available
How do neural populations code for multiple, potentially conflicting tasks? Here we used computational simulations involving neural networks to define "lazy" and "rich" coding solutions to this context-dependent decision-making problem, which trade off learning speed for robustness. During lazy learning the input dimensionality is expanded by random projections to the network hidden layer, whereas in rich learning hidden units acquire structured representations that privilege relevant over irrelevant features. For context-dependent decision-making, one rich solution is to project task representations onto low-dimensional and orthogonal manifolds. Using behavioral testing and neuroimaging in humans and analysis of neural signals from macaque prefrontal cortex, we report evidence for neural coding patterns in biological brains whose dimensionality and neural geometry are consistent with the rich learning regime.
Article
Full-text available
Humans spend a lifetime learning, storing and refining a repertoire of motor memories. For example, through experience, we become proficient at manipulating a large range of objects with distinct dynamical properties. However, it is unknown what principle underlies how our continuous stream of sensorimotor experience is segmented into separate memories and how we adapt and use this growing repertoire. Here we develop a theory of motor learning based on the key principle that memory creation, updating and expression are all controlled by a single computation—contextual inference. Our theory reveals that adaptation can arise both by creating and updating memories (proper learning) and by changing how existing memories are differentially expressed (apparent learning). This insight enables us to account for key features of motor learning that had no unified explanation: spontaneous recovery¹, savings², anterograde interference³, how environmental consistency affects learning rate⁴,⁵ and the distinction between explicit and implicit learning⁶. Critically, our theory also predicts new phenomena—evoked recovery and context-dependent single-trial learning—which we confirm experimentally. These results suggest that contextual inference, rather than classical single-context mechanisms¹,⁴,⁷⁻⁹, is the key principle underlying how a diverse set of experiences is reflected in our motor behaviour.
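A compact sketch of the abstract's central distinction, under one possible reading: memories are updated in proportion to inferred context responsibilities (proper learning), while motor output re-weights their expression by the same responsibilities (apparent learning). Forms and numbers are invented for illustration.

```python
import numpy as np

memories = np.array([0.0, 0.0])      # one motor memory per context
p_ctx = np.array([0.9, 0.1])         # inferred responsibility of each context
alpha = 0.2

error = 1.0 - p_ctx @ memories       # motor error on this trial
memories += alpha * p_ctx * error    # proper learning: responsibility-weighted
output = p_ctx @ memories            # apparent learning: expression weights
print(memories, output)
```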
Article
Full-text available
SciPy is an open-source scientific computing library for the Python programming language. Since its initial release in 2001, SciPy has become a de facto standard for leveraging scientific algorithms in Python, with over 600 unique code contributors, thousands of dependent packages, over 100,000 dependent repositories and millions of downloads per year. In this work, we provide an overview of the capabilities and development practices of SciPy 1.0 and highlight some recent technical developments. This Perspective describes the development and capabilities of SciPy 1.0, an open source scientific computing library for the Python programming language.
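Two everyday examples of the kind of scientific algorithms the library exposes (standard SciPy calls, shown only to illustrate the abstract's claim):

```python
from scipy import optimize, stats

# Minimize a scalar function and run an independent-samples t-test.
res = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)
print(res.x)                                     # ~2.0
print(stats.ttest_ind([1, 2, 3], [2, 3, 4]).pvalue)
```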
Article
Full-text available
Prefrontal cortex (PFC) is thought to support the ability to focus on goal-relevant information by filtering out irrelevant information, a process akin to dimensionality reduction. Here, we test this dimensionality reduction hypothesis by relating a data-driven approach to characterizing the complexity of neural representation with a theoretically-supported computational model of learning. We find evidence of goal-directed dimensionality reduction within human ventromedial PFC during learning. Importantly, by using computational predictions of each participant’s attentional strategies during learning, we find that the degree of neural compression predicts an individual’s ability to selectively attend to concept-specific information. These findings suggest a domain-general mechanism of learning through compression in ventromedial PFC. Efficient learning is akin to goal-directed dimensionality reduction, in which relevant information is highlighted and irrelevant input is ignored. Here, the authors show that ventromedial prefrontal cortex uniquely supports such learning by compressing neural codes to represent goal-specific information.
Article
Full-text available
Many models of classical conditioning fail to describe important phenomena, notably the rapid return of fear after extinction. To address this shortfall, evidence converged on the idea that learning agents rely on latent-state inferences, i.e. an ability to index disparate associations from cues to rewards (or penalties) and infer which index (i.e. latent state) is presently active. Our goal was to develop a model of latent-state inferences that uses latent states to predict rewards from cues efficiently and that can describe behavior in a diverse set of experiments. The resulting model combines a Rescorla-Wagner rule, for which updates to associations are proportional to prediction error, with an approximate Bayesian rule, for which beliefs in latent states are proportional to prior beliefs and an approximate likelihood based on current associations. In simulation, we demonstrate the model’s ability to reproduce learning effects both famously explained and not explained by the Rescorla-Wagner model, including rapid return of fear after extinction, the Hall-Pearce effect, partial reinforcement extinction effect, backwards blocking, and memory modification. Lastly, we derive our model as an online algorithm for maximum likelihood estimation, demonstrating that it is an efficient approach to outcome prediction. Establishing such a framework is a key step towards quantifying normative and pathological ranges of latent-state inferences in various contexts.
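The two ingredients named in the abstract can be sketched compactly. These are generic textbook forms with invented parameters, not the authors' exact equations.

```python
import numpy as np

n_states, alpha, noise = 2, 0.1, 0.5
V = np.array([0.2, 0.8])          # cue->reward association in each latent state
belief = np.ones(n_states) / n_states

def trial(reward, V, belief):
    # Approximate Bayesian rule: belief ~ prior x likelihood of the outcome
    # under each state's current association.
    lik = np.exp(-0.5 * ((reward - V) / noise) ** 2)
    belief = belief * lik
    belief /= belief.sum()
    # Rescorla-Wagner rule: prediction-error update, weighted by state belief.
    V = V + alpha * belief * (reward - V)
    return V, belief

for r in [1, 1, 1, 0, 0, 0]:      # acquisition, then extinction
    V, belief = trial(r, V, belief)
print(V, belief)  # extinction shifts belief away from the high-value state
                  # while largely sparing its association ("return of fear")
```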
Article
The curse of dimensionality plagues models of reinforcement learning and decision making. The process of abstraction solves this by constructing variables describing features shared by different instances, reducing dimensionality and enabling generalization in novel situations. Here, we characterized neural representations in monkeys performing a task described by different hidden and explicit variables. Abstraction was defined operationally using the generalization performance of neural decoders across task conditions not used for training, which requires a particular geometry of neural representations. Neural ensembles in prefrontal cortex, hippocampus, and simulated neural networks simultaneously represented multiple variables in a geometry reflecting abstraction but that still allowed a linear classifier to decode a large number of other variables (high shattering dimensionality). Furthermore, this geometry changed in relation to task events and performance. These findings elucidate how the brain and artificial systems represent variables in an abstract format while preserving the advantages conferred by high shattering dimensionality.
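The operational definition of abstraction used here, the generalization performance of neural decoders across held-out task conditions, can be illustrated on synthetic data. The population model and all names below are invented for the sketch.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
axes = rng.normal(size=(2, 20))          # fixed coding axes: [value, context]

def pop(value, context, n=50):
    """Hypothetical population responses for one (value, context) condition."""
    mean = np.array([value, context], float) @ axes
    return mean + rng.normal(scale=0.5, size=(n, 20))

# Train a value decoder in context 0; test it in the held-out context 1.
X_tr = np.vstack([pop(0, 0), pop(1, 0)]); y_tr = [0] * 50 + [1] * 50
X_te = np.vstack([pop(0, 1), pop(1, 1)]); y_te = [0] * 50 + [1] * 50
acc = LogisticRegression().fit(X_tr, y_tr).score(X_te, y_te)
print("cross-condition generalization accuracy:", acc)   # near 1.0 here
```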