Article

Abstract

The prefrontal cortex (PFC) supports goal-directed actions and exerts cognitive control over behavior, but the underlying coding and mechanisms are heavily debated. We present evidence for the role of goal-coding in the PFC from two converging perspectives: computational modeling and neuronal-level analysis of monkey data. We show that neural representations of prospective goals emerge by combining a categorization process that extracts relevant behavioral abstractions from the input data and a reward-driven process that selects candidate categories depending on their adaptive value; both forms of learning have a plausible neural implementation in the PFC. Our analyses demonstrate a fundamental principle: goal-coding represents an efficient solution to cognitive control problems, analogous to efficient coding principles in other (e.g., visual) brain areas. The novel analytical-computational approach is of general interest since it applies to a variety of neurophysiological studies.
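The following is a deliberately schematic sketch, not the paper's model, of the two ingredients the abstract describes: an unsupervised categorization step that compresses input patterns into a few candidate categories, and a reward-driven step that weights each category by its adaptive value. The data-generating rule, gains, and all variable names are illustrative assumptions.

```python
# Toy sketch: categorization + reward-driven selection of candidate goal codes.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))              # input patterns (e.g., task conditions)
reward = (X[:, 0] > 0).astype(float)       # reward depends on a hidden task variable

# (i) categorization: crude online k-means into K candidate categories
K, centers = 4, rng.normal(size=(4, 4))
for x in X:
    k = np.argmin(((centers - x) ** 2).sum(axis=1))
    centers[k] += 0.05 * (x - centers[k])

# (ii) reward-driven selection: estimate each category's value by running average
values = np.zeros(K)
counts = np.zeros(K)
for x, r in zip(X, reward):
    k = np.argmin(((centers - x) ** 2).sum(axis=1))
    counts[k] += 1
    values[k] += (r - values[k]) / counts[k]

print("category values (candidate goal codes):", np.round(values, 2))
```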


... The first component could represent the bodily states and serve to drive actions, while the second one could represent the state of other objects (mostly targets to interact with), which can be internally encoded in the joint angles space as well (the reason for this particular encoding will be clear later). These targets could be observed, but they could also be imagined or set by higher-level cognitive control frontal areas such as the PFC or PMd (Genovesio et al., 2012; Stoianov et al., 2016). ...
... Indeed, it seems that the PPC explicitly encodes and maintains such goals during the whole unfolding of sequential actions (Baldauf et al., 2008). A specific goal is selected among other competing intentions, possibly under the control of the PFC and PMd (Stoianov et al., 2016), and fulfilled by setting it as the predominant belief trajectory, which acts as an attractor with a strong gain (see Equations 44, 26 and Figure 4). For example, in a typical reaching task, the goal of reaching a specific visual target corresponds to the future expectation that the agent's arm will be over that target; thus, if the agent maintains a belief over the latter, the corresponding intention links the expected belief over the future body posture with the inferred target, expressed in joint angles, encoding a specific interaction to realize. ...
... The latter, known as a "mental number line" (Stoianov et al., 2008), could be an interesting hypothesis to explore also in the context of feature coding in continuous Active Inference. Currently, distributed coding is used only in discrete Active Inference and other probabilistic models to investigate computationally high-level cognitive functions such as planning, navigation, and control (Stoianov et al., 2016, 2022; Pezzulo et al., 2018). The two theories also differ in the nature of the input to their dynamic systems. ...
Article
Full-text available
We present a normative computational theory of how the brain may support visually-guided goal-directed actions in dynamically changing environments. It extends the Active Inference theory of cortical processing according to which the brain maintains beliefs over the environmental state, and motor control signals try to fulfill the corresponding sensory predictions. We propose that the neural circuitry in the Posterior Parietal Cortex (PPC) computes flexible intentions—or motor plans from a belief over targets—to dynamically generate goal-directed actions, and we develop a computational formalization of this process. A proof-of-concept agent embodying visual and proprioceptive sensors and an actuated upper limb was tested on target-reaching tasks. The agent behaved correctly under various conditions, including static and dynamic targets, different sensory feedbacks, sensory precisions, intention gains, and movement policies; limit conditions were identified as well. Active Inference driven by dynamic and flexible intentions can thus support goal-directed behavior in constantly changing environments, and the PPC might putatively host its core intention mechanism. More broadly, the study provides a normative computational basis for research on goal-directed behavior in end-to-end settings and further advances mechanistic theories of active biological systems.
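As a minimal sketch (not the authors' implementation) of how a flexible intention could drive reaching under continuous Active Inference: the belief over arm joint angles is pulled both by proprioceptive evidence and by an intention (attractor) set to the inferred target with a tunable gain, while action makes the body fulfil the belief. The two-joint arm, gains, and step sizes are all illustrative assumptions.

```python
# Hypothetical intention-driven reaching loop under a continuous Active Inference scheme.
import numpy as np

def reach(target, steps=600, dt=0.05, k_prop=1.0, k_intent=0.5):
    """Illustrative 2-joint reaching; all gains and names are assumptions."""
    q = np.array([0.2, 0.1])        # true joint angles (the "body")
    mu = q.copy()                   # belief over joint angles
    for _ in range(steps):
        prop = q                            # proprioceptive observation (noiseless here)
        eps_prop = prop - mu                # sensory prediction error
        eps_int = target - mu               # intention (goal) prediction error
        mu += dt * (k_prop * eps_prop + k_intent * eps_int)   # belief update
        q += dt * k_prop * (mu - q)         # action: make the body fulfil the belief
    return q, mu

final_q, final_mu = reach(target=np.array([0.9, -0.4]))
print("final joint angles (should approach the target):", np.round(final_q, 2))
```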
Preprint
Full-text available
Abstract: We present a normative computational theory of how neural circuitry may support visually-guided goal-directed actions in a dynamic environment. The model builds on Active Inference, in which perception and motor control signals are inferred through dynamic minimization of generalized prediction errors. The Posterior Parietal Cortex (PPC) is proposed to maintain constantly updated expectations, or beliefs over the environmental state, and by manipulating them through flexible intentions it is involved in dynamically generating goal-directed actions. In turn, the Dorsal Visual Stream (DVS) and the proprioceptive pathway implement generative models that translate the high-level belief into sensory-level predictions to infer targets, posture, and motor commands. A proof-of-concept agent embodying visual and proprioceptive sensors and an actuated upper limb was tested on target-reaching tasks. The agent behaved correctly under various conditions, including static and dynamic targets, different sensory feedbacks, sensory precisions, intention gains, and movement policies; limit conditions were identified as well. Active Inference driven by dynamic and flexible intentions can thus support goal-directed behavior in constantly changing environments, and the PPC putatively hosts its core intention mechanism. More broadly, the study provides a normative basis for research on goal-directed behavior in end-to-end settings and further advances mechanistic theories of active biological systems.
... The topic of learning therefore consists of various aspects of cognitive development. It is important to take into account that different aspects of learning progress differently and mostly do not affect cognitive development in a perfectly linear fashion (Luft & Buitrago, 2005; Murre, 2014; Schneider & Chein, 2003; Stoianov, Genovesio, & Pezzulo, 2016). Even though cognitive development does not necessarily progress in a linear fashion, it is clear that important cognitive changes take place during learning, transitioning from more controlled and demanding processing to less demanding automation, which can potentially be detected by neurophysiological measures. ...
... To this aim, we tested the effect of task part, the interaction between feedback and task part, and the interaction between task part and whether or not there was a switch in feedback. In order to assess learning, we tested changes over time, as learning is regarded as a process occurring over time (Luft & Buitrago, 2005; Murre, 2014; Stoianov et al., 2016). We specifically analyzed the task parts instead of the blocks as (switch in) feedback was manipulated for task parts. ...
... The topic of learning consists of various aspects of development in cognition. It is important to take into account that different aspects of learning progress differently and mostly do not affect cognitive development in a linear fashion (Luft & Buitrago, 2005;Murre, 2014;Schneider & Chein, 2003;Stoianov et al., 2016). Even though cognitive development does not necessarily progress in a linear fashion, it is clear that important cognitive changes take place during learning, transitioning from more controlled and demanding processing to less demanding automation, which can be detected by neurophysiological measures. ...
Book
Full-text available
Up till now, there has been an unresolved discussion in the literature regarding the validity of non-invasive neurophysiological measures in learning. On the one hand, studies have shown promise for these measures in learning (Krigolson et al., 2015; Lai et al., 2013; Leff et al., 2011), while on the other hand, there has been caution for the use of such measures (Ansari et al., 2011; Brouwer et al., 2014; Cowley, 2015; Dahlstrom-Hakki et al., 2019). This dissertation aimed to address this discussion. To this aim, this dissertation focused on experimentally examining non-invasive neurophysiological changes during learning and factors that influence these changes. Additionally, this dissertation focused on providing insight into how to move towards applying these measures validly and effectively in a wide range of settings, not only in the laboratory but also in real-world contexts. Considering all of these findings together, it becomes clear that understanding the assessment of learning through neurophysiology requires an understanding of the interplay between learning, neurophysiology, behavior, individual differences, and task-related aspects. Comprehending this complex interaction is key to resolving the discussion regarding the validity of non-invasive neurophysiological measures in learning. As the reported findings demonstrate that non-invasive neurophysiology can provide insight into learning, the discussion should not be focussed on whether neurophysiological measures are able to assess learning, but on how to obtain valid assessments across different learning tasks and across different trainees. Although it is clear that further development and research are needed for large-scale application of neurophysiology in learning and training, the potential of neurophysiology is expected to increase as the field advances (see Chapter 8 for a more in-depth discussion). Industry could benefit from being involved in future endeavors to move the field forward. Vice versa, development and research can move forward in promising directions when taking into account the needs and experiences of the industry. The embedding of the work presented in this dissertation within the CAMPIONE project highlights how fundamental research can provide valuable contributions to application. Even though fundamental research may sometimes seem to be far removed from application, understanding the fundamentals will ultimately lead to the most valid and reliable application. I am looking forward to seeing future research contribute to our knowledge about assessment of learning through non-invasive neurophysiological measures and to seeing application of neurophysiology in training and education advance. This dissertation has paved the way and I hope many scholars and other professionals will follow up on the presented work.
... Which of these two models is more plausible? The answer to this question depends on the true causal structure of the environment (Collins and Frank, 2013;Stoianov et al., 2015). If features are independent of each other and different combinations of features need to be considered to select an appropriate action, then adding a latent variable does not help as either (i) the action selection process would be less accurate or (ii) the latent variable would require at least as many states as combinations over features to achieve a sufficient level of accuracy, thus inducing a larger number of model parameters (and hence more complexity). ...
... Our theory is also closely related to some recent computational models of cognitive control (Collins and Frank, 2013;Stoianov et al., 2015). These implementations conceive task sets (i.e., the mapping from stimuli to action associated with reward) as latent causes that need to be inferred from the observation of stimuli (called context) with a Dirichlet process (Anderson, 1990;Griffiths et al., 2007). ...
... In addition, these models propose an explicit mechanism for learning (i.e., the Dirichlet process). Here we pursue a different argument, as we appeal to the principles of Bayesian model selection to justify certain characteristics of the generative model, with some characteristics common to the other theories (Collins and Frank, 2013; Stoianov et al., 2015) and other characteristics peculiar to our formulation (for instance, the inclusion of multiple perceptual features and the hierarchical organization of latent causes). ...
Article
Full-text available
Categorization is a fundamental ability for efficient behavioral control. It allows organisms to remember the correct responses to categorical cues rather than to every stimulus encountered (hence eluding computational cost or complexity), and to generalize appropriate responses to novel stimuli dependent on category assignment. Assuming the brain performs Bayesian inference, based on a generative model of the external world and future goals, we propose a computational model of categorization in which important properties emerge. These properties comprise the ability to infer latent causes of sensory experience, a hierarchical organization of latent causes, and an explicit inclusion of context and action representations. Crucially, these aspects derive from considering the environmental statistics that are relevant to achieve goals, and from the fundamental Bayesian principle that any generative model should be preferred over alternative models based on an accuracy-complexity trade-off. Our account is a step toward elucidating computational principles of categorization and its role within the Bayesian brain hypothesis.
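A toy sketch (not the paper's model) of the accuracy-complexity trade-off that favours categorization: two candidate models of a stimulus-reward mapping are scored with a BIC-style approximation to log evidence, one with a reward parameter per stimulus and one with a parameter per latent category. The data-generating rule and all names are assumptions made for illustration.

```python
# BIC-style model comparison: per-stimulus model vs. per-category model.
import numpy as np

rng = np.random.default_rng(1)
n = 400
stimulus = rng.integers(0, 8, size=n)          # 8 distinct stimuli
category = stimulus % 2                        # latent structure: 2 categories
reward = rng.random(n) < np.where(category == 0, 0.8, 0.2)

def bic(groups, n_params):
    """BIC-style score: maximized Bernoulli log-likelihood minus complexity penalty."""
    ll = 0.0
    for g in np.unique(groups):
        r = reward[groups == g]
        p = np.clip(r.mean(), 1e-6, 1 - 1e-6)
        ll += (r * np.log(p) + (1 - r) * np.log(1 - p)).sum()
    return ll - 0.5 * n_params * np.log(n)

print("per-stimulus model:", round(bic(stimulus, 8), 1))
print("per-category model:", round(bic(category, 2), 1))   # higher score = preferred
```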
... A second research avenue is to relax the separation of the timescales between the two levels, by selecting their inputs (e.g., level 1 takes all sensory observations as inputs, whereas level 2 only considers reward observations, and in particular, observation 1 in Figure 2b). In the future, it would be interesting to explore methods to learn hierarchical models with multiple timescales (Yamashita and Tani, 2008; Hinton et al., 2006) and effective state spaces for navigation and for task rules in self-supervised (and/or reward-guided) ways, as shown in prior work (Stoianov et al., 2022, 2018, 2016; Niv, 2019). A third promising challenge is to avoid having the agent learn from scratch each new maze or rule. ...
... A third promising challenge is to avoid having the agent learn from scratch each new maze or rule. Recent work in transfer learning shows that it is possible to reuse existing cognitive maps or latent task representations to learn novel and similar tasks much faster (Stoianov et al., 2022;Guntupalli et al., 2023;Stoianov et al., 2016;Swaminathan et al., 2023). Extending our architecture with transfer learning abilities would be important to provide more accurate models of how animals learn cognitive maps, especially given the strong evidence for the reuse of existing neural sequences and cognitive maps in the hippocampus (Liu et al., 2019a;Farzanfar et al., 2023). ...
Preprint
Full-text available
Cognitive problem-solving benefits from cognitive maps aiding navigation and planning. Previous studies revealed that cognitive maps for physical space navigation involve hippocampal (HC) allocentric codes, while cognitive maps for abstract task space engage medial prefrontal cortex (mPFC) task-specific codes. Solving challenging cognitive tasks requires integrating these two types of maps. This is exemplified by spatial alternation tasks in multi-corridor settings, where animals like rodents are rewarded upon executing an alternation pattern in maze corridors. Existing studies demonstrated the HC-mPFC circuit's engagement in spatial alternation tasks and that its disruption impairs task performance. Yet, a comprehensive theory explaining how this circuit integrates task-related and spatial information is lacking. We advance a novel hierarchical active inference model clarifying how the HC-mPFC circuit enables the resolution of spatial alternation tasks, by merging physical and task-space cognitive maps. Through a series of simulations, we demonstrate that the model's dual layers acquire effective cognitive maps for navigation within physical (HC map) and task (mPFC map) spaces, using a biologically-inspired approach: a clone-structured cognitive graph. The model solves spatial alternation tasks through reciprocal interactions between the two layers. Importantly, disrupting inter-layer communication impairs difficult decisions, consistent with empirical findings. The same model showcases the ability to switch between multiple alternation rules. However, inhibiting message transmission between the two layers results in perseverative behavior, consistent with empirical findings. In summary, our model provides a mechanistic account of how the HC-mPFC circuit supports spatial alternation tasks and how its disruption impairs task performance.
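The following is a highly simplified stand-in (assumptions only, not the clone-structured model described above) for the two-layer idea: an "mPFC" task layer tracks where the agent is in the alternation rule, and an "HC" spatial layer is asked to reach the corridor that the task layer prescribes. Cutting the inter-layer message makes the agent perseverate on one corridor.

```python
# Minimal two-layer alternation sketch with an optional "lesion" of the inter-layer message.
rule = ["left", "center", "right", "center"]   # spatial alternation pattern

def run(trials=8, lesion=False):
    task_state, choices = 0, []
    for _ in range(trials):
        if lesion:
            goal = "left"                       # no top-down message: perseveration
        else:
            goal = rule[task_state]             # mPFC -> HC: which corridor to reach
        choices.append(goal)                    # HC layer "navigates" to the goal
        rewarded = goal == rule[task_state]
        if rewarded:                            # HC -> mPFC: outcome advances the task state
            task_state = (task_state + 1) % len(rule)
    return choices

print("intact:  ", run())
print("lesioned:", run(lesion=True))
```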
... From a computational perspective, the possibility to group experiences into separate contexts might be key to preventing catastrophic forgetting of episodes. This idea is in keeping with a large body of literature showing that context learning improves robustness and generalization (Collins and Frank, 2013; Gershman et al., 2010; Heald et al., 2021; Sanders et al., 2020; Stoianov et al., 2015) and that architectures that use different, non-overlapping units for different tasks, such as cloned HMM (Rikhye et al., 2020) and others (Masse et al., 2018), are especially effective for continual learning. ...
... Here we describe an extension of the previous model that automatically infers K. For this, we use an adaptation of the Chinese Restaurant Process (CRP) prior over clustering of observations that adapts model complexity to data by creating new clusters as the model observes novel data (Gershman and Blei, 2012;Stoianov et al., 2015). Note that the CRP prior is applied only during navigation, on true data, and not during replay. ...
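A minimal sketch of a Chinese Restaurant Process assignment step of the kind alluded to above: each new observation joins an existing cluster with probability proportional to that cluster's size, or opens a new cluster with probability proportional to a concentration parameter alpha. The likelihood terms that a full model would include are omitted; names and parameters are illustrative.

```python
# CRP prior sketch: grow the number of clusters as novel data arrive.
import numpy as np

def crp_assignments(n_obs, alpha=1.0, seed=0):
    rng = np.random.default_rng(seed)
    counts = []                               # customers per table (cluster sizes)
    labels = []
    for _ in range(n_obs):
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)   # existing cluster or a brand-new one
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        labels.append(int(k))
    return labels, counts

labels, counts = crp_assignments(50)
print("number of clusters inferred:", len(counts))
```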
Article
We advance a novel computational theory of the hippocampal formation as a hierarchical generative model that organizes sequential experiences, such as rodent trajectories during spatial navigation, into coherent spatiotemporal contexts. We propose that the hippocampal generative model is endowed with inductive biases to identify individual items of experience (first hierarchical layer), organize them into sequences (second layer) and cluster them into maps (third layer). This theory entails a novel characterization of hippocampal reactivations as generative replay: the offline resampling of fictive sequences from the generative model, which supports the continual learning of multiple sequential experiences. We show that the model learns and efficiently retains multiple spatial navigation trajectories, by organizing them into spatial maps. Furthermore, the model reproduces flexible and prospective aspects of hippocampal dynamics that are challenging to explain within existing frameworks. This theory reconciles multiple roles of the hippocampal formation in map-based navigation, episodic memory and imagination.
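A compact sketch of the "generative replay" idea described above (illustrative, not the authors' hierarchical implementation): a first-order transition model is learned from experienced trajectories, and fictive sequences are then resampled offline from that model, which could be interleaved with new data to keep an old map from being overwritten during continual learning. The trajectory, state space, and smoothing are assumptions.

```python
# Learn a transition model from experience, then resample fictive sequences from it.
import numpy as np

rng = np.random.default_rng(2)
trajectory = [0, 1, 2, 3, 2, 1, 0, 1, 2, 3]          # experienced states (e.g., places)
n_states = 4
T = np.ones((n_states, n_states))                     # transition counts (+1 smoothing)
for s, s_next in zip(trajectory[:-1], trajectory[1:]):
    T[s, s_next] += 1

def generative_replay(start, length=6):
    """Resample a fictive sequence from the learned transition model."""
    seq, s = [start], start
    for _ in range(length - 1):
        p = T[s] / T[s].sum()
        s = int(rng.choice(n_states, p=p))
        seq.append(s)
    return seq

print("fictive replay sequence:", generative_replay(start=0))
```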
... In addition to its neurons involved in predicting trial outcomes, lesions of M2 cause an increase in reaction time, which suggests an involvement in movement preparation (Smith et al. 2010). Additionally, M2 neurons encode the action selected before movement onset in a value-based directional choice task (Sul et al. 2011). These findings suggest that primary movement encoding as well as learning and trial outcome can be reflected in these structures. ...
... At this point, our results provide strong evidence that this representation exists and can be modulated on a trial-by-trial basis. Furthermore, future investigation should also examine the possible top-down influence of the prefrontal cortex on attention and learning, and possible interactions between high-level areas and low-level representations to explain generalization of learning (Knudsen 2007;Sarter et al. 2006;Stoianov et al. 2016). ...
Article
To better understand the neural cortical underpinnings that explain behavioral differences in learning rate, we recorded single-unit activity in primary motor (M1) and secondary motor (M2) areas while rats learned to perform a directional (left or right) operant visuomotor association task. Analysis of neural activity during the early portion of the cue period showed that neural modulation in the motor cortex was most strongly associated with two task factors: the previous trial outcome (success or error) and the current trial’s directional choice (left or right). Furthermore, the fast learners, defined as those who had steeper learning curves and required fewer learning sessions to reach criterion performance, encoded the previous trial outcome factor more strongly than the directional choice factor. Conversely, the slow learners encoded directional choice more strongly than previous trial outcome. These differences in task factor encoding were observed in both the percentage of neurons and the neural modulation depth. These results suggest that fast learning is accompanied by a stronger component of previous trial outcome in the modulation representation present in motor cortex, which therefore may be a contributing factor to behavioral differences in learning rate. NEW & NOTEWORTHY We chronically recorded neural activity as rats learned a visuomotor directional choice task from a naive state. Learning rates varied. Single-unit neural modulation of two motor areas revealed that the fast learners encoded previous trial outcome more strongly than directional choice, whereas the reverse was true for slow learners. This finding provides novel evidence that rat learning rate is strongly correlated with the strength of neural modulation by previous trial outcome in motor cortex.
... For example, animals could group different experimental conditions into one single context, if the strategy to be taken is essentially the same. Albeit in a different domain, the question of "when to create a new context (or task set)" has been recently posed in a series of computational studies that use Bayesian nonparametrics coupled with reward contingencies (Collins and Koechlin 2012;Collins and Frank 2013;Stoianov et al. 2015;Maisto et al. 2016). It emerges from these studies that it is often valuable to group contexts in behaviorally relevant ways, not just according to perceptual similarity of the items; for example, in such a way that a change in context necessarily signals a change in the "rule" to be applied or the strategy to be followed (Stoianov et al. 2015). ...
... Albeit in a different domain, the question of "when to create a new context (or task set)" has been recently posed in a series of computational studies that use Bayesian nonparametrics coupled with reward contingencies (Collins and Koechlin 2012; Collins and Frank 2013; Stoianov et al. 2015; Maisto et al. 2016). It emerges from these studies that it is often valuable to group contexts in behaviorally relevant ways, not just according to perceptual similarity of the items; for example, in such a way that a change in context necessarily signals a change in the "rule" to be applied or the strategy to be followed (Stoianov et al. 2015). In keeping with this, an animal might be aware that a change in the environment occurred, but might mark this event as a change of context only if it implies a change in behavioral strategy. ...
Article
Balancing habitual and deliberate forms of choice entails a comparison of their respective merits: the former being faster but inflexible, and the latter slower but more versatile. Here, we show that arbitration between these two forms of control can be derived from first principles within an Active Inference scheme. We illustrate our arguments with simulations that reproduce rodent spatial decisions in T-mazes. In this context, deliberation has been associated with vicarious trial and error (VTE) behavior (i.e., the fact that rodents sometimes stop at decision points as if deliberating between choice alternatives), whose neurophysiological correlates are "forward sweeps" of hippocampal place cells in the arms of the maze under consideration. Crucially, forward sweeps arise early in learning and disappear shortly after, marking a transition from deliberative to habitual choice. Our simulations show that this transition emerges as the optimal solution to the trade-off between policies that maximize reward or extrinsic value (habitual policies) and those that also consider the epistemic value of exploratory behavior (deliberative or epistemic policies), the latter requiring VTE and the retrieval of episodic information via forward sweeps. We thus offer a novel perspective on the optimality principles that engender forward sweeps and VTE, and on their role in deliberate choice.
... A wide literature in computational modeling shows that once identified, such latent variables can be used for multiple purposes. For example, latent variables identified with a model that only has access to behavioral data can be used as predictors of neuronal activity, on a trial-by-trial basis [95][96][97]. ...
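A small sketch of the analysis strategy mentioned above: a latent variable estimated from behaviour alone (here, a running value estimate from a toy delta rule) is used as a trial-by-trial regressor for neural activity. The data, learning rule, firing-rate model, and parameters are all simulated and hypothetical.

```python
# Latent variable from a behavioural model used as a trial-by-trial neural predictor.
import numpy as np

rng = np.random.default_rng(3)
n_trials = 200
rewards = rng.random(n_trials) < 0.7

# Latent variable from a behavioural model: running value estimate (delta rule).
value, latent = 0.5, []
for r in rewards:
    latent.append(value)
    value += 0.1 * (r - value)
latent = np.array(latent)

# Synthetic spike counts that (noisily) track the latent variable.
spikes = rng.poisson(2.0 + 5.0 * latent)

# Trial-by-trial regression of activity on the behaviourally derived latent variable.
slope, intercept = np.polyfit(latent, spikes, 1)
print(f"regression slope: {slope:.2f}  (positive -> activity tracks the latent value)")
```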
Article
Psychology and neuroscience are concerned with the study of behavior, of internal cognitive processes, and their neural foundations. However, most laboratory studies use constrained experimental settings that greatly limit the range of behaviors that can be expressed. While focusing on restricted settings ensures methodological control, it risks impoverishing the object of study: by restricting behavior, we might miss key aspects of cognitive and neural functions. In this article, we argue that psychology and neuroscience should increasingly adopt innovative experimental designs, measurement methods, analysis techniques and sophisticated computational models to probe rich, ecologically valid forms of behavior, including social behavior. We discuss the challenges of studying rich forms of behavior as well as the novel opportunities offered by state-of-the-art methodologies and new sensing technologies, and we highlight the importance of developing sophisticated formal models. We exemplify our arguments by reviewing some recent streams of research in psychology, neuroscience and other fields (e.g., sports analytics, ethology and robotics) that have addressed rich forms of behavior in a model-based manner. We hope that these "success cases" will encourage psychologists and neuroscientists to extend their toolbox of techniques with sophisticated behavioral models - and to use them to study rich forms of behavior as well as the cognitive and neural processes that they engage.
... Concept learning spans several areas of inquiry relevant to both neuropsychology and computational neuroscience: the way (biological) agents form concepts, how they interpret context and content, what it means to represent concepts and relationships between different elements within a context or between contexts, what similarity means, how humans categorise environments, objects, and their elements into distinct entities, what role memory plays, what counts as relevant information, and so on. A growing body of work in concept formation and structure learning employs computational frameworks, such as non-parametric Bayesian models [25-29], where generative models are equipped with an extendable space. The focus in this instance is on whether to incorporate additional components into the generative model, and at what point. ...
Article
Full-text available
Humans display astonishing skill in learning about the environment in which they operate. They assimilate a rich set of affordances and interrelations among different elements in particular contexts, and form flexible abstractions (i.e., concepts) that can be generalised and leveraged with ease. To capture these abilities, we present a deep hierarchical Active Inference model of goal-directed behaviour, and the accompanying belief update schemes implied by maximising model evidence. Using simulations, we elucidate the potential mechanisms that underlie and influence concept learning in a spatial foraging task. We show that the representations formed–as a result of foraging–reflect environmental structure in a way that is enhanced and nuanced by Bayesian model reduction, a special case of structure learning that typifies learning in the absence of new evidence. Synthetic agents learn associations and form concepts about environmental context and configuration as a result of inferential, parametric learning, and structure learning processes–three processes that can produce a diversity of beliefs and belief structures. Furthermore, the ensuing representations reflect symmetries for environments with identical configurations.
... The topic of learning consists of various aspects of development in cognition. It is important to take into account that different aspects of learning progress differently and mostly do not affect cognitive development in a linear fashion (Schneider and Chein, 2003;Luft and Buitrago, 2005;Murre, 2014;Stoianov et al., 2016). Even though cognitive development does not necessarily progress in a linear fashion, it is clear that important cognitive changes take place during learning, transitioning from more controlled and demanding processing to less demanding automation, which can be detected by neurophysiological measures. ...
Article
Full-text available
Although many scholars deem non-invasive measures of neurophysiology to have promise in assessing learning, these measures are currently not widely applied in either educational settings or training. How can non-invasive neurophysiology provide insight into learning, and how should research on this topic move forward to ensure valid applications? The current article addresses these questions by discussing the mechanisms underlying neurophysiological changes during learning, followed by a SWOT (strengths, weaknesses, opportunities, and threats) analysis of non-invasive neurophysiology in learning and training. This type of analysis can provide a structured examination of factors relevant to the current state and future of a field. The findings of the SWOT analysis indicate that the field of neurophysiology in learning and training is developing rapidly. By leveraging the opportunities of neurophysiology in learning and training (while bearing in mind weaknesses, threats, and strengths) the field can move forward in promising directions. Suggestions for opportunities for future work are provided to ensure valid and effective application of non-invasive neurophysiology in a wide range of learning and training settings.
... Here we describe an extension of the previous model that automatically infers K. For this, we used an adaptation of the Chinese Restaurant Process (CRP) prior over clustering of observations that adapts model complexity to data by creating novel clusters as it observes novel data [79], [80]. Note that the CRP prior was applied only during navigation, on true data, and not during replay. ...
Preprint
Full-text available
We advance a novel computational theory of the hippocampal formation as a hierarchical generative model that organizes sequential experiences, such as rodent trajectories during spatial navigation, into coherent spatiotemporal contexts. We propose that the hippocampal generative model is endowed with inductive biases to pattern-separate individual items of experience (first hierarchical layer), organize them into sequences (second layer) and cluster them into maps (third layer). This theory entails a novel characterization of hippocampal reactivations as generative replay: the offline resampling of fictive sequences from the generative model, which supports the continual learning of multiple sequential experiences. We show that the model learns and efficiently retains multiple spatial navigation trajectories, by organizing them into spatial maps. Furthermore, the hierarchical model reproduces flexible and prospective aspects of hippocampal dynamics that are challenging to explain within existing frameworks. This theory reconciles multiple roles of the hippocampal formation in map-based navigation, episodic memory and imagination.
... Active Inference is a corollary of the free energy principle that casts decision-making and behaviour as a minimisation of variational free energy (or equivalently, a maximisation of model evidence or marginal likelihood). This means that perception and action (or policy) selection are treated as inference problems [2,5,16,25,42,46,55,58,70,76,77,81]. Action selection implies evaluating the quality of a policy (or action sequence) π for each possible state an agent could be in, which corresponds to calculating the (negative) expected free energy of π, or Gπ. ...
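A schematic computation (assumptions only, not a full Active Inference implementation) of the expected free energy G of a policy in a discrete setting, written as the sum of risk (divergence of predicted outcomes from preferred outcomes) and ambiguity (expected entropy of the likelihood). The likelihood matrix, preferences, and predicted state distributions below are made up for illustration.

```python
# Expected free energy of a policy: risk + ambiguity.
import numpy as np

def expected_free_energy(q_s, A, log_c):
    """q_s: predicted state distribution under the policy;
       A: likelihood P(o|s), columns are states; log_c: log preferred outcomes."""
    q_o = A @ q_s                                           # predicted outcomes
    risk = np.sum(q_o * (np.log(q_o + 1e-12) - log_c))      # KL[q(o) || p(o)]
    H_A = -np.sum(A * np.log(A + 1e-12), axis=0)            # entropy of P(o|s) per state
    ambiguity = H_A @ q_s                                   # expected ambiguity
    return risk + ambiguity

A = np.array([[0.9, 0.2],      # P(o=reward | s)
              [0.1, 0.8]])     # P(o=no reward | s)
log_c = np.log(np.array([0.95, 0.05]))    # preferences: reward strongly preferred
print("G(policy leading to s1):", round(expected_free_energy(np.array([1.0, 0.0]), A, log_c), 3))
print("G(policy leading to s2):", round(expected_free_energy(np.array([0.0, 1.0]), A, log_c), 3))
```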
Article
Full-text available
A popular distinction in the human and animal learning literature is between deliberate (or willed) and habitual (or automatic) modes of control. Extensive evidence indicates that, after sufficient learning, living organisms develop behavioural habits that permit them to save computational resources. Furthermore, humans and other animals are able to transfer control from deliberate to habitual modes (and vice versa), efficiently trading off flexibility and parsimony – an ability that is currently unparalleled by artificial control systems. Here, we discuss a computational implementation of habit formation, and the transfer of control from deliberate to habitual modes (and vice versa), within Active Inference: a computational framework that merges aspects of cybernetic theory and of Bayesian inference. To model habit formation, we endow an Active Inference agent with a mechanism to "cache" (or memorize) policy probabilities from previous trials, and reuse them to skip – in part or in full – the inferential steps of deliberative processing. We exploit the fact that the relative quality of policies, conditioned upon hidden states, is constant over trials, provided that contingencies and prior preferences do not change. This means the only quantity that can change policy selection is the prior distribution over the initial state – where this prior is based upon the posterior beliefs from previous trials. Thus, an agent that caches the quality (or the probability) of policies can safely reuse cached values to save on cognitive and computational resources – unless contingencies change. Our simulations illustrate the computational benefits, but also the limits, of three caching schemes under Active Inference. They suggest that key aspects of habitual behaviour – such as perseveration – can be explained in terms of caching policy probabilities. Furthermore, they suggest that there may be many kinds (or stages) of habitual behaviour, each associated with a different caching scheme; for example, caching associated or not associated with contextual estimation. These schemes are more or less impervious to contextual and contingency changes.
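A minimal sketch of the caching idea in the abstract above: policy probabilities computed once for a given context are stored and reused on later trials, skipping deliberation unless the context is novel. The deliberation function here is a random stand-in, not the Active Inference computation itself; context names and the cache structure are assumptions.

```python
# Cache policy probabilities per context and reuse them habitually.
import numpy as np

policy_cache = {}

def deliberate(context):
    """Expensive stand-in for policy evaluation (e.g., expected free energy)."""
    rng = np.random.default_rng(hash(context) % (2**32))
    q = rng.random(3)
    return q / q.sum()

def select_policy(context):
    if context not in policy_cache:            # deliberate only on novel contexts
        policy_cache[context] = deliberate(context)
    q = policy_cache[context]                  # habitual reuse of cached probabilities
    return int(np.argmax(q))

for trial_context in ["maze-A", "maze-A", "maze-B", "maze-A"]:
    print(trial_context, "-> policy", select_policy(trial_context))
```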
... In sum, adapting to uncertain environments may require the convergence of cognitive, motivational, and bodily factors, which jointly conspire to make the animal fit to its niche. Exploratory and novelty-seeking behavior need not necessarily be treated in terms of "bonuses," "hopes," or other correctives of a quintessentially reward-seeking behavior, but emerge from normative considerations within belief-based approaches, namely, planning as (active) inference (Attias 2003;Botvinick & Toussaint 2012;Donnarumma et al. 2016;Friston et al. 2016b;Maisto et al. 2015;Pezzulo et al. 2013;Pezzulo & Rigoli 2011;Stoianov et al. 2015;2018). ...
Article
Full-text available
In this commentary, we discuss how the “incentive hope” hypothesis explains differences in food-wasting behaviors among humans. We stress that the role of relevant ecological characteristics should be taken into consideration together with the incentive hope hypothesis: population mobility, animal domestication, and food-wasting visibility.
... In sum, adapting to uncertain environments may require the convergence of cognitive, motivational, and bodily factors, which jointly conspire to make the animal fit to its niche. Exploratory and novelty-seeking behavior need not necessarily be treated in terms of "bonuses," "hopes," or other correctives of a quintessentially reward-seeking behavior, but emerge from normative considerations within belief-based approaches, namely, planning as (active) inference (Attias 2003;Botvinick & Toussaint 2012;Donnarumma et al. 2016;Friston et al. 2016b;Maisto et al. 2015;Pezzulo et al. 2013;Pezzulo & Rigoli 2011;Stoianov et al. 2015;2018). ...
Article
Information seeking, especially when motivated by strategic learning and intrinsic curiosity, could render the new mechanism “incentive hope” proposed by Anselme & Güntürkün sufficient, but not necessary to explain how reward uncertainty promotes reward seeking and consumption. Naturalistic and foraging-like tasks can help parse motivational processes that bridge learning and foraging behaviors and identify their neural underpinnings.
... In sum, adapting to uncertain environments may require the convergence of cognitive, motivational, and bodily factors, which jointly conspire to make the animal fit to its niche. Exploratory and novelty-seeking behavior need not necessarily be treated in terms of "bonuses," "hopes," or other correctives of a quintessentially reward-seeking behavior, but emerge from normative considerations within belief-based approaches, namely, planning as (active) inference (Attias 2003;Botvinick & Toussaint 2012;Donnarumma et al. 2016;Friston et al. 2016b;Maisto et al. 2015;Pezzulo et al. 2013;Pezzulo & Rigoli 2011;Stoianov et al. 2015;2018). ...
Article
Our target article proposes that a new concept – incentive hope – is necessary in the behavioral sciences to explain animal foraging under harsh environmental conditions. Incentive hope refers to a specific motivational mechanism in the brain – considered only in mammals and birds. But it can also be understood at a functional level, as an adaptive behavioral strategy that contributes to improving survival. Thus, this concept is an attempt to bridge across different research fields such as behavioral psychology, reward neuroscience, and behavioral ecology. Many commentaries suggest that incentive hope could even help understand phenomena beyond these research fields, including food wasting and food sharing, mental energy conservation, diverse psychopathologies, irrational decisions in invertebrates, and some aspects of evolution by means of sexual selection. We are favorable to such extensions because incentive hope denotes an unconscious process capable of working against many forms of adversity; organisms do not need to hope as a subjective feeling, but to behave as if they had this feeling. In our response, we carefully discuss each suggestion and criticism and reiterate the importance of having a theory accounting for motivation under reward uncertainty.
... In sum, adapting to uncertain environments may require the convergence of cognitive, motivational, and bodily factors, which jointly conspire to make the animal fit to its niche. Exploratory and novelty-seeking behavior need not necessarily be treated in terms of "bonuses," "hopes," or other correctives of a quintessentially reward-seeking behavior, but emerge from normative considerations within belief-based approaches, namely, planning as (active) inference (Attias 2003;Botvinick & Toussaint 2012;Donnarumma et al. 2016;Friston et al. 2016b;Maisto et al. 2015;Pezzulo et al. 2013;Pezzulo & Rigoli 2011;Stoianov et al. 2015;2018). ...
Article
Poverty-related food insecurity can be viewed as a form of economic and nutritional uncertainty that can lead, in some situations, to a desire for more filling and satisfying food. Given the current obesogenic food environment and the nature of the food supply, those food choices could engage a combination of sensory, neurophysiological, and genetic factors as potential determinants of obesity.
... Remarkably, the way computational costs are formalized across these different studies is very consistent, despite their different approaches. In fact, whether one starts from an inference problem, in which the evidence for a model is maximized given some data (Genewein et al., 2015; Kingma and Welling, 2013; Tishby et al., 2000), or whether one is more generally attempting to minimize the entropy of future states (Friston, 2010), or whether one takes a decision-making perspective, in which expected utility is maximized (Ortega et al., 2015), or even from the point of view of thermodynamics (Ortega and Braun, 2013; Sengupta et al., 2013), computational cost is framed as a measure of divergence between an initial belief (or prior probability distribution over a variable of interest x, such as expected reward) and an updated belief (or posterior probability distribution over the same variable x) obtained after receiving new data (Kappen et al., 2012; Maisto et al., 2016, 2015; Polani, 2009; Stoianov et al., 2016; Tishby and Polani, 2011). This measure of difference between probability distributions, called the Kullback-Leibler (KL) divergence, represents the amount of information one needs to collect in order to update the prior to the posterior (KL(P||Q) = Σ_x P(x) log [P(x)/Q(x)] for probability distributions P and Q). ...
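A direct illustration of the information cost just quoted: the Kullback-Leibler divergence of an updated (posterior) belief from the prior belief over the same variable. The example distributions are arbitrary assumptions.

```python
# KL divergence as an information cost of belief updating.
import numpy as np

def kl_divergence(p, q):
    """KL(P || Q) = sum_x P(x) log(P(x) / Q(x)), in nats."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log(p / q)))

prior = [0.25, 0.25, 0.25, 0.25]       # belief before the task (e.g., over rewards)
posterior = [0.70, 0.10, 0.10, 0.10]   # belief after observing task data
print("information cost of the update (nats):", round(kl_divergence(posterior, prior), 3))
```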
Article
In statistics and machine learning, model accuracy is traded off with complexity, which can be viewed as the amount of information extracted from the data. Here, we discuss how cognitive costs can be expressed in terms of similar information costs, i.e. as a function of the amount of information required to update a person's prior knowledge (or internal model) to effectively solve a task. We then examine the theoretical consequences that ensue from this assumption. This framework naturally explains why some tasks (for example, unfamiliar or dual tasks) are costly and permits quantifying these costs using information-theoretic measures. Finally, we discuss brain implementation of this principle and show that subjective cognitive costs can originate either from local or global capacity limitations on information processing or from an increased rate of metabolic alterations. These views shed light on the potential adaptive value of cost-avoidance mechanisms.
... Using end-to-end learning (i.e., from perception to action) helps keep the various learning procedures (state space, action-state and state-value learning) coordinated and is more effective and biologically realistic than using a staged approach (still popular in RL) in which unsupervised state space learning is seen as a generic preprocessing phase. From a computational perspective, this approach can be considered to be a novel, model-based extension of a family of Bayesian methods that have been successfully applied to decision-making problems [53-57]. It is also important to note that our approach does not require learning a separate state space (or sensory mapping) for each goal; rather, multiple spatial goals share the same state space, which implies that our MB-RL agent can deal natively with multiple goals. ...
Article
Full-text available
While the neurobiology of simple and habitual choices is relatively well known, our current understanding of goal-directed choices and planning in the brain is still limited. Theoretical work suggests that goal-directed computations can be productively associated to model-based (reinforcement learning) computations, yet a detailed mapping between computational processes and neuronal circuits remains to be fully established. Here we report a computational analysis that aligns Bayesian nonparametrics and model-based reinforcement learning (MB-RL) to the functioning of the hippocampus (HC) and the ventral striatum (vStr)–a neuronal circuit that is increasingly recognized as an appropriate model system for understanding goal-directed (spatial) decisions and planning mechanisms in the brain. We test the MB-RL agent in a contextual conditioning task that depends on intact hippocampus and ventral striatal (shell) function and show that it solves the task while showing key behavioral and neuronal signatures of the HC—vStr circuit. Our simulations also explore the benefits of biological forms of look-ahead prediction (forward sweeps) during both learning and control. This article thus contributes to fill the gap between our current understanding of computational algorithms and biological realizations of (model-based) reinforcement learning.
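A toy sketch (not the paper's agent) of a model-based "forward sweep": from the current state, candidate actions are evaluated by simulating a few steps ahead through a learned transition model and checking which rollout reaches the goal. The line-world, transition model, discount, and depth are assumptions for illustration.

```python
# Forward-sweep lookahead through a learned (here, deterministic) transition model.
n_states, goal = 5, 4

def model(state, action):
    """Learned model on a line: action 0 = step left, action 1 = step right."""
    return max(0, state - 1) if action == 0 else min(n_states - 1, state + 1)

def forward_sweep(state, depth=3):
    """Return the value (discounted goal proximity) of each action via lookahead."""
    values = []
    for a in (0, 1):
        s, value = model(state, a), 0.0
        for step in range(depth):
            if s == goal:
                value = 0.9 ** step          # discounted reward if the sweep hits the goal
                break
            s = model(s, 1 if s < goal else 0)   # greedy continuation of the sweep
        values.append(value)
    return values

print("action values from state 2:", forward_sweep(2))
```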
... In sum, adapting to uncertain environments may require the convergence of cognitive, motivational, and bodily factors, which jointly conspire to make the animal fit to its niche. Exploratory and novelty-seeking behavior need not necessarily be treated in terms of "bonuses," "hopes," or other correctives of a quintessentially reward-seeking behavior, but emerge from normative considerations within belief-based approaches, namely, planning as (active) inference (Attias 2003;Botvinick & Toussaint 2012;Donnarumma et al. 2016;Friston et al. 2016b;Maisto et al. 2015;Pezzulo et al. 2013;Pezzulo & Rigoli 2011;Stoianov et al. 2015;2018). ...
Article
Food uncertainty has the effect of invigorating food-related responses. Psychologists have noted that mammals and birds respond more to a conditioned stimulus that unreliably predicts food delivery, and ecologists have shown that animals (especially small passerines) consume and/or hoard more food and can get fatter when access to that resource is unpredictable. Are these phenomena related? We think they are. Psychologists have proposed several mechanistic interpretations, while ecologists have suggested a functional interpretation: the effect of unpredictability on fat reserves and hoarding behavior is an evolutionary strategy acting against the risk of starvation when food is in short supply. Both perspectives are complementary, and we argue that the psychology of incentive motivational processes can shed some light on the causal mechanisms leading animals to seek and consume more food under uncertainty in the wild. Our theoretical approach is in agreement with neuroscientific data relating to the role of dopamine, a neurotransmitter strongly involved in incentive motivation, and its plausibility has received some explanatory and predictive value with respect to Pavlovian phenomena. Overall, we argue that the occasional and unavoidable absence of food rewards has motivational effects (called incentive hope) that facilitate foraging effort. It is shown that this hypothesis is computationally tenable, leading foragers in an unpredictable environment to consume more food items and to have higher long-term energy storage than foragers in a predictable environment.
... Our proposal emphasizes the centrality of goals and goal directedness for motivated control [10,14,29,42,54,71-79]. The rationale for deep goal hierarchies is to generate, prioritize (i.e., raise the precision and incentive salience) and achieve goals at multiple levels of abstraction, not to trigger simpler-to-more-complex stimulus-response mappings. ...
Article
Full-text available
Motivated control refers to the coordination of behaviour to achieve affectively valenced outcomes or goals. The study of motivated control traditionally assumes a distinction between control and motivational processes, which map to distinct (dorsolateral versus ventromedial) brain systems. However, the respective roles and interactions between these processes remain controversial. We offer a novel perspective that casts control and motivational processes as complementary aspects (goal propagation and prioritization, respectively) of active inference and hierarchical goal processing under deep generative models. We propose that the control hierarchy propagates prior preferences or goals, but their precision is informed by the motivational context, inferred at different levels of the motivational hierarchy. The ensuing integration of control and motivational processes underwrites action and policy selection and, ultimately, motivated behaviour, by enabling deep inference to prioritize goals in a context-sensitive way.
... In other words, we have avoided many important questions about the construction and exploration of model spaces in the absence of a full model (Gershman & Niv, 2010;Navarro & Perfors, 2011;Collins & Frank, 2013;Tervo et al., 2016). This calls on things like nonparametric Bayesian methods that have been used to model cognitive control over learning; e.g., (Collins & Koechlin, 2012;Collins & Frank, 2013) and the emergence of goal codes (Stoianov, Genovesio, & Pezzulo, 2016). Indeed, this theoretical line of thinking has enabled neuroimaging studies to identify the functional (prefrontal cortical) anatomy of structure learning in terms of "hypothesis testing for accepting versus rejecting newly created strategies" (Donoso et al., 2014). ...
Article
Full-text available
This article offers a formal account of curiosity and insight in terms of active (Bayesian) inference. It deals with the dual problem of inferring states of the world and learning its statistical structure. In contrast to current trends in machine learning (e.g., deep learning), we focus on how people attain insight and understanding using just a handful of observations, which are solicited through curious behavior. We use simulations of abstract rule learning and approximate Bayesian inference to show that minimizing (expected) variational free energy leads to active sampling of novel contingencies. This epistemic behavior closes explanatory gaps in generative models of the world, thereby reducing uncertainty and satisfying curiosity. We then move from epistemic learning to model selection or structure learning to show how abductive processes emerge when agents test plausible hypotheses about symmetries (i.e., invariances or rules) in their generative models. The ensuing Bayesian model reduction evinces mechanisms associated with sleep and has all the hallmarks of "aha" moments. This formulation moves toward a computational account of consciousness in the pre-Cartesian sense of sharable knowledge (i.e., con: "together"; scire: "to know").
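A compact way to see how epistemic and pragmatic imperatives combine in this framework is the usual decomposition of expected free energy into risk (divergence of predicted outcomes from preferred outcomes) plus ambiguity (expected observation entropy). The sketch below is a generic, minimal version of that computation for a discrete model, not the rule-learning simulations reported in the article; the likelihood and preference values are illustrative.

```python
import numpy as np

def expected_free_energy(q_states, likelihood, log_preferences):
    """G = risk + ambiguity for one policy and time step.
    q_states: predicted state distribution under the policy, shape (S,)
    likelihood: P(o|s), shape (O, S)
    log_preferences: log P(o) encoding preferred outcomes, shape (O,)"""
    q_obs = likelihood @ q_states                           # predicted outcome distribution
    risk = np.sum(q_obs * (np.log(q_obs + 1e-16) - log_preferences))
    entropy_per_state = -np.sum(likelihood * np.log(likelihood + 1e-16), axis=0)
    ambiguity = q_states @ entropy_per_state                # expected observation entropy
    return risk + ambiguity

likelihood = np.array([[0.9, 0.1],      # outcome 0 is likely in state 0
                       [0.1, 0.9]])     # outcome 1 is likely in state 1
log_C = np.log([0.8, 0.2])              # outcome 0 is preferred

# Policy A leads mostly to the preferred, unambiguous state; policy B leaves states uncertain.
for name, q in [("A", np.array([0.9, 0.1])), ("B", np.array([0.5, 0.5]))]:
    print(name, round(expected_free_energy(q, likelihood, log_C), 3))
```

Policies that both reach preferred outcomes and reduce uncertainty receive lower expected free energy and are therefore favoured.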
... Information theory measures such as mutual information (e.g., between input stimuli and neural activation) allow one to analyze the information content of spike trains of a neuron (e.g., whether a neuron carries information concerning the color or size of a visual stimulus) (175) or a population of neurons. For example, an analysis based on conditional mutual information revealed that specific neurons in monkey prefrontal cortex carry information about the prospective action goal that is unconfounded by sensory characteristics (176). The decoding of neural activity is another powerful approach for understanding information processing in the brain. ...
Article
Living systems exhibit remarkable abilities to self-assemble, regenerate, and remodel complex shapes. How cellular networks construct and repair specific anatomical outcomes is an open question at the heart of the next-generation science of bioengineering. Developmental bioelectricity is an exciting emerging discipline that exploits endogenous bioelectric signaling among many cell types to regulate pattern formation. We provide a brief overview of this field, review recent data in which bioelectricity is used to control patterning in a range of model systems, and describe the molecular tools being used to probe the role of bioelectrics in the dynamic control of complex anatomy. We suggest that quantitative strategies recently developed to infer semantic content and information processing from ionic activity in the brain might provide important clues to cracking the bioelectric code. Gaining control of the mechanisms by which large-scale shape is regulated in vivo will drive transformative advances in bioengineering, regenerative medicine, and synthetic morphology, and could be used to therapeutically address birth defects, traumatic injury, and cancer.
... The importance of these models becomes evident if one considers that internal models develop while the agent learns to interact with the external environment and to exercise its mastery and control over it. Encoding the statistics of external stimuli is not sufficient for this; what would be more useful, for example, is modelling the way external inputs are sampled, categorizing sensory-motor events in ways that afford goal achievement, or recognizing similarities in task space rather than (only) in stimulus space [140–142]. The importance of drive- and goal-related processes to internal modelling and learning becomes even more evident in that the agent's models develop in close cooperation with the process of fulfilling internal allostatic processes, and then progressively afford the realization of increasingly more abstract goal states [65]. ...
Article
Full-text available
There is an on-going debate in cognitive (neuro) science and philosophy between classical cognitive theory and embodied, embedded, extended, and enactive (“4-Es”) views of cognition—a family of theories that emphasize the role of the body in cognition and the importance of brain-body-environment interaction over and above internal representation. This debate touches foundational issues, such as whether the brain internally represents the external environment, and “infers” or “computes” something. Here we focus on two (4-Es-based) criticisms to traditional cognitive theories—to the notions of passive perception and of serial information processing—and discuss alternative ways to address them, by appealing to frameworks that use, or do not use, notions of internal modelling and inference. Our analysis illustrates that: an explicitly inferential framework can capture some key aspects of embodied and enactive theories of cognition; some claims of computational and dynamical theories can be reconciled rather than seen as alternative explanations of cognitive phenomena; and some aspects of cognitive processing (e.g., detached cognitive operations, such as planning and imagination) that are sometimes puzzling to explain from enactive and non-representational perspectives can, instead, be captured nicely from the perspective that internal generative models and predictive processing mediate adaptive control loops.
... Our results can be interpreted within biologically motivated models of choice in which model-free mechanisms mediate decisions in well-known contexts, but more flexible model-based mechanisms are called upon when the choice context changes, for example when the usually rewarded branch in a T-maze is devalued or task contingencies change 6,9,32,53–58. The rationale for invoking a model-based system after contextual changes is that it permits the choice offers to be evaluated anew, thus avoiding stereotyped behaviour that would be maladaptive. ...
Article
Full-text available
During decisions, animals balance goal achievement and effort management. Despite physical exercise and fatigue significantly affecting the levels of effort that an animal exerts to obtain a reward, their role in effort-based choice and the underlying neurochemistry are incompletely known. In particular, it is unclear whether fatigue influences decision (cost-benefit) strategies flexibly or only post-decision action execution and learning. To answer this question, we trained mice on a T-maze task in which they chose between a high-cost, high-reward arm (HR), which included a barrier, and a low-cost, low-reward arm (LR), with no barrier. The animals were parametrically fatigued immediately before the behavioural tasks by running on a treadmill. We report a sharp choice reversal, from the HR to LR arm, at 80% of their peak workload (PW), which was temporary and specific, as the mice returned to choosing the HR arm when they were subsequently tested at 60% PW or in a two-barrier task. These rapid reversals are signatures of flexible choice. We also observed increased subcortical dopamine levels in fatigued mice: a marker of individual bias to use model-based control in humans. Our results indicate that fatigue levels can be incorporated in flexible cost-benefit computations that improve foraging efficiency.
... The proposed model uses probabilistic (Bayesian) inference, in keeping with a number of recent studies applying the same principles across many domains of cognitive science, including action, perception, decision, and social interaction [8,19,21,22,24,25,36–38,41,60,70,78,80,81,87,101,102,110]. A key reason for using probabilistic schemes across all these domains is that they are all plagued by various sources of uncertainty, and Bayesian inference makes it possible to handle uncertainty in a rigorous way. ...
Article
Full-text available
Turn-taking is a preverbal skill whose mastery constitutes an important precondition for many social interactions and joint actions. However, the cognitive mechanisms supporting turn-taking abilities are still poorly understood. Here, we propose a computational analysis of turn-taking in terms of two general mechanisms supporting joint actions: action prediction (e.g., recognizing the interlocutor’s message and predicting the end of turn) and signaling (e.g., modifying one’s own speech to make it more predictable and discriminable). We test the hypothesis that in a simulated conversational scenario dyads using these two mechanisms can recognize the utterances of their co-actors faster, which in turn permits them to give and take turns more efficiently. Furthermore, we discuss how turn-taking dynamics depend on the fact that agents cannot simultaneously use their internal models for both action (or message) prediction and production, as these have different requirements—or, in other words, they cannot speak and listen at the same time with the same level of accuracy. Our results provide a computational-level characterization of turn-taking in terms of cognitive mechanisms of action prediction and signaling that are shared across various interaction and joint action domains.
... In computational motor control, it is widely assumed that action representations stem from (probabilistic) internal models (Wolpert et al., 2003; Jeannerod, 2006; Shadmehr and Krakauer, 2008; Friston et al., 2010, 2017; Pezzulo et al., 2015, in press; Donnarumma et al., 2016; Maisto et al., 2016; Stoianov et al., 2016). These models can be hierarchical, with higher hierarchical levels encoding more abstract and distal aspects and lower hierarchical levels encoding more proximal aspects that are related to action performance. ...
Article
Full-text available
Humans excel at recognizing (or inferring) another's distal intentions, and recent experiments suggest that this may be possible using only subtle kinematic cues elicited during early phases of movement. Still, the cognitive and computational mechanisms underlying the recognition of intentional (sequential) actions are incompletely known and it is unclear whether kinematic cues alone are sufficient for this task, or if it instead requires additional mechanisms (e.g., prior information) that may be more difficult to fully characterize in empirical studies. Here we present a computationally-guided analysis of the execution and recognition of intentional actions that is rooted in theories of motor control and the coarticulation of sequential actions. In our simulations, when a performer agent coarticulates two successive actions in an action sequence (e.g., “reach-to-grasp” a bottle and “grasp-to-pour”), he automatically produces kinematic cues that an observer agent can reliably use to recognize the performer's intention early on, during the execution of the first part of the sequence. This analysis lends computational-level support for the idea that kinematic cues may be sufficiently informative for early intention recognition. Furthermore, it suggests that the social benefits of coarticulation may be a byproduct of a fundamental imperative to optimize sequential actions. Finally, we discuss possible ways a performer agent may combine automatic (coarticulation) and strategic (signaling) ways to facilitate, or hinder, an observer's action recognition processes.
... Another problem with a visual extrapolation explanation is that it is not immediately clear why eye movements should go proactively to the object (and not, for example, any future predicted location before the object) without a notion that grasping the object is the agent's goal. While it may not be mandatory to engage the (generative model of the) motor system to solve this specific task, doing so would automatically produce an advance understanding of the situation that speaks to one's own action goals ("motor understanding"); in turn, this may have additional benefits such as segmenting action observation in meaningful (e.g., goal- and subgoal-related) ways (Donnarumma, Maisto, & Pezzulo, 2016; Stoianov, Genovesio, & Pezzulo, 2015) and permitting fast planning of complementary or adversarial actions in social settings (Pezzulo, Iodice, Donnarumma, Dindo, & Knoblich, 2017). ...
Article
Full-text available
We present a novel computational model that describes action perception as an active inferential process that combines motor prediction (the reuse of our own motor system to predict perceived movements) and hypothesis testing (the use of eye movements to disambiguate amongst hypotheses). The system uses a generative model of how (arm and hand) actions are performed to generate hypothesis-specific visual predictions, and directs saccades to the most informative places of the visual scene to test these predictions – and underlying hypotheses. We test the model using eye movement data from a human action observation study. In both the human study and our model, saccades are proactive whenever context affords accurate action prediction; but uncertainty induces a more reactive gaze strategy, via tracking the observed movements. Our model offers a novel perspective on action observation that highlights its active nature based on prediction dynamics and hypothesis testing.
... (Stoianov, Genovesio, & Pezzulo, 2016), permits the monitoring of environmental uncertainty and either exerts stronger, but flexible, inhibitory control in safe conditions or reduced inhibitory regulation in other situations to allow automatic defense mechanisms to regulate behavior (Thayer, Ahs, Fredrikson, Sollers, & Wager, 2012). ...
Article
Full-text available
Performance and injury prevention in elite soccer players are typically investigated from physical-tactical, biomechanical and metabolic perspectives. However, executive functions, visuospatial abilities, and psychophysiological adaptability or resilience are also fundamental for efficiency and wellbeing in sports. Based on previous research associating autonomic flexibility with prefrontal cortical control, we designed a novel integrated autonomic biofeedback training method called "Neuroplus" to improve resilience, visual attention and injury prevention. Herein, we introduce the method and provide an evaluation of twenty elite soccer players from the Italian Soccer High Division (Serie-A): ten players trained with Neuroplus and ten trained with a control treatment. The assessments included psychophysiological stress profiles, a visual search task and indexes of injury prevention, which were measured pre- and post-treatment. The analysis showed a significant enhancement of physiological adaptability, recovery following stress, visual selective attention and injury prevention that were specific to the Neuroplus group. Enhancing the interplay between autonomic and cognitive functions through biofeedback may become a key principle for obtaining excellence and wellbeing in sports. To our knowledge, this is the first evidence that shows improvement in visual selective attention following intense autonomic biofeedback.
... One can use methods such as Bayesian inference to infer probabilistically the latter from the former [46]. Deep neural networks [47] and Bayesian non-parametric methods [48,49] are widely used in machine learning and computational neuroscience to infer latent states of this kind directly from data. ...
Article
Full-text available
It is widely assumed in developmental biology and bioengineering that optimal understanding and control of complex living systems follows from models of molecular events. The success of reductionism has overshadowed attempts at top-down models and control policies in biological systems. However, other fields, including physics, engineering and neuroscience, have successfully used explanations and models at higher levels of organization, including least-action principles in physics and control-theoretic models in computational neuroscience. Exploiting the dynamic regulation of pattern formation in embryogenesis and regeneration requires new approaches to understand how cells cooperate towards large-scale anatomical goal states. Here, we argue that top-down models of pattern homeostasis serve as proof of principle for extending the current paradigm beyond emergence and molecule-level rules. We define top-down control in a biological context, discuss the examples of how cognitive neuroscience and physics exploit these strategies, and illustrate areas in which they may offer significant advantages as complements to the mainstream paradigm. By targeting system controls at multiple levels of organization and demystifying goal-directed (cybernetic) processes, top-down strategies represent a roadmap for using the deep insights of other fields for transformative advances in regenerative medicine and systems bioengineering.
... The independence of coding of the two absolute magnitudes thus originates as early as their initial representation and is maintained while calculating the relative value. In this series of studies, goal encoding appears as the first magnitude-independent representation, consistent with goal generation and monitoring as important functions of the PFdl (Falcone et al. 2015; Genovesio et al. 2006a, 2008, 2014a; Genovesio and Ferraina 2014; Kusunoki et al. 2009; Rainer et al. 1999; Tsujimoto 2008) and with the proposed function of goal coding as a general organizational principle in the PF (Stoianov et al. 2015). ...
Article
Estimates of space and time can interfere with each other, and neuroimaging studies have shown overlapping activation in the parietal and prefrontal cortical areas. We used duration and distance discrimination tasks to determine whether space and time share resources in prefrontal cortex (PF) neurons. Monkeys were required to report which of two sequentially presented stimuli, a red circle or a blue square, was longer in duration (duration task) or farther in distance (distance task). In a previous study, we showed that relative duration and distance are coded by different populations of neurons and that the only common representation is related to goal coding. Here, we examined the coding of absolute duration and distance. Our results support a model of independent coding of absolute duration and distance metrics by demonstrating that not only relative magnitude but also absolute magnitude are independently coded in the PF. New & noteworthy: Human behavioral studies have shown that spatial and duration judgments can interfere with each other. We investigated the neural representation of such magnitudes in the prefrontal cortex. We found that the two magnitudes are independently coded by prefrontal neurons. We suggest that the interference among magnitude judgments might depend on the goal rather than on perceptual resource sharing.
... Other predictive models could be developed; the generative model illustrated above is very simple and does not take advantage of the internal degrees of freedom. A key generalization will be integrating planning mechanisms that may allow, for example, the robot to proactively avoid obstacles or collisions during movement, or, more generally, to consider future (predicted) and not only currently sensed contingencies [17,52–56]. Planning mechanisms have been described under the active inference scheme and can solve challenging problems such as the mountain-car problem [5], and can thus be seamlessly integrated in the model presented here, speaking to the scalability of the active inference scheme. ...
Article
Full-text available
Active inference is a general framework for perception and action that is gaining prominence in computational and systems neuroscience but is less known outside these fields. Here, we discuss a proof-of-principle implementation of the active inference scheme for the control of the 7-DoF arm of a (simulated) PR2 robot. By manipulating visual and proprioceptive noise levels, we show under which conditions robot control under the active inference scheme is accurate. Besides accurate control, our analysis of the internal system dynamics (e.g. the dynamics of the hidden states that are inferred during the inference) sheds light on key aspects of the framework such as the quintessentially multimodal nature of control and the differential roles of proprioception and vision. In the discussion, we consider the potential importance of being able to implement active inference in robots. In particular, we briefly review the opportunities for modelling psychophysiological phenomena such as sensory attenuation and related failures of gain control, of the sort seen in Parkinson's disease. We also consider the fundamental difference between active inference and optimal control formulations, showing that in the former the heavy lifting shifts from solving a dynamical inverse problem to creating deep forward or generative models with dynamics, whose attracting sets prescribe desired behaviours.
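The basic control loop described here, in which beliefs are updated by precision-weighted prediction errors and action changes the world so that proprioception matches the prediction, can be sketched in one dimension. This is a deliberately minimal illustration of the general scheme, not the paper's 7-DoF PR2 implementation; the gains, precisions, and noise levels below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

target = 1.0          # desired arm position, acting as an attractor in the generative model
x_true = 0.0          # actual arm position (hidden state of the plant)
mu = 0.0              # internal belief about the arm position
pi_prop, pi_vis, pi_goal = 1.0, 1.0, 0.5    # precisions of the three prediction errors
dt, k_mu, k_a = 0.05, 1.0, 1.0

for step in range(400):
    prop = x_true + 0.01 * rng.standard_normal()    # proprioceptive observation
    vis  = x_true + 0.05 * rng.standard_normal()    # visual observation (noisier)
    # Prediction errors: sensory (observation minus prediction) and dynamical (goal attractor)
    e_prop, e_vis, e_goal = prop - mu, vis - mu, target - mu
    # Belief update: gradient descent on free energy (precision-weighted errors)
    mu += dt * k_mu * (pi_prop * e_prop + pi_vis * e_vis + pi_goal * e_goal)
    # Action: move the real arm so that proprioception fulfils the prediction
    x_true += dt * k_a * (-pi_prop * e_prop)

print(f"belief mu = {mu:.3f}, true position = {x_true:.3f}, target = {target}")
```

Raising or lowering the sensory precisions in this sketch changes how strongly each modality drives the belief, which is the kind of manipulation the paper explores with visual and proprioceptive noise.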
... Although these reports consistently indicate the presence of anticipatory activity in areas that are predictive of future choice, there are contrasting results in neurophysiology. While some previous neurophysiological studies have found some evidence (Maoz et al., 2013), others have failed to provide similar evidence for the prefrontal cortex (PF; Kim and Shadlen, 1999; Katsuki et al., 2014), notwithstanding its function in goal-encoding (Tanji and Hoshi, 2001; Mushiake et al., 2006; Genovesio et al., 2008, 2014a; Yamagata et al., 2012; Genovesio and Ferraina, 2014; Falcone et al., 2015; Stoianov et al., 2016), its activation during free-choice tasks in humans (Rowe et al., 2005; Thimm et al., 2012) and the possibility of biasing target selection by electrical stimulation (Opris et al., 2005). The latter suggests that PF activity during and, likely, before presentation of a stimulus influences future choices when the correct choice is not dictated by external instructions or rules. ...
Article
Full-text available
When choices are made freely, they might emerge from pre-existing neural activity. However, whether neurons in the prefrontal cortex (PF) show this anticipatory effect and, if so, in which part of the process they are involved is still debated. To answer this question, we studied PF activity in monkeys while they performed a strategy task. In this task, when the stimulus changed from the previous trial, the monkeys had to shift their response to one of two spatial goals, excluding the one that had been previously selected. Under this free-choice condition, the prestimulus activity of the same neurons that are involved in decision and motor processes predicted future choices. These neurons developed the same goal preferences during the prestimulus period as they did later in the decision phase. In contrast, the same effect was not observed in motor-only neurons, and it was present but weaker in decision-only neurons. Overall, our results suggest that PF neuronal activity predicts upcoming actions mainly through a decision-making network that integrates decision-related and motor-related aspects of the task over time.
... We should look at cognitive (and brain) functions, including the most advanced (or "higher-cognitive") functions, within an interactive, control-theoretic framework, as activities that an organism performs in interaction with its environment, rather than in terms of modular computational operations over discrete symbols independent of perception and action systems. This idea can be traced back to early theories in cybernetics, pragmatism, and ecological psychology (Ashby 1952; Craik 1943; Gibson 1977; Wiener 1948) and has often been reproposed, in slightly different forms, in disciplines such as cognitive science, neuroscience, robotics, and philosophy (Cisek 1999; Cisek & Kalaska 2010; Clark 1998; Engel et al. 2013; Pezzulo 2011; Pezzulo & Castelfranchi 2009; Pezzulo et al. 2015; Pfeifer & Scheier 1999; Scott 2012; Stoianov et al. 2016; Varela et al. 1992). contributes to this debate both theoretically and empirically. ...
Article
Full-text available
Neural reuse is a form of neuroplasticity whereby neural elements originally developed for one purpose are put to multiple uses. A diverse behavioral repertoire is achieved by means of the creation of multiple, nested, and overlapping neural coalitions, in which each neural element is a member of multiple different coalitions and cooperates with a different set of partners at different times. Neural reuse has profound implications for how we think about our continuity with other species, for how we understand the similarities and differences between psychological processes, and for how best to pursue a unified science of the mind. After Phrenology: Neural Reuse and the Interactive Brain (Anderson 2014; henceforth After Phrenology in this Précis) surveys the terrain and advocates for a series of reforms in psychology and cognitive neuroscience. The book argues that, among other things, we should capture brain function in a multidimensional manner, develop a new, action-oriented vocabulary for psychology, and recognize that higher-order cognitive processes are built from complex configurations of already evolved circuitry.
... We presented a novel computational theory of human problem solving that is based on probabilistic inference augmented with a subgoaling mechanism. Probabilistic inference methods are increasingly used to explain a variety of cognitive, perceptual and motor tasks, including goal-directed decisions and planning [17,23,30,34,37,38,87]. Here we show that probabilistic inference, when enhanced with a subgoaling mechanism, can explain various aspects of human problem solving, too, including its idiosyncrasies and deficits, such as the human sensitivity to the structure of the problem space, and patient deficits in handling counterintuitive moves and goal-subgoal conflicts. ...
Article
Full-text available
How do humans and other animals face novel problems for which predefined solutions are not available? Human problem solving links to flexible reasoning and inference rather than to slow trial-and-error learning. It has received considerable attention since the early days of cognitive science, giving rise to well-known cognitive architectures such as SOAR and ACT-R, but its computational and brain mechanisms remain incompletely known. Furthermore, it is still unclear whether problem solving is a "specialized" domain or module of cognition, in the sense that it requires computations that are fundamentally different from those supporting perception and action systems. Here we advance a novel view of human problem solving as probabilistic inference with subgoaling. In this perspective, key insights from cognitive architectures are retained, such as the importance of using subgoals to split problems into subproblems. However, here the underlying computations use probabilistic inference methods analogous to those that are increasingly popular in the study of perception and action systems. To test our model we focus on the widely used Tower of Hanoi (ToH) task, and show that our proposed method can reproduce characteristic idiosyncrasies of human problem solvers: their sensitivity to the "community structure" of the ToH and their difficulties in executing so-called "counterintuitive" movements. Our analysis reveals that subgoals have two key roles in probabilistic inference and problem solving. First, prior beliefs on (likely) useful subgoals carve the problem space and define an implicit metric for the problem at hand, a metric to which humans are sensitive. Second, subgoals are used as waypoints in the probabilistic problem solving inference and make it possible to find effective solutions; when suitable subgoals are unavailable, problem solving deficits arise. Our study thus suggests that a probabilistic inference scheme enhanced with subgoals provides a comprehensive framework to study problem solving and its deficits.
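The "subgoals as waypoints" idea can be illustrated with a toy planner: committing to a subgoal splits one search problem into two smaller ones that are then joined. The sketch below uses plain breadth-first search on a made-up state graph as a stand-in for the probabilistic inference described in the paper; the graph, state names, and choice of subgoal are illustrative only.

```python
from collections import deque

# A small, hypothetical problem graph (states and legal moves).
graph = {
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
    "D": ["B", "C", "E"], "E": ["D", "F"], "F": ["E", "G"], "G": ["F"],
}

def shortest_path(start, goal):
    """Breadth-first search; stands in for inference over state sequences."""
    frontier, came_from = deque([start]), {start: None}
    while frontier:
        s = frontier.popleft()
        if s == goal:
            path = [s]
            while came_from[path[-1]] is not None:
                path.append(came_from[path[-1]])
            return path[::-1]
        for n in graph[s]:
            if n not in came_from:
                came_from[n] = s
                frontier.append(n)
    return None

def plan_via_subgoal(start, subgoal, goal):
    """Use the subgoal as a waypoint: solve two smaller problems and join them."""
    first, second = shortest_path(start, subgoal), shortest_path(subgoal, goal)
    return first + second[1:]

print("direct     :", shortest_path("A", "G"))
print("via subgoal:", plan_via_subgoal("A", "E", "G"))
```

A well-placed subgoal leaves the solution intact while shortening each search; a badly placed one would force detours, which is one intuition for the deficits discussed in the paper.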
... This interpretation brings an additional piece of evidence that number-motor interactions arise only with relevant object-directed actions. These interactions might take place because numerical magnitude and object-directed grasping actions are both represented within a very close if not overlapping frontoparietal network along the dorsal stream (Badets et al., 2012;Castiello, 2005;Pesenti et al., 2000;Simon et al., 2002;Stoianov et al., 2016). Moreover, they also fit within the ATOM proposal (Bueti & Walsh, 2009;Walsh, 2003) that postulates a core system for the processing of various magnitudes. ...
Article
Full-text available
Numerical magnitude and specific grasping action processing have been shown to interfere with each other because some aspects of numerical meaning may be grounded in sensorimotor transformation mechanisms linked to finger grip control. However, how specific these interactions are to grasping actions is still unknown. The present study tested the specificity of the number-grip relationship by investigating how the observation of different closing-opening stimuli that might or not refer to prehension-releasing actions was able to influence a random number generation task. Participants had to randomly produce numbers after they observed action stimuli representing either closure or aperture of the fingers, the hand or the mouth, or a colour change used as a control condition. Random number generation was influenced by the prior presentation of finger grip actions, whereby observing a closing finger grip led participants to produce small rather than large numbers, whereas observing an opening finger grip led them to produce large rather than small numbers. Hand actions had reduced or no influence on number production; mouth action influence was restricted to opening, with an overproduction of large numbers. Finally, colour changes did not influence number generation. These results show that some characteristics of observed finger, hand and mouth grip actions automatically prime number magnitude, with the strongest effect for finger grasping. The findings are discussed in terms of the functional and neural mechanisms shared between hand actions and number processing, but also between hand and mouth actions. The present study provides converging evidence that part of number semantics is grounded in sensory-motor mechanisms.
Article
Full-text available
Cognitive problem-solving benefits from cognitive maps aiding navigation and planning. Physical space navigation involves hippocampal (HC) allocentric codes, while abstract task space engages medial prefrontal cortex (mPFC) task-specific codes. Previous studies show that challenging tasks, like spatial alternation, require integrating these two types of maps. The disruption of the HC-mPFC circuit impairs performance. We propose a hierarchical active inference model clarifying how this circuit solves spatial alternation tasks by bridging physical and task-space maps. Simulations demonstrate that the model’s dual layers develop effective cognitive maps for physical and task space. The model solves spatial alternation tasks through reciprocal interactions between the two layers. Disrupting its communication impairs decision-making, which is consistent with empirical evidence. Additionally, the model adapts to switching between multiple alternation rules, providing a mechanistic explanation of how the HC-mPFC circuit supports spatial alternation tasks and the effects of disruption.
Article
Transitive inference (TI) is a cognitive task that assesses an organism’s ability to infer novel relations between items based on previously acquired knowledge. TI is known for exhibiting various behavioral and neural signatures, such as the serial position effect (SPE), symbolic distance effect (SDE), and the brain’s capacity to maintain and merge separate ranking models. We propose a novel framework that casts TI as a probabilistic preference learning task, using one-parameter Mallows models. We present a series of simulations that highlight the effectiveness of our novel approach. We show that the Mallows ranking model natively reproduces SDE and SPE. Furthermore, extending the model using Bayesian selection showcases its capacity to generate and merge ranking hypotheses as pairs with connecting symbols. Finally, we employ neural networks to replicate Mallows models, demonstrating how this framework aligns with observed prefrontal neural activity during TI. Our innovative approach sheds new light on the nature of TI, emphasizing the potential of probabilistic preference learning for unraveling its underlying neural mechanisms.
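The one-parameter Mallows model assigns each candidate ranking a probability proportional to exp(-theta * d), where d is the Kendall tau distance from a reference ranking and theta is a dispersion parameter. Marginalizing over rankings, the probability of ordering a pair correctly grows with how far apart the items sit in the reference order, which is the symbolic distance effect. The brute-force sketch below illustrates this property on a small, hypothetical item set; it is not the paper's full model (no Bayesian selection step and no neural network component).

```python
import itertools
import math

import numpy as np

items = list("ABCDE")                       # true order: A < B < C < D < E
reference = tuple(range(len(items)))        # reference ranking (identity)
theta = 0.7                                 # dispersion parameter of the Mallows model

def kendall_tau(perm, ref):
    """Number of pairwise disagreements between a ranking and the reference."""
    pos = {v: i for i, v in enumerate(perm)}
    return sum(1 for i, j in itertools.combinations(ref, 2) if pos[i] > pos[j])

# P(perm) is proportional to exp(-theta * d(perm, reference)).
perms = list(itertools.permutations(range(len(items))))
weights = np.array([math.exp(-theta * kendall_tau(p, reference)) for p in perms])
probs = weights / weights.sum()

def p_correct(i, j):
    """Probability that item i is ranked before item j under the model."""
    return sum(p for perm, p in zip(perms, probs) if perm.index(i) < perm.index(j))

for i, j in itertools.combinations(range(len(items)), 2):
    print(f"P({items[i]} before {items[j]}) = {p_correct(i, j):.3f}  (distance {j - i})")
```

Pairs that are farther apart in the reference order are discriminated with higher probability, reproducing the symbolic distance effect described in the abstract.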
Article
Full-text available
To adapt to a changing world, we must be able to switch between rules already learned and, at other times, learn rules anew. Often we must do both at the same time, switching between known rules while also constantly re-estimating them. Here, we show these two processes, rule switching and rule learning, rely on distinct but intertwined computations, namely fast inference and slower incremental learning. To this end, we studied how monkeys switched between three rules. Each rule was compositional, requiring the animal to discriminate one of two features of a stimulus and then respond with an associated eye movement along one of two different response axes. By modeling behavior we found the animals learned the axis of response using fast inference (rule switching) while continuously re-estimating the stimulus-response associations within an axis (rule learning). Our results shed light on the computational interactions between rule switching and rule learning, and make testable neural predictions for these interactions.
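One way to make the fast-versus-slow distinction concrete is to combine a Bayesian belief over which response axis is currently in force (updated from each outcome, with a small hazard rate allowing switches) with a delta rule that slowly re-estimates the stimulus-response associations within the chosen axis. The sketch below is a hypothetical toy of that combination, not the model fitted to the monkey data; the task structure, likelihood values, learning rate, and hazard rate are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

n_axes, n_stimuli, n_responses = 2, 2, 2
axis_posterior = np.ones(n_axes) / n_axes           # fast inference: which response axis is in force
Q = np.full((n_axes, n_stimuli, n_responses), 0.5)  # slow learning: reward estimates within each axis
alpha, hazard = 0.15, 0.02

# Hypothetical ground truth: on axis 0 respond with the stimulus identity, on axis 1 with its complement.
def reward(true_axis, chosen_axis, stimulus, response):
    if chosen_axis != true_axis:
        return 0.0
    correct = stimulus if true_axis == 0 else 1 - stimulus
    return float(response == correct)

def run_block(true_axis, n_trials=200):
    global axis_posterior
    rewards = []
    for _ in range(n_trials):
        stimulus = rng.integers(n_stimuli)
        axis = int(np.argmax(axis_posterior))               # commit to the most likely rule
        response = int(np.argmax(Q[axis, stimulus]))
        r = reward(true_axis, axis, stimulus, response)
        # Rule switching: Bayesian update of the axis belief from the outcome.
        lik = np.array([0.75 if a == axis and r else 0.25 if a == axis else 0.5
                        for a in range(n_axes)])
        axis_posterior = lik * axis_posterior / np.sum(lik * axis_posterior)
        axis_posterior = (1 - hazard) * axis_posterior + hazard / n_axes
        # Rule learning: incremental (delta-rule) update of the within-axis association.
        Q[axis, stimulus, response] += alpha * (r - Q[axis, stimulus, response])
        rewards.append(r)
    return np.mean(rewards[-50:])

print("late accuracy, block on axis 0:", run_block(true_axis=0))
print("late accuracy, block on axis 1 (after a switch):", run_block(true_axis=1))
```

The printout shows performance late in each block: the axis belief can flip within a few trials after a rule change, while the within-axis associations are re-estimated more gradually.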
Article
We discuss how uncertainty underwrites exploration and epistemic foraging from the perspective of active inference: a generic scheme that places pragmatic (utility maximization) and epistemic (uncertainty minimization) imperatives on an equal footing – as primary determinants of proximal behavior. This formulation contextualizes the complementary motivational incentives for reward-related stimuli and environmental uncertainty, offering a normative treatment of their trade-off.
Article
Full-text available
The prefrontal cortex (PF) has a key role in learning rules and generating associations between stimuli and responses also called conditional motor learning. Previous studies in PF have examined conditional motor learning at the single cell level but not the correlation of discharges between neurons at the ensemble level. In the present study, we recorded from two rhesus monkeys in the dorsolateral and the mediolateral parts of the prefrontal cortex to address the role of correlated firing of simultaneously recorded pairs during conditional motor learning. We trained two rhesus monkeys to associate three stimuli with three response targets, such that each stimulus was mapped to only one response. We recorded the neuronal activity of the same neuron pairs during learning of new associations and with already learned associations. In these tasks after a period of fixation, a visual instruction stimulus appeared centrally and three potential response targets appeared in three positions: right, left, and up from center. We found a higher number of neuron pairs significantly correlated and higher cross-correlation coefficients during stimulus presentation in the new than in the familiar mapping task. These results demonstrate that learning affects the PF neural correlation structure.
Article
Information processing in the rodent hippocampus is fundamentally shaped by internally generated sequences (IGSs), expressed during two different network states: theta sequences, which repeat and reset at the ∼8 Hz theta rhythm associated with active behavior, and punctate sharp wave-ripple (SWR) sequences associated with wakeful rest or slow-wave sleep. A potpourri of diverse functional roles has been proposed for these IGSs, resulting in a fragmented conceptual landscape. Here, we advance a unitary view of IGSs, proposing that they reflect an inferential process that samples a policy from the animal's generative model, supported by hippocampus-specific priors. The same inference affords different cognitive functions when the animal is in distinct dynamical modes, associated with specific functional networks. Theta sequences arise when inference is coupled to the animal's action-perception cycle, supporting online spatial decisions, predictive processing, and episode encoding. SWR sequences arise when the animal is decoupled from the action-perception cycle and may support offline cognitive processing, such as memory consolidation, the prospective simulation of spatial trajectories, and imagination. We discuss the empirical bases of this proposal in relation to rodent studies and highlight how the proposed computational principles can shed light on the mechanisms of future-oriented cognition in humans.
Article
To be successful, the research agenda for a novel control view of cognition should foresee more detailed, computationally specified process models of cognitive operations including higher cognition. These models should cover all domains of cognition, including those cognitive abilities that can be characterized as online interactive loops and detached forms of cognition that depend on internally generated neuronal processing.
Article
This paper offers an active inference account of choice behaviour and learning. It focuses on the distinction between goal-directed and habitual behaviour and how they contextualise each other. We show that habits emerge naturally (and autodidactically) from sequential policy optimisation when agents are equipped with state-action policies. In active inference, behaviour has explorative (epistemic) and exploitative (pragmatic) aspects that are sensitive to ambiguity and risk respectively, where epistemic (ambiguity-resolving) behaviour enables pragmatic (reward-seeking) behaviour and the subsequent emergence of habits. Although goal-directed and habitual policies are usually associated with model-based and model-free schemes, we find the more important distinction is between belief-free and belief-based schemes. The underlying (variational) belief updating provides a comprehensive (if metaphorical) process theory for several phenomena, including the transfer of dopamine responses, reversal learning, habit formation and devaluation. Finally, we show that active inference reduces to a classical (Bellman) scheme, in the absence of ambiguity.
Article
We discuss how cybernetic principles of feedback control, used to explain sensorimotor behavior, can be extended to provide a foundation for understanding cognition. In particular, we describe behavior as parallel processes of competition and selection among potential action opportunities ('affordances') expressed at multiple levels of abstraction. Adaptive selection among currently available affordances is biased not only by predictions of their immediate outcomes and payoffs but also by predictions of what new affordances they will make available. This allows animals to purposively create new affordances that they can later exploit to achieve high-level goals, resulting in intentional action that links across multiple levels of control. Finally, we discuss how such a 'hierarchical affordance competition' process can be mapped to brain structure.
Article
Full-text available
Embodied Choice considers action performance as a proper part of the decision making process rather than merely as a means to report the decision. The central statement of embodied choice is the existence of bidirectional influences between action and decisions. This implies that for a decision expressed by an action, the action dynamics and its constraints (e.g. current trajectory and kinematics) influence the decision making process. Here we use a perceptual decision making task to compare three types of model: a serial decision-then-action model, a parallel decision-and-action model, and an embodied choice model where the action feeds back into the decision making. The embodied model incorporates two key mechanisms that together are lacking in the other models: action preparation and commitment. First, action preparation strategies alleviate delays in enacting a choice but also modify decision termination. Second, action dynamics change the prospects and create a commitment effect to the initially preferred choice. Our results show that these two mechanisms make embodied choice models better suited to combine decision and action appropriately to achieve suitably fast and accurate responses, as usually required in ecologically valid situations. Moreover, embodied choice models with these mechanisms give a better account of trajectory tracking experiments during decision making. In conclusion, the embodied choice framework offers a combined theory of decision and action that gives a clear case that embodied phenomena such as the dynamics of actions can have a causal influence on central cognition.
Article
Full-text available
Author Summary: The Markov Chain Monte Carlo (MCMC) approach to probabilistic inference for a target distribution is to draw a sequence of samples from that distribution and to carry out computational operations via simple online computations on such a sequence. But such a sequential computational process takes time, and therefore this simple version of the MCMC approach runs into problems when one needs to carry out probabilistic inference for rapidly varying distributions. This difficulty also affects all currently existing models for emulating MCMC sampling by networks of stochastically firing neurons. We show here that by moving to a space-rate approach where salient probabilities are encoded through the spiking activity of ensembles of neurons, rather than by single neurons, this problem can be solved. In this way even theoretically optimal models for dealing with time-varying distributions through sequential Monte Carlo sampling, so-called particle filters, can be emulated by networks of spiking neurons. Each spike of a neuron in an ensemble represents in this approach a “particle” (or vote) for a particular value of a time-varying random variable. In other words, neural circuits can speed up computations based on Monte Carlo sampling through their inherent parallelism.
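The sequential Monte Carlo scheme referred to here is, in its simplest algorithmic form, a bootstrap particle filter: propagate particles under the dynamics, weight them by the likelihood of the new observation, and resample. The sketch below is that generic algorithm, with each particle read informally as one "vote" of an ensemble; it is not the spiking-network implementation developed in the paper, and all parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

n_particles, n_steps = 500, 60
process_sd, obs_sd = 0.3, 0.5

x_true = 0.0
particles = rng.normal(0.0, 1.0, n_particles)   # each particle ~ one "vote" of the ensemble

for t in range(n_steps):
    x_true += rng.normal(0.0, process_sd)                     # hidden state drifts over time
    y = x_true + rng.normal(0.0, obs_sd)                      # noisy observation
    particles += rng.normal(0.0, process_sd, n_particles)     # predict: propagate each particle
    w = np.exp(-0.5 * ((y - particles) / obs_sd) ** 2)        # weight by observation likelihood
    w /= w.sum()
    particles = rng.choice(particles, size=n_particles, p=w)  # resample: the ensemble re-votes

print(f"true state {x_true:.2f}, ensemble estimate {particles.mean():.2f} "
      f"+/- {particles.std():.2f}")
```

Because every particle is updated with the same local rule, the whole filter is embarrassingly parallel, which is the property the paper exploits when mapping particles onto spikes of an ensemble.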
Article
Full-text available
The central problems that goal-directed animals must solve are: 'What do I need and Why, Where and When can this be obtained, and How do I get it?' or the H4W problem. Here, we elucidate the principles underlying the neuronal solutions to H4W using a combination of neurobiological and neurorobotic approaches. First, we analyse H4W from a system-level perspective by mapping its objectives onto the Distributed Adaptive Control embodied cognitive architecture which sees the generation of adaptive action in the real world as the primary task of the brain rather than optimally solving abstract problems. We next map this functional decomposition to the architecture of the rodent brain to test its consistency. Following this approach, we propose that the mammalian brain solves the H4W problem on the basis of multiple kinds of outcome predictions, integrating central representations of needs and drives (e.g. hypothalamus), valence (e.g. amygdala), world, self and task state spaces (e.g. neocortex, hippocampus and prefrontal cortex, respectively) combined with multi-modal selection (e.g. basal ganglia). In our analysis, goal-directed behaviour results from a well-structured architecture in which goals are bootstrapped on the basis of predefined needs, valence and multiple learning, memory and planning mechanisms rather than being generated by a singular computation.
Article
Full-text available
The prefrontal cortex (PFC) subserves reasoning in the service of adaptive behavior. Little is known, however, about the architecture of reasoning processes in the PFC. Using computational modeling and neuroimaging, we show here that the human PFC has two concurrent inferential tracks: (i) one from ventromedial to dorsomedial PFC regions that makes probabilistic inferences about the reliability of the ongoing behavioral strategy and arbitrates between adjusting this strategy versus exploring new ones from long-term memory, and (ii) another from polar to lateral PFC regions that makes probabilistic inferences about the reliability of two or three alternative strategies and arbitrates between exploring new strategies versus exploiting these alternative ones. The two tracks interact and, along with the striatum, realize hypothesis testing for accepting versus rejecting newly created strategies.
Article
Full-text available
Two rhesus monkeys performed a distance discrimination task in which they reported whether a red square or a blue circle had appeared farther from a fixed reference point. Because a new pair of distances was chosen randomly on each trial, and because the monkeys had no opportunity to correct errors, no information from the previous trial was relevant to a current one. Nevertheless, many prefrontal cortex neurons encoded the outcome of the previous trial on current trials. A smaller, intermingled population of cells encoded the spatial goal on the previous trial or the features of the chosen stimuli, such as color or shape. The coding of previous outcomes and goals began at various times during a current trial, and it was selective in that prefrontal cells did not encode other information from the previous trial. The monitoring of previous goals and outcomes often contributes to problem solving, and it can support exploratory behavior. The present results show that such monitoring occurs autonomously and selectively, even when irrelevant to the task at hand.
Article
Full-text available
An enduring and richly elaborated dichotomy in cognitive neuroscience is that of reflective versus reflexive decision making and choice. Other literatures refer to the two ends of what is likely to be a spectrum with terms such as goal-directed versus habitual, model-based versus model-free or prospective versus retrospective. One of the most rigorous traditions of experimental work in the field started with studies in rodents and graduated via human versions and enrichments of those experiments to a current state in which new paradigms are probing and challenging the very heart of the distinction. We review four generations of work in this tradition and provide pointers to the forefront of the field's fifth generation.
Article
Full-text available
This paper considers agency in the setting of embodied or active inference. In brief, we associate a sense of agency with prior beliefs about action and ask what sorts of beliefs underlie optimal behavior. In particular, we consider prior beliefs that action minimizes the Kullback–Leibler (KL) divergence between desired states and attainable states in the future. This allows one to formulate bounded rationality as approximate Bayesian inference that optimizes a free energy bound on model evidence. We show that constructs like expected utility, exploration bonuses, softmax choice rules and optimism bias emerge as natural consequences of this formulation. Previous accounts of active inference have focused on predictive coding and Bayesian filtering schemes for minimizing free energy. Here, we consider variational Bayes as an alternative scheme that provides formal constraints on the computational anatomy of inference and action—constraints that are remarkably consistent with neuroanatomy. Furthermore, this scheme contextualizes optimal decision theory and economic (utilitarian) formulations as pure inference problems. For example, expected utility theory emerges as a special case of free energy minimization, where the sensitivity or inverse temperature (of softmax functions and quantal response equilibria) has a unique and Bayes-optimal solution—that minimizes free energy. This sensitivity corresponds to the precision of beliefs about behavior, such that attainable goals are afforded a higher precision or confidence. In turn, this means that optimal behavior entails a representation of confidence about outcomes that are under an agent's control.
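The core quantity described here, prior beliefs that action minimizes the divergence between attainable and desired future states, together with a precision (inverse temperature) on beliefs about behaviour, can be sketched as a softmax over policies scored by a negative KL divergence. The snippet below is a minimal, generic illustration with made-up distributions and one particular choice of KL direction; it is not the variational Bayes scheme developed in the paper.

```python
import numpy as np

def kl(p, q):
    """KL divergence between two discrete distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * (np.log(p + 1e-16) - np.log(q + 1e-16))))

desired = np.array([0.8, 0.15, 0.05])           # desired distribution over future states
attainable = {                                   # states each policy is expected to reach
    "policy_1": np.array([0.7, 0.2, 0.1]),
    "policy_2": np.array([0.3, 0.4, 0.3]),
}

for precision in (1.0, 8.0):                     # precision of beliefs about behaviour
    scores = np.array([-kl(attainable[k], desired) for k in attainable])
    z = precision * scores
    p = np.exp(z - z.max())
    p /= p.sum()
    print(f"precision {precision}:", dict(zip(attainable, p.round(3))))
```

Higher precision makes the choice more deterministic, which is the sense in which confidence about attainable goals sharpens behaviour in this framework.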
Article
Full-text available
Single-neuron activity in the prefrontal cortex (PFC) is tuned to mixtures of multiple task-related aspects. Such mixed selectivity is highly heterogeneous, seemingly disordered and therefore difficult to interpret. We analysed the neural activity recorded in monkeys during an object sequence memory task to identify a role of mixed selectivity in subserving the cognitive functions ascribed to the PFC. We show that mixed selectivity neurons encode distributed information about all task-relevant aspects. Each aspect can be decoded from the population of neurons even when single-cell selectivity to that aspect is eliminated. Moreover, mixed selectivity offers a significant computational advantage over specialized responses in terms of the repertoire of input-output functions implementable by readout neurons. This advantage originates from the highly diverse nonlinear selectivity to mixtures of task-relevant variables, a signature of high-dimensional neural representations. Crucially, this dimensionality is predictive of animal behaviour as it collapses in error trials. Our findings recommend a shift of focus for future studies from neurons that have easily interpretable response tuning to the widely observed, but rarely analysed, mixed selectivity neurons.
Article
Full-text available
Brains, it has recently been argued, are essentially prediction machines. They are bundles of cells that support perception and action by constantly attempting to match incoming sensory inputs with top-down expectations or predictions. This is achieved using a hierarchical generative model that aims to minimize prediction error within a bidirectional cascade of cortical processing. Such accounts offer a unifying model of perception and action, illuminate the functional role of attention, and may neatly capture the special contribution of cortical processing to adaptive success. This target article critically examines this "hierarchical prediction machine" approach, concluding that it offers the best clue yet to the shape of a unified science of mind and action. Sections 1 and 2 lay out the key elements and implications of the approach. Section 3 explores a variety of pitfalls and challenges, spanning the evidential, the methodological, and the more properly conceptual. The paper ends (sections 4 and 5) by asking how such approaches might impact our more general vision of mind, experience, and agency.
Article
Full-text available
Cognitive flexibility is fundamental to adaptive intelligent behavior. Prefrontal cortex has long been associated with flexible cognitive function, but the neurophysiological principles that enable prefrontal cells to adapt their response properties according to context-dependent rules remain poorly understood. Here, we use time-resolved population-level neural pattern analyses to explore how context is encoded and maintained in primate prefrontal cortex and used in flexible decision making. We show that an instruction cue triggers a rapid series of state transitions before settling into a stable low-activity state. The postcue state is differentially tuned according to the current task-relevant rule. During decision making, the response to a choice stimulus is characterized by an initial stimulus-specific population response but evolves to different final decision-related states depending on the current rule. These results demonstrate how neural tuning profiles in prefrontal cortex adapt to accommodate changes in behavioral context. Highly flexible tuning could be mediated via short-term synaptic plasticity.
Article
Full-text available
Instrumental behavior depends on both goal-directed and habitual mechanisms of choice. Normative views cast these mechanisms in terms of model-based and model-free methods of reinforcement learning, respectively. An influential proposal hypothesizes that model-free and model-based mechanisms coexist and compete in the brain according to their relative uncertainty. In this paper we propose a novel view in which a single Mixed Instrumental Controller produces both goal-directed and habitual behavior by flexibly balancing and combining model-based and model-free computations. The Mixed Instrumental Controller performs a cost-benefit analysis to decide whether to choose an action immediately based on the available "cached" value of actions (linked to model-free mechanisms) or to improve value estimation by mentally simulating the expected outcome values (linked to model-based mechanisms). Since mental simulation entails cognitive effort and increases the reward delay, it is activated only when the associated "Value of Information" exceeds its costs. The model proposes a method to compute the Value of Information, based on the uncertainty of action values and on the distance of alternative cached action values. Overall, the model by default chooses on the basis of lighter model-free estimates, and integrates them with costly model-based predictions only when useful. Mental simulation uses a sampling method to produce reward expectancies, which are used to update the cached value of one or more actions; in turn, this updated value is used for the choice. The key predictions of the model are tested in different settings of a double T-maze scenario. Results are discussed in relation to neurobiological evidence on the hippocampus–ventral striatum circuit in rodents, which has been linked to goal-directed spatial navigation.
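The arbitration logic described, simulate only when the Value of Information from doing so exceeds the cost of simulation, can be sketched with a crude VoI proxy that grows with the uncertainty of cached action values and shrinks with the gap between the best options. The functional form, thresholds, and T-maze values below are invented for illustration and are not the paper's equations.

```python
import numpy as np

rng = np.random.default_rng(4)

def value_of_information(means, sds):
    """Crude VoI proxy: simulation is worth more when cached values are uncertain
    (large sds) and the best options are close together (small gap)."""
    order = np.argsort(means)[::-1]
    gap = means[order[0]] - means[order[1]]
    uncertainty = np.mean(sds[order[:2]])
    return uncertainty / (gap + 1e-6)

def choose(cached_means, cached_sds, simulate_fn, simulation_cost=1.5, n_rollouts=20):
    if value_of_information(cached_means, cached_sds) > simulation_cost:
        # Model-based: refine value estimates by mental simulation (sampled rollouts).
        refined = np.array([np.mean([simulate_fn(a) for _ in range(n_rollouts)])
                            for a in range(len(cached_means))])
        return int(np.argmax(refined)), "model-based"
    # Model-free: act on the cached values directly.
    return int(np.argmax(cached_means)), "model-free"

true_values = np.array([1.0, 1.2])                        # hypothetical T-maze arm values
simulate = lambda a: true_values[a] + rng.normal(0, 0.2)  # noisy imagined outcome

print(choose(np.array([1.1, 1.05]), np.array([0.6, 0.7]), simulate))    # uncertain, close: simulate
print(choose(np.array([0.4, 1.3]),  np.array([0.05, 0.05]), simulate))  # confident, far apart: use cache
```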
Article
Full-text available
Learning and executive functions such as task-switching share common neural substrates, notably prefrontal cortex and basal ganglia. Understanding how they interact requires studying how cognitive control facilitates learning but also how learning provides the (potentially hidden) structure, such as abstract rules or task-sets, needed for cognitive control. We investigate this question from 3 complementary angles. First, we develop a new context-task-set (C-TS) model, inspired by nonparametric Bayesian methods, specifying how the learner might infer hidden structure (hierarchical rules) and decide to reuse or create new structure in novel situations. Second, we develop a neurobiologically explicit network model to assess mechanisms of such structured learning in hierarchical frontal cortex and basal ganglia circuits. We systematically explore the link between these modeling levels across task demands. We find that the network provides an approximate implementation of high-level C-TS computations, with specific neural mechanisms modulating distinct C-TS parameters. Third, this synergism yields predictions about the nature of human optimal and suboptimal choices and response times during learning and task-switching. In particular, the models suggest that participants spontaneously build task-set structure into a learning problem when not cued to do so, which predicts positive and negative transfer in subsequent generalization tests. We provide experimental evidence for these predictions and show that C-TS provides a good quantitative fit to human sequences of choices. These findings implicate a strong tendency to interactively engage cognitive control and learning, resulting in structured abstract representations that afford generalization opportunities and, thus, potentially long-term rather than short-term optimality. (PsycINFO Database Record (c) 2013 APA, all rights reserved).
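The reuse-or-create decision at the heart of such nonparametric structure learning can be illustrated with a Chinese-restaurant-process-style prior. The sketch below is a generic illustration with an assumed concentration parameter, not the C-TS model itself, which additionally conditions on context and reward feedback.

```python
import numpy as np

# Generic illustration: reuse an existing task set in proportion to how often
# it has been used, or create a new one with probability governed by the
# concentration parameter alpha.

def sample_task_set(usage_counts, alpha, rng):
    weights = np.array(usage_counts + [alpha], dtype=float)  # existing sets + "new set"
    return rng.choice(len(weights), p=weights / weights.sum())

rng = np.random.default_rng(1)
usage_counts = []                       # how often each task set has been selected
for _context in range(20):
    k = sample_task_set(usage_counts, alpha=1.0, rng=rng)
    if k == len(usage_counts):
        usage_counts.append(1)          # create a new task set
    else:
        usage_counts[k] += 1            # reuse an existing one

print("task-set usage counts:", usage_counts)
```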
Article
Full-text available
Modelling is fundamental to many fields of science and engineering. A model can be thought of as a representation of possible data one could predict from a system. The probabilistic approach to modelling uses probability theory to express all aspects of uncertainty in the model. The probabilistic approach is synonymous with Bayesian modelling, which simply uses the rules of probability theory in order to make predictions, compare alternative models, and learn model parameters and structure from data. This simple and elegant framework is most powerful when coupled with flexible probabilistic models. Flexibility is achieved through the use of Bayesian non-parametrics. This article provides an overview of probabilistic modelling and an accessible survey of some of the main tools in Bayesian non-parametrics. The survey covers the use of Bayesian non-parametrics for modelling unknown functions, density estimation, clustering, time-series modelling, and representing sparsity, hierarchies, and covariance structure. More specifically, it gives brief non-technical overviews of Gaussian processes, Dirichlet processes, infinite hidden Markov models, Indian buffet processes, Kingman's coalescent, Dirichlet diffusion trees and Wishart processes.
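As a concrete taste of one of the tools surveyed, the snippet below draws mixture weights from a truncated stick-breaking construction of the Dirichlet process; the truncation level and concentration value are arbitrary choices made for illustration.

```python
import numpy as np

# Truncated stick-breaking construction of a Dirichlet process: Beta draws
# are turned into a (nominally infinite) set of mixture weights.

def stick_breaking(alpha, truncation, rng):
    betas = rng.beta(1.0, alpha, size=truncation)                      # stick fractions
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas)[:-1]])  # stick left over
    return betas * remaining                                           # mixture weights

rng = np.random.default_rng(2)
weights = stick_breaking(alpha=2.0, truncation=15, rng=rng)
print("first weights:", np.round(weights[:5], 3),
      "| mass covered:", round(float(weights.sum()), 3))
```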
Article
Full-text available
In a nonverbal counting task derived from the animal literature, adult human subjects repeatedly attempted to produce target numbers of key presses at rates that made vocal or subvocal counting difficult or impossible. In a second task, they estimated the number of flashes in a rapid, randomly timed sequence. Congruent with the animal data, mean estimates in both tasks were proportional to target values, as was the variability in the estimates. Converging evidence makes it unlikely that subjects used verbal counting or time durations to perform these tasks. The results support the hypothesis that adult humans share with nonverbal animals a system for representing number by magnitudes that have scalar variability (a constant coefficient of variation). The mapping of numerical symbols to mental magnitudes provides a formal model of the underlying nonverbal meaning of the symbols (a model of numerical semantics).
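The signature property, a constant coefficient of variation, is easy to see in a toy simulation. The Weber fraction w = 0.15 below is an assumed value, not one estimated from the reported data.

```python
import numpy as np

# Toy simulation of scalar variability: noisy estimates whose standard
# deviation grows in proportion to the target count, so the coefficient of
# variation stays (approximately) constant across targets.

rng = np.random.default_rng(3)
w = 0.15
for target in (8, 16, 32):
    estimates = rng.normal(loc=target, scale=w * target, size=10_000)
    cv = estimates.std() / estimates.mean()
    print(f"target {target:2d}: mean estimate {estimates.mean():5.1f}, CV {cv:.3f}")
```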
Article
Full-text available
Describes a theory of temporal control which treats responding of animal Ss at asymptote under a variety of learning procedures. Ss are viewed as making estimates of the time to reinforcement delivery using a scalar-timing process, which rescales estimates for different values of the interval being timed. Scalar-timing implies a constant coefficient of variation. Expectancies of reward based on these estimates are formed, and a discrimination between response alternatives is made by taking a ratio of their expectancies. In periodic schedules of reinforcement the discrimination is between local and overall expectancy of reward. In psychophysical studies of duration discrimination, the expectancy ratio reduces to the likelihood ratio and, in conjunction with the scalar property, results in a general form of Weber's law. The psychometric choice function describing preference for different amounts and delays of reinforcement also results in a form of Weber's law. (102 ref) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
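A toy version of the expectancy-ratio idea on a fixed-interval schedule is sketched below. The interval, response threshold, and noise level are assumed values; the point is only that scalar noise on the remembered interval makes response start times inherit a constant coefficient of variation.

```python
import numpy as np

# Responding starts once elapsed time reaches a fixed fraction of the
# remembered time to reinforcement, with scalar (multiplicative) memory noise.

rng = np.random.default_rng(4)
fixed_interval = 30.0                                   # seconds to reinforcement
threshold = 1.25                                        # respond when remembered / elapsed <= threshold
start_times = []
for _trial in range(5_000):
    remembered = fixed_interval * rng.normal(1.0, 0.2)  # scalar memory noise
    start_times.append(remembered / threshold)          # elapsed time when responding begins

start_times = np.array(start_times)
print("mean start time:", round(float(start_times.mean()), 1),
      "| CV:", round(float(start_times.std() / start_times.mean()), 3))
```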
Article
Full-text available
Although the lateral prefrontal cortex (lPFC) and dorsal premotor cortex (PMd) are thought to be involved in goal-directed behavior, the specific roles of each area still remain elusive. To characterize and compare neuronal activity in two sectors of the lPFC [dorsal (dlPFC) and ventral (vlPFC)] and the PMd, we designed a behavioral task for monkeys to explore the differences in their participation in four aspects of information processing: encoding of visual signals, behavioral goal retrieval, action specification, and maintenance of relevant information. We initially presented a visual object (an instruction cue) to instruct a behavioral goal (reaching to the right or left of potential targets). After a subsequent delay, a choice cue appeared at various locations on a screen, and the animals could specify an action to achieve the behavioral goal. We found that vlPFC neurons amply encoded object features of the instruction cues for behavioral goal retrieval and, subsequently, spatial locations of the choice cues for specifying the actions. By contrast, dlPFC and PMd neurons rarely encoded the object features, although they reflected the behavioral goals throughout the delay period. After the appearance of the choice cues, the PMd held information for action throughout the specification and preparation of reaching movements. Remarkably, lPFC neurons represented information for the behavioral goal continuously, even after the action specification as well as during its execution. These results indicate that area-specific representation and information processing at progressive stages of the perception-action transformation in these areas underlie goal-directed behavior.
Article
Full-text available
The frontal lobes subserve decision-making and executive control--that is, the selection and coordination of goal-directed behaviors. Current models of frontal executive function, however, do not explain human decision-making in everyday environments featuring uncertain, changing, and especially open-ended situations. Here, we propose a computational model of human executive function that clarifies this issue. Using behavioral experiments, we show that, unlike others, the proposed model predicts human decisions and their variations across individuals in naturalistic situations. The model reveals that for driving action, the human frontal function monitors up to three/four concurrent behavioral strategies and infers online their ability to predict action outcomes: whenever one appears more reliable than unreliable, this strategy is chosen to guide the selection and learning of actions that maximize rewards. Otherwise, a new behavioral strategy is tentatively formed, partly from those stored in long-term memory, then probed and, if competitive, confirmed to subsequently drive action. Thus, the human executive function has a monitoring capacity limited to three or four behavioral strategies. This limitation is compensated by the binary structure of executive control that in ambiguous and unknown situations promotes the exploration and creation of new behavioral strategies. The results support a model of human frontal function that integrates reasoning, learning, and creative abilities in the service of decision-making and adaptive behavior.
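A much-simplified, hypothetical sketch of online reliability monitoring is shown below: each strategy's reliability is tracked as the posterior probability that it predicts outcomes better than chance, and a strategy drives behavior only when its reliability clearly dominates; otherwise the agent explores or forms a new strategy. The likelihood values, the 0.7 criterion, and the capacity of three monitored strategies are assumptions, not the model's actual parameters.

```python
import numpy as np

# Track, for each monitored strategy, the posterior probability that it is
# "reliable" (predicts outcomes with hit rate 0.8) rather than "unreliable"
# (chance-like hit rate 0.3); exploit only when one reliability dominates.

def update_reliability(reliability, predicted_correctly, hit=0.8, false_alarm=0.3):
    like_reliable = hit if predicted_correctly else 1.0 - hit
    like_unreliable = false_alarm if predicted_correctly else 1.0 - false_alarm
    numerator = like_reliable * reliability
    return numerator / (numerator + like_unreliable * (1.0 - reliability))

reliabilities = [0.5, 0.5, 0.5]                          # three monitored strategies
history = [(0, True), (0, True), (1, False), (0, True), (2, False)]
for strategy, correct in history:
    reliabilities[strategy] = update_reliability(reliabilities[strategy], correct)

best = int(np.argmax(reliabilities))
print("reliabilities:", [round(r, 2) for r in reliabilities])
if reliabilities[best] > 0.7:
    print("exploit strategy", best)
else:
    print("explore / form a new strategy")
```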
Article
Full-text available
Recent work has given rise to the view that reward-based decision making is governed by two key controllers: a habit system, which stores stimulus-response associations shaped by past reward, and a goal-oriented system that selects actions based on their anticipated outcomes. The current literature provides a rich body of computational theory addressing habit formation, centering on temporal-difference learning mechanisms. Less progress has been made toward formalizing the processes involved in goal-directed decision making. We draw on recent work in cognitive neuroscience, animal conditioning, cognitive and developmental psychology, and machine learning to outline a new theory of goal-directed decision making. Our basic proposal is that the brain, within an identifiable network of cortical and subcortical structures, implements a probabilistic generative model of reward, and that goal-directed decision making is effected through Bayesian inversion of this model. We present a set of simulations implementing the account, which address benchmark behavioral and neuroscientific findings, and give rise to a set of testable predictions. We also discuss the relationship between the proposed framework and other models of decision making, including recent models of perceptual choice, to which our theory bears a direct connection.
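A minimal "planning as inference" toy in the spirit described above is given below; it is an illustration, not the authors' model. A small generative model of reward is inverted with Bayes' rule to obtain a posterior over actions, conditioned on the desired outcome. The probability table and uniform action prior are assumed values.

```python
import numpy as np

# Invert P(outcome | action) under a prior over actions, conditioning on the
# desired outcome (reward), to obtain a posterior over actions.

p_outcome_given_action = np.array([
    [0.7, 0.3],   # action 0: P(reward), P(no reward)
    [0.2, 0.8],   # action 1: P(reward), P(no reward)
])
prior_over_actions = np.array([0.5, 0.5])

joint = prior_over_actions * p_outcome_given_action[:, 0]   # condition on reward (column 0)
posterior_over_actions = joint / joint.sum()
print("P(action | desired outcome = reward):", np.round(posterior_over_actions, 3))
```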
Article
Full-text available
A rational model of human categorization behavior is presented that assumes that categorization reflects the derivation of optimal estimates of the probability of unseen features of objects. A Bayesian analysis is performed of what optimal estimations would be if categories formed a disjoint partitioning of the object space and if features were independently displayed within a category. This Bayesian analysis is placed within an incremental categorization algorithm. The resulting rational model accounts for effects of central tendency of categories, effects of specific instances, learning of linearly nonseparable categories, effects of category labels, extraction of basic level categories, base-rate effects, probability matching in categorization, and trial-by-trial learning functions. Although the rational model considers just 1 level of categorization, it is shown how predictions can be enhanced by considering higher and lower levels. Considering prediction at the lower, individual level allows integration of this rational analysis of categorization with the earlier rational analysis of memory (Anderson & Milson, 1989).
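The incremental assignment step of an Anderson-style rational categorization model can be sketched as a local-MAP rule: each new item joins the cluster with the highest posterior, computed from a coupling-parameter prior and smoothed feature likelihoods. The coupling value, binary-feature format, and Laplace smoothing below are assumptions made for illustration.

```python
import numpy as np

# Local-MAP incremental categorization: score each existing cluster (and the
# option of starting a new one) by prior x likelihood, then take the argmax.

def assign(item, clusters, coupling=0.5):
    n = sum(len(members) for members in clusters)
    scores = []
    for members in clusters:                             # existing clusters
        prior = coupling * len(members) / ((1 - coupling) + coupling * n)
        likelihood = 1.0
        for d, value in enumerate(item):                 # smoothed feature likelihoods
            matches = sum(m[d] == value for m in members)
            likelihood *= (matches + 1) / (len(members) + 2)
        scores.append(prior * likelihood)
    new_prior = (1 - coupling) / ((1 - coupling) + coupling * n)
    scores.append(new_prior * 0.5 ** len(item))          # option: start a new cluster
    return int(np.argmax(scores))

clusters = []
for item in [(1, 1), (1, 0), (0, 0), (1, 1)]:
    k = assign(item, clusters)
    if k == len(clusters):
        clusters.append([item])
    else:
        clusters[k].append(item)
print(clusters)
```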
Article
Full-text available
Rules are widely used in everyday life to organize actions and thoughts in accordance with our internal goals. At the simplest level, single rules can be used to link individual sensory stimuli to their appropriate responses. However, most tasks are more complex and require the concurrent application of multiple rules. Experiments on humans and monkeys have shown the involvement of a frontoparietal network in rule representation. Yet, a fundamental issue still needs to be clarified: Is the neural representation of multiple rules compositional, that is, built on the neural representation of their simple constituent rules? Subjects were asked to remember and apply either simple or compound rules. Multivariate decoding analyses were applied to functional magnetic resonance imaging data. Both ventrolateral frontal and lateral parietal cortex were involved in compound representation. Most importantly, we were able to decode the compound rules by training classifiers only on the simple rules they were composed of. This shows that the code used to store rule information in prefrontal cortex is compositional. Compositional coding in rule representation suggests that it might be possible to decode other complex action plans by learning the neural patterns of the known composing elements.
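The cross-classification logic can be illustrated on synthetic data: a decoder trained to separate two simple rules is tested on compound-rule patterns modeled as noisy sums of their constituents, so above-chance transfer indicates compositional coding. The voxel count, noise level, and additive-composition assumption below are all illustrative, not features of the reported dataset.

```python
import numpy as np

# Nearest-centroid decoder trained on simple rules A vs B, tested on compound
# rules (AC vs BC) built as noisy sums of the simple-rule patterns.

rng = np.random.default_rng(6)
n_vox = 50
simple = {rule: rng.normal(size=n_vox) for rule in "ABCD"}   # simple-rule patterns

def pattern(rules, noise=1.0):
    return sum(simple[r] for r in rules) + rng.normal(scale=noise, size=n_vox)

centroid_A = np.mean([pattern("A") for _ in range(20)], axis=0)
centroid_B = np.mean([pattern("B") for _ in range(20)], axis=0)
decoder = centroid_A - centroid_B
boundary = decoder @ (centroid_A + centroid_B) / 2

tests = [("AC", True), ("BC", False)] * 50
correct = sum((pattern(rules) @ decoder > boundary) == contains_A for rules, contains_A in tests)
print(f"cross-classification accuracy: {correct / len(tests):.2f}")
```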
Article
Full-text available
Growing evidence suggests that the prefrontal cortex (PFC) is organized hierarchically, with more anterior regions having increasingly abstract representations. How does this organization support hierarchical cognitive control and the rapid discovery of abstract action rules? We present computational models at different levels of description. A neural circuit model simulates interacting corticostriatal circuits organized hierarchically. In each circuit, the basal ganglia gate frontal actions, with some striatal units gating the inputs to PFC and others gating the outputs to influence response selection. Learning at all of these levels is accomplished via dopaminergic reward prediction error signals in each corticostriatal circuit. This functionality allows the system to exhibit conditional if-then hypothesis testing and to learn rapidly in environments with hierarchical structure. We also develop a hybrid Bayesian-reinforcement learning mixture of experts (MoE) model, which can estimate the most likely hypothesis state of individual participants based on their observed sequence of choices and rewards. This model yields accurate probabilistic estimates about which hypotheses are attended by manipulating attentional states in the generative neural model and recovering them with the MoE model. This 2-pronged modeling approach leads to multiple quantitative predictions that are tested with functional magnetic resonance imaging in the companion paper.
Article
Full-text available
Although the psychophysics of infants' nonsymbolic number representations have been well studied, less is known about other characteristics of the approximate number system (ANS) in young children. Here three experiments explored the extent to which the ANS yields abstract representations by testing infants' ability to transfer approximate number representations across sensory modalities. These experiments showed that 6-month-olds matched the approximate number of sounds they heard to the approximate number of sights they saw, looking longer at visual arrays that numerically mismatched a previously heard auditory sequence. This looking preference was observed when sights and sounds mismatched by 1:3 and 1:2 ratios but not by a 2:3 ratio. These findings suggest that infants can compare numerical information obtained in different modalities using representations stored in memory. Furthermore, the acuity of 6-month-olds' comparisons of intermodal numerical sequences appears to parallel that of their comparisons of unimodal sequences.
Article
Full-text available
Rational models of cognition typically consider the abstract computational problems posed by the environment, assuming that people are capable of optimally solving those problems. This differs from more traditional formal models of cognition, which focus on the psychological processes responsible for behavior. A basic challenge for rational models is thus explaining how optimal solutions can be approximated by psychological processes. We outline a general strategy for answering this question, namely to explore the psychological plausibility of approximation algorithms developed in computer science and statistics. In particular, we argue that Monte Carlo methods provide a source of rational process models that connect optimal solutions to psychological processes. We support this argument through a detailed example, applying this approach to Anderson's (1990, 1991) rational model of categorization (RMC), which involves a particularly challenging computational problem. Drawing on a connection between the RMC and ideas from nonparametric Bayesian statistics, we propose 2 alternative algorithms for approximate inference in this model. The algorithms we consider include Gibbs sampling, a procedure appropriate when all stimuli are presented simultaneously, and particle filters, which sequentially approximate the posterior distribution with a small number of samples that are updated as new data become available. Applying these algorithms to several existing datasets shows that a particle filter with a single particle provides a good description of human inferences.
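The single-particle special case highlighted above can be sketched compactly: items arrive sequentially and each is assigned by sampling a cluster from the local posterior (a CRP-style prior times smoothed binary-feature likelihoods), rather than taking the maximum as in a local-MAP scheme. The concentration parameter, smoothing, and feature format are assumptions for illustration.

```python
import numpy as np

# Single-particle sequential inference: one partition hypothesis is carried
# forward, and each item's cluster is SAMPLED from its local posterior.

def local_posterior(item, partition, alpha=1.0):
    n = sum(len(members) for members in partition)
    scores = []
    for members in partition:
        prior = len(members) / (n + alpha)
        likelihood = 1.0
        for d, value in enumerate(item):
            matches = sum(m[d] == value for m in members)
            likelihood *= (matches + 1) / (len(members) + 2)
        scores.append(prior * likelihood)
    scores.append((alpha / (n + alpha)) * 0.5 ** len(item))   # new-cluster option
    scores = np.array(scores)
    return scores / scores.sum()

rng = np.random.default_rng(5)
partition = []                        # the single particle: one partition hypothesis
for item in [(1, 1), (1, 0), (0, 0), (0, 1), (1, 1)]:
    k = rng.choice(len(partition) + 1, p=local_posterior(item, partition))
    if k == len(partition):
        partition.append([item])
    else:
        partition[k].append(item)
print(partition)
```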
Article
Full-text available
The ability to group items and events into functional categories is a fundamental characteristic of sophisticated thought. It is subserved by plasticity in many neural systems, including neocortical regions (sensory, prefrontal, parietal, and motor cortex), the medial temporal lobe, the basal ganglia, and midbrain dopaminergic systems. These systems interact during category learning. Corticostriatal loops may mediate recursive, bootstrapping interactions between fast reward-gated plasticity in the basal ganglia and slow reward-shaded plasticity in the cortex. This can provide a balance between acquisition of details of experiences and generalization across them. Interactions between the corticostriatal loops can integrate perceptual, response, and feedback-related aspects of the task and mediate the shift from novice to skilled performance. The basal ganglia and medial temporal lobe interact competitively or cooperatively, depending on the demands of the learning task.
Article
Full-text available
A free-energy principle has been proposed recently that accounts for action, perception and learning. This Review looks at some key brain theories in the biological (for example, neural Darwinism) and physical (for example, information theory and optimal control theory) sciences from the free-energy perspective. Crucially, one key theme runs through each of these theories - optimization. Furthermore, if we look closely at what is optimized, the same quantity keeps emerging, namely value (expected reward, expected utility) or its complement, surprise (prediction error, expected cost). This is the quantity that is optimized under the free-energy principle, which suggests that several global brain theories might be unified within a free-energy framework.
Article
Full-text available
This paper offers a conceptual framework which (re)integrates goal-directed control, motivational processes, and executive functions, and suggests a developmental pathway from situated action to higher level cognition. We first illustrate a basic computational (control-theoretic) model of goal-directed action that makes use of internal modeling. We then show that by adding the problem of selection among multiple action alternatives, motivation enters the scene, and that the basic mechanisms of executive functions such as inhibition, the monitoring of progress, and working memory, are required for this system to work. Further, we elaborate on the idea that the off-line re-enactment of anticipatory mechanisms used for action control gives rise to (embodied) mental simulations, and propose that thinking consists essentially in controlling mental simulations rather than directly controlling behavior and perceptions. We conclude by sketching an evolutionary perspective of this process, proposing that anticipation leveraged cognition, and by highlighting specific predictions of our model.
Article
Full-text available
This paper describes a general model that subsumes many parametric models for continuous data. The model comprises hidden layers of state-space or dynamic causal models, arranged so that the output of one provides input to another. The ensuing hierarchy furnishes a model for many types of data, of arbitrary complexity. Special cases range from the general linear model for static data to generalised convolution models, with system noise, for nonlinear time-series analysis. Crucially, all of these models can be inverted using exactly the same scheme, namely, dynamic expectation maximization. This means that a single model and optimisation scheme can be used to invert a wide range of models. We present the model and a brief review of its inversion to disclose the relationships among, apparently, diverse generative models of empirical data. We then show that this inversion can be formulated as a simple neural network and may provide a useful metaphor for inference and learning in the brain.
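The hierarchical idea can be conveyed with a toy generative model: two stacked linear state-space models, where the slower higher level provides the input to the faster lower level, which in turn generates the observations. The matrices and noise levels below are arbitrary assumptions, and the inversion scheme (dynamic expectation maximization) is not shown.

```python
import numpy as np

# Generate data from a two-level hierarchical linear state-space model: the
# higher-level state drives the lower level, which produces the observations.

rng = np.random.default_rng(7)
A_high = np.array([[0.95, -0.10], [0.10, 0.95]])   # slow higher-level dynamics
A_low = np.array([[0.70, -0.40], [0.40, 0.70]])    # faster lower-level dynamics
C = np.array([1.0, 0.5])                           # observation mapping

x_high = np.zeros(2)
x_low = np.zeros(2)
observations = []
for _t in range(100):
    x_high = A_high @ x_high + rng.normal(scale=0.1, size=2)
    x_low = A_low @ x_low + x_high + rng.normal(scale=0.1, size=2)   # higher level as input
    observations.append(C @ x_low + rng.normal(scale=0.05))

print("first observations:", np.round(observations[:5], 2))
```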
Article
Full-text available
The ability to group stimuli into meaningful categories is a fundamental cognitive process. To explore its neural basis, we trained monkeys to categorize computer-generated stimuli as “cats” and “dogs.” A morphing system was used to systematically vary stimulus shape and precisely define the category boundary. Neural activity in the lateral prefrontal cortex reflected the category of visual stimuli, even when a monkey was retrained with the stimuli assigned to new categories.
Article
The classical notion that the cerebellum and the basal ganglia are dedicated to motor control is under dispute given increasing evidence of their involvement in non-motor functions. Is it then impossible to characterize the functions of the cerebellum, the basal ganglia and the cerebral cortex in a simplistic manner? This paper presents a novel view that their computational roles can be characterized not by asking what are the "goals" of their computation, such as motor or sensory, but by asking what are the "methods" of their computation, specifically, their learning algorithms. There is currently enough anatomical, physiological, and theoretical evidence to support the hypotheses that the cerebellum is a specialized organ for supervised learning, the basal ganglia are for reinforcement learning, and the cerebral cortex is for unsupervised learning. This paper investigates how the learning modules specialized for these three kinds of learning can be assembled into goal-oriented behaving systems. In general, supervised learning modules in the cerebellum can be utilized as "internal models" of the environment. Reinforcement learning modul