Article

Neurons in Posterior Cingulate Cortex Signal Exploratory Decisions in a Dynamic Multioption Choice Task

Abstract

In dynamic environments, adaptive behavior requires striking a balance between harvesting currently available rewards (exploitation) and gathering information about alternative options (exploration). Such strategic decisions should incorporate not only recent reward history, but also opportunity costs and environmental statistics. Previous neuroimaging and neurophysiological studies have implicated orbitofrontal cortex, anterior cingulate cortex, and ventral striatum in distinguishing between bouts of exploration and exploitation. Nonetheless, the neuronal mechanisms that underlie strategy selection remain poorly understood. We hypothesized that posterior cingulate cortex (CGp), an area linking reward processing, attention, memory, and motor control systems, mediates the integration of variables such as reward, uncertainty, and target location that underlie this dynamic balance. Here we show that CGp neurons distinguish between exploratory and exploitative decisions made by monkeys in a dynamic foraging task. Moreover, firing rates of these neurons predict in graded fashion the strategy most likely to be selected on upcoming trials. This encoding is distinct from switching between targets and is independent of the absolute magnitudes of rewards. These observations implicate CGp in the integration of individual outcomes across decision making and the modification of strategy in dynamic environments.


... Value-based decision-making tasks are used to determine the cognitive and neural mechanisms for reward learning and choice [1-4]. One frequent assumption is that agents make their decisions based on the feature dimension that the experimenter has designed the reward probabilities to vary on. However, in complex, multidimensional environments, stimuli can vary on multiple feature dimensions like identity and location simultaneously, and the features that predict reward outcomes are not always obvious [5]. As a result of this complexity, differences in learning and decision making within and between individuals could result as much from differences in the strategies employed to learn, as they could from the capacity to learn. ...
... To determine whether there are sex differences in the strategies employed during value-based decision making, we trained male and female mice on a two-dimensional decision-making task: a visual bandit [1,2,4,24-28]. While all animals eventually reached the same performance level, female mice learned more rapidly than males on average. Because choice could vary in two dimensions [29,30], we asked whether individual animals were adopting different strategies during learning. ...
... Age-matched male and female wild-type mice (n = 32, 16 per sex) were trained to perform a visually cued two-armed bandit task (Figure 1A). This visually cued task design was similar to those employed in humans and nonhuman primates [1,2,4,24-28,31,32], in contrast to the spatial bandit designs frequently employed with rodents [33-36]. Animals were presented with a repeating set of two different image cues which were each associated with different probabilistic reward outcomes (80%/20%) (Figure 1B). ...
Article
A frequent assumption in value-based decision-making tasks is that agents make decisions based on the feature dimension that reward probabilities vary on. However, in complex, multidimensional environments, stimuli can vary on multiple dimensions at once, meaning that the feature deserving the most credit for outcomes is not always obvious. As a result, individuals may vary in the strategies used to sample stimuli across dimensions, and these strategies may have an unrecognized influence on decision-making. Sex is a proxy for multiple genetic and endocrine influences on behavior, including how environments are sampled. In this study, we examined the strategies adopted by female and male mice as they learned the value of stimuli that varied in both image and location in a visually cued two-armed bandit, allowing two possible dimensions to learn about. Female mice acquired the correct image-value associations more quickly than male mice, preferring a fundamentally different strategy. Female mice were more likely to constrain their decision-space early in learning by preferentially sampling one location over which images varied. Conversely, male mice were more likely to be inconsistent, changing their choice frequently and responding to the immediate experience of stochastic rewards. Individual strategies were related to sex-biased changes in neuronal activation in early learning. Together, we find that in mice, sex is associated with divergent strategies for sampling and learning about the world, revealing substantial unrecognized variability in the approaches implemented during value-based decision making.
... Electrophysiological recordings in animals implicated PMC neurons in strategic decision making (Pearson, Hayden, Raghavachari, & Platt, 2009), risk assessment (McCoy & Platt, 2005), outcome-dependent behavioral modulation (Hayden, Smith, & Platt, 2009), as well as approach-avoidance behavior (Vann, Aggleton, & Maguire, 2009). Neuron spiking activity in the PMC allowed distinguishing whether a monkey would pursue an exploratory or exploitative behavioral strategy during food foraging (Pearson et al., 2009). Monkeys were shown to correctly assess the amount of riskiness and ambiguity implicated by behavioral decisions, similar to humans (Hayden, Heilbronner, & Platt, 2010). ...
... Monkeys were shown to correctly assess the amount of riskiness and ambiguity implicated by behavioral decisions, similar to humans (Hayden, Heilbronner, & Platt, 2010). Further, single-cell recordings in the monkey PMC demonstrated this brain region's sensitivity to subjective target utility (McCoy & Platt, 2005) and integration across individual decision-making instances (Pearson et al., 2009). This DMN region encoded the preference for or aversion to options with uncertain reward outcomes, and its neural spiking activity was more associated with the subjectively perceived relevance of a chosen object than with its actual value, based on an "internal currency of value" (McCoy & Platt, 2005). ...
Article
The default mode network (DMN) is believed to subserve the baseline mental activity in humans. Its higher energy consumption compared to other brain networks and its intimate coupling with conscious awareness both point to an unknown overarching function. Many research streams speak in favor of an evolutionarily adaptive role in envisioning experience to anticipate the future. In the present work, we propose a process model that tries to explain how the DMN may implement continuous evaluation and prediction of the environment to guide behavior. The main purpose of DMN activity, we argue, may be described by Markov Decision Processes that optimize action policies via value estimates through vicarious trial and error. Our formal perspective on DMN function naturally accommodates as special cases previous interpretations based on (1) predictive coding, (2) semantic associations, and (3) a sentinel role. Moreover, this process model for the neural optimization of complex behavior in the DMN offers parsimonious explanations for recent experimental findings in animals and humans.
... In contrast, electrophysiological studies in the non-human primate have focused on the contributions of PCC to cognitive control (Hayden et al., 2010), decision making (McCoy et al., 2003; McCoy and Platt, 2005; Pearson et al., 2009), and value judgement (Hayden et al., 2008; Heilbronner et al., 2011) - conditions thought to anti-correlate with PCC engagement in humans. Together, single unit recordings in the macaque PCC during economic decisions have led to a hypothesized role in strategy selection (Pearson et al., 2009; Pearson et al., 2011; Heilbronner and Platt, 2013), further emphasizing a much more executive function, in apparent contrast to the common focus of human data. However, studies of human decision making and value judgements consistently observe PCC engagement (Bartra et al., 2013; Clithero and Rangel, 2014; Oldham et al., 2018), yet such findings have received limited attention (compared to anterior cingulate cortex) and therefore limited integration with studies of the DMN (Acikalin et al., 2017). ...
Article
Posterior cingulate cortex (PCC) is an enigmatic region implicated in psychiatric and neurological disease, yet its role in cognition remains unclear. Human studies link PCC to episodic memory and default mode network (DMN), while findings from the non-human primate emphasize executive processes more associated with the cognitive control network (CCN) in humans. We hypothesized this difference reflects an important functional division between dorsal (executive) and ventral (episodic) PCC. To test this, we utilized human intracranial recordings of population and single unit activity targeting dorsal PCC during an alternated executive/episodic processing task. Dorsal PCC population responses were significantly enhanced for executive, compared to episodic, task conditions, consistent with the CCN. Single unit recordings, however, revealed four distinct functional types with unique executive (CCN) or episodic (DMN) response profiles. Our findings provide critical electrophysiological data from human PCC, bridging incongruent views within and across species, furthering our understanding of PCC function.
... Rhesus macaques' inferotemporal cortical neurons respond more strongly to images presented in an unexpected order [20,21]. Further, macaques' behaviour demonstrates they will sacrifice liquid reward in exchange for information with no strategic benefit [22,23] and engage in directed exploration [24,25]. These data raise the possibility that strategic information-seeking patterns may reflect an evolutionarily ancient capacity for adaptive regulation of incoming information. ...
... First, the macaques we tested had substantial experiences with tasks for which tracking unigram statistics was more relevant (e.g. k-arm bandit tasks) [22,25,28,30]. Second, previous work has demonstrated that macaques possess sensitivity to transitional statistics in other tasks [20,21,37-39]. ...
Article
Normative learning theories dictate that we should preferentially attend to informative sources, but only up to the point that our limited learning systems can process their content. Humans, including infants, show this predicted strategic deployment of attention. Here, we demonstrate that rhesus monkeys, much like humans, attend to events of moderate surprisingness over both more and less surprising events. They do this in the absence of any specific goal or contingent reward, indicating that the behavioural pattern is spontaneous. We suggest this U-shaped attentional preference represents an evolutionarily preserved strategy for guiding intelligent organisms toward material that is maximally useful for learning.
... In contrast, electrophysiological studies in the non-human primate have focused on the contributions of PCC to cognitive control (Hayden et al., 2010), decision making (McCoy et al., 2003; McCoy and Platt, 2005; Pearson et al., 2009), and value judgement (Hayden et al., 2008; Heilbronner et al., 2011) - conditions thought to anti-correlate with PCC engagement in humans. Together, single-unit recordings in the macaque PCC during economic decisions have led to a hypothesized role in strategy selection (Pearson et al., 2009; Pearson et al., 2011; Heilbronner and Platt, 2013), further emphasizing a much more executive function, in apparent contrast to the common focus of human data. However, studies of human decision making and value judgements consistently observe PCC engagement (Bartra et al., 2013; Clithero and Rangel, 2014; Oldham et al., 2018), yet such findings have received limited attention (compared to anterior cingulate cortex) and therefore limited integration with studies of the DMN (Acikalin et al., 2017). ...
Preprint
Posterior cingulate cortex (PCC) is an enigmatic region implicated in psychiatric and neurological disease, yet its role in cognition remains unclear. Human studies link PCC to episodic memory and default mode network (DMN), while findings from the non-human primate emphasize executive processes more associated with the cognitive control network (CCN) in humans. We hypothesized this difference reflects an important functional division between dorsal (executive) and ventral (episodic) PCC. To test this, we utilized human intracranial recordings of population and single unit activity targeting dorsal PCC during an alternated executive/episodic processing task. Dorsal PCC population responses were significantly enhanced for executive, compared to episodic, task conditions, consistent with the CCN. Single unit recordings, however, revealed four distinct functional types with unique executive (CCN) or episodic (DMN) response profiles. Our findings provide critical electrophysiological data from human PCC, bridging incongruent views within and across species, furthering our understanding of PCC function.
... However, recent findings lead to a task-positive view that DMN is actively involved in tasks [1,77,78], situated at the top of a hierarchy through interaction with sensorimotor network in lower hierarchy [56,78]. Furthermore, our interpretation comports with recent studies that suggested that DMN provides top-down predictions over a slower time scale [79-81] and receives contemporaneous phasic bottom-up signals [81]. ...
... Signal decreases in PCC/PCu and RSC-Lt in the present study were temporally-precise in relation to REMs and may be related to inhibition of DMN, originating from structures with activation time-locked to REMs (i.e., message passing from low to high hierarchical levels), rather than permissive deactivation originating from DMN (i.e., message passing from high to low hierarchical levels) (Figure S3). Evidence supporting downward tonic [79-81] and upward phasic message passing to and from DMN [81] fits well with the present findings. However, correlational studies (including ours) do not establish direction of causal influences. Employing magnetoencephalography (MEG) may support inferences about directed (effective) connectivity between TPN and DMN. ...
Article
System-specific brain responses—time-locked to rapid eye movements (REMs) in sleep—are characteristically widespread, with robust and clear activation in the primary visual cortex and other structures involved in multisensory integration. This pattern suggests that REMs underwrite hierarchical processing of visual information in a time-locked manner, where REMs index the generation and scanning of virtual-world models, through multisensory integration in dreaming—as in awake states. Default mode network (DMN) activity increases during rest and reduces during various tasks including visual perception. The implicit anticorrelation between the DMN and task-positive network (TPN)—that persists in REM sleep—prompted us to focus on DMN responses to temporally-precise REM events. We timed REMs during sleep from the video recordings and quantified the neural correlates of REMs—using functional MRI (fMRI)—in 24 independent studies of 11 healthy participants. A reanalysis of these data revealed that the cortical areas exempt from widespread REM-locked brain activation were restricted to the DMN. Furthermore, our analysis revealed a modest temporally-precise REM-locked decrease—phasic deactivation—in key DMN nodes, in a subset of independent studies. These results are consistent with hierarchical predictive coding; namely, permissive deactivation of DMN at the top of the hierarchy (leading to the widespread cortical activation at lower levels; especially the primary visual cortex). Additional findings indicate REM-locked cerebral vasodilation and suggest putative mechanisms for dream forgetting.
... In particular, it has been implicated in working memory (Vatansever et al., 2015), task switching (Crittenden et al., 2015), attentional shifting (Arsenault et al., 2018), and creative cognition (Beaty et al., 2016). Of particular relevance to the present study, neurons in posterior cingulate, a default mode area, have been implicated in performance monitoring (Heilbronner and Platt, 2013) and exploration (Pearson et al., 2009). There is also prior evidence of dynamic interactions between default, frontoparietal, and dorsal attention systems, with the frontoparietal system potentially regulating activity in the other two systems in order to adjust the balance between internally-generated (default) and externally-directed (dorsal attention) processing (Beaty et al., 2016; Dixon et al., 2018, 2017; Smallwood et al., 2012). ...
... Additionally, we cannot separate effects of exploration from more general effects of attentional shifting. While LC-NE-linked effects on attentional processes are well-known and in some sense are partly constitutive of its influence on exploratory state (Aston-Jones and Cohen, 2005; Corbetta et al., 2008; McGinley et al., 2015; Sara and Bouret, 2012), exploration has been isolated from switching at the single-neuron level (Pearson et al., 2009), so it will be important to better delineate the boundaries of these different processes and states in the future. ...
Article
There is growing interest in how neuromodulators shape brain networks. Recent neuroimaging studies provide evidence that brainstem arousal systems, such as the locus coeruleus-norepinephrine system (LC-NE), influence functional connectivity and brain network topology, suggesting they have a role in flexibly reconfiguring brain networks in order to adapt behavior and cognition to environmental demands. To date, however, the relationship between brainstem arousal systems and functional connectivity has not been assessed within the context of a task with an established relationship between arousal and behavior, with most prior studies relying on incidental variations in arousal or pharmacological manipulation and static brain networks constructed over long periods of time. These factors have likely contributed to a heterogeneity of effects across studies. To address these issues, we took advantage of the association between LC-NE-linked arousal and exploration to probe the relationships between exploratory choice, arousal—as measured indirectly via pupil diameter—and brain network dynamics. Exploration in a bandit task was associated with a shift toward fewer, more weakly connected modules that were more segregated in terms of connectivity and topology but more integrated with respect to the diversity of cognitive systems represented in each module. Functional connectivity strength decreased, and changes in connectivity were correlated with changes in pupil diameter, in line with the hypothesis that brainstem arousal systems influence the dynamic reorganization of brain networks. More broadly, we argue that carefully aligning dynamic network analyses with task designs can increase the temporal resolution at which behaviorally- and cognitively-relevant modulations can be identified, and offer these results as a proof of concept of this approach.
... Participants provided written informed consent and this protocol was approved by Duke University's and University of Arkansas for Medical Sciences' Institutional Review Boards. 6-Armed bandit task (6ABT): This version of the "restless bandit" task was adapted from previous studies [39-45] and has been published previously [46]. On each trial, six bandit options were depicted on a computer screen and participants selected one to play by pressing a corresponding number on the keypad. ...
... Modeling of the bandit task: Choices made in the bandit task were classified as exploratory or exploitative according to a model-based account of participants' individual choices (previously described in [39,44,46,49]). Four reinforcement learning models, which each calculate the estimated bandit option pay-offs differently, were initially fit to the participants' data and compared using the Bayesian Information Criterion (BIC). The BIC is a test of the efficiency of the reinforcement learning model for predicting the data (smaller values represent better fit). ...
Article
The ability to maximize rewards and minimize the costs of obtaining them is vital to making advantageous explore/exploit decisions. Exploratory decisions are theorized to be greater among individuals with attention-deficit/hyperactivity disorder (ADHD), potentially due to deficient catecholamine transmission. Here, we examined the effects of ADHD status and methylphenidate, a common ADHD medication, on explore/exploit decisions using a 6-armed bandit task. We hypothesized that ADHD participants would make more exploratory decisions than controls, and that MPH would reduce group differences. On separate study days, adults with (n = 26) and without (n = 23) ADHD completed the bandit task at baseline, and after methylphenidate or placebo in counter-balanced order. Explore/exploit decisions were modeled using reinforcement learning algorithms. ADHD participants made more exploratory decisions (i.e., chose options without the highest expected reward value) and earned fewer points than controls in all three study days, and methylphenidate did not affect these outcomes. Baseline exploratory choices were positively associated with hyperactive ADHD symptoms across all participants. These results support several theoretical models of increased exploratory choices in ADHD and suggest the unexplained variance in ADHD decisions may be due to less value tracking. The inability to suppress actions with little to no reward value may be a key feature of hyperactive ADHD symptoms.
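The classification rule described in this entry (a choice counts as exploratory when it is not the option with the highest expected reward value) and the BIC comparison can be illustrated with a minimal sketch. This is not the cited study's method: the delta-rule learner, the learning rate `alpha`, the function names, and the convention that ties count as exploitation are all assumptions made for illustration.

```python
import math


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion, k*ln(n) - 2*ln(L); smaller is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def classify_choices(choices, rewards, n_options, alpha=0.3):
    """Label each trial 'exploit' or 'explore'.

    Assumption: value estimates follow a simple delta rule with a
    hypothetical learning rate `alpha`. A trial is 'exploit' when the
    chosen option's estimate equals the current maximum (ties count as
    exploitation); otherwise it is 'explore'.
    """
    values = [0.0] * n_options
    labels = []
    for choice, reward in zip(choices, rewards):
        # Label using the estimates held *before* this trial's outcome.
        labels.append("exploit" if values[choice] == max(values) else "explore")
        # Delta-rule update of the chosen option only.
        values[choice] += alpha * (reward - values[choice])
    return labels


if __name__ == "__main__":
    print(classify_choices([0, 0, 1, 0], [1, 1, 0, 1], n_options=3))
    print(bic(log_likelihood=-10.0, n_params=2, n_obs=100))
```

In practice the value estimates would come from whichever fitted model has the lowest BIC, and each candidate model's log-likelihood would be computed from the participant's actual choice sequence.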
... Here, we examine switching-related brain activity at a fine grain using two novel IBL tasks, which were deployed in separate studies. In the first study (N = 16), we sought to explore the role of the DMN in switching by manipulating the reconfiguration demands along three broad dimensions: (1) the visual-perceptual distance of the switches, (2) the discriminability of the stimuli in the switched-to rule set and (3) the predictability of when switches would occur within the sequence of task events. In the second study (N = 16), we examined the relationship between predictability of switching and DMN activity in greater detail. ...
... Single-neuron recordings from these regions in rhesus macaques have shown suppression of activity when they had to switch tasks guided by an explicit signal [31], as well as when the animals were engaged on a task [30]. Conversely, when the animals had to perform a dynamic foraging task without exogenous cues, increases in neuronal firing predicted behavioural shifts [46]. Internally guided and predictable switches may be considered analogous, as both afford the opportunity to internally prepare for the switch occurring at a particular point in time. ...
Preprint
The default-mode network (DMN) has been primarily associated with internally-directed and self-relevant cognition. This perspective is expanding to recognise its importance in executive behaviours like switching. We investigated the effect different task-switching manipulations have on DMN activation in two studies with novel fMRI paradigms. In the first study, the paradigm manipulated visual discriminability, visuo-perceptual distance and sequential predictability during switching. Increased posterior cingulate/precuneus (PCC/PrCC) activity was evident during switching; critically, this was strongest when the occurrence of the switch was predictable. In the second study, we sought to replicate and further investigate this switch-related effect with a fully factorial design manipulating sequential, spatial and visual-feature predictability. Whole-brain analysis again identified a PCC/PrCC-centred cluster that was more active for sequentially predictable versus unpredictable switches, but not for the other predictability dimensions. We propose PCC/PrCC DMN subregions may play a prominent executive role in mapping the sequential structure of complex tasks.
... In particular, it has been implicated in working memory (Vatansever et al., 2015), task switching (Crittenden et al., 2015), attentional shifting (Arsenault, Caspari, Vandenberghe, & Vanduffel, 2017), and creative cognition (Beaty, Benedek, Silvia, & Schacter, 2016). Of particular relevance to the present study, neurons in posterior cingulate, a key DMN node, have been implicated in performance monitoring (Heilbronner & Platt, 2013) and exploration (Pearson, Hayden, Raghavachari, & Platt, 2009). There is also prior evidence of dynamic interactions between default, frontoparietal, and dorsal attention systems, with the frontoparietal network potentially regulating activity in the other two networks in order to adjust the balance between internally-generated (default) and externally-directed (dorsal attention) processing (Beaty et al., 2016; Dixon et al., 2017, 2018; Smallwood, Brown, Baird, & Schooler, 2012). ...
... we cannot separate effects of exploration from more general effects of attentional shifting. While LC-NE-linked effects on attentional processes are well-known and in some sense are partly constitutive of its influence on exploratory state (Aston-Jones & Cohen, 2005; Corbetta, Patel, & Shulman, 2008; McGinley et al., 2015; Sara & Bouret, 2012), exploration has been isolated from switching at the single-neuron level (Pearson et al., 2009), so it will be important to better delineate the boundaries of these different processes/states in the future. ...
Article
In order to adapt to changing and uncertain environments, humans and other organisms must balance stability and flexibility in learning and behavior. Stability is necessary to learn environmental regularities and support ongoing behavior, while flexibility is necessary when beliefs need to be revised or behavioral strategies need to be changed. Adjusting the balance between stability and flexibility must often be based on endogenously generated decisions that are informed by information from the environment but not dictated explicitly. This dissertation examines the neurobiological bases of such endogenous flexibility, focusing in particular on the role of prefrontally-mediated cognitive control processes and the neuromodulatory actions of dopaminergic and noradrenergic systems. In the first study (Chapter 2), we examined the role of frontostriatal circuits in instructed reinforcement learning. In this paradigm, inaccurate instructions are given prior to trial-and-error learning, leading to bias in learning and choice. Abandoning the instructions thus necessitates flexibility. We utilized transcranial direct current stimulation over dorsolateral prefrontal cortex to try to establish a causal role for this area in this bias. We also assayed two genes, the COMT Val158Met genetic polymorphism and the DAT1/SLC6A3 variable number tandem repeat, which affect prefrontal and striatal dopamine, respectively. The results support the role of prefrontal cortex in biasing learning, and provide further evidence that individual differences in the balance between prefrontal and striatal dopamine may be particularly important in the tradeoff between stability and flexibility. In the second study (Chapter 3), we assess the neurobiological mechanisms of stability and flexibility in the context of exploration, utilizing fMRI to examine dynamic changes in functional brain networks associated with exploratory choices. We then relate those changes to changes in norepinephrine activity, as measured indirectly via pupil diameter. We find tentative support for the hypothesis that increased norepinephrine activity around exploration facilitates the reorganization of functional brain networks, potentially providing a substrate for flexible exploratory states. Together, this work provides further support for the framework that stability and flexibility entail both costs and benefits, and that optimizing the balance between the two involves interactions of learning and cognitive control systems under the influence of catecholamines.
... In changing environments, decision-makers balance the exploitation of valuable strategies with exploration. That is, they occasionally deviate from previous rules in order to sample alternative options and learn about the environment [7-12]. In some algorithms for exploration, the decision to explore is gated by uncertainty about the correct action [9,11,13]. ...
... following a rule) and one associated with rapid samples across feature dimensions with the same half-life as random choices. The discretization of the latent goal states differentiates the HMM from other models, such as a Kalman filter or reinforcement learning models [10,13,51], which would assume some continuous latent state space. However, rules in this task are discrete by design and behavior was well-described by a mixture of discrete states (S3 Fig). ...
Article
In many cognitive tasks, lapses (spontaneous errors) are tacitly dismissed as the result of nuisance processes like sensorimotor noise, fatigue, or disengagement. However, some lapses could also be caused by exploratory noise: randomness in behavior that facilitates learning in changing environments. If so, then strategic processes would need only up-regulate (rather than generate) exploration to adapt to a changing environment. This view predicts that more frequent lapses should be associated with greater flexibility because these behaviors share a common cause. Here, we report that when rhesus macaques performed a set-shifting task, lapse rates were negatively correlated with perseverative error frequency across sessions, consistent with a common basis in exploration. The results could not be explained by local failures to learn. Furthermore, chronic exposure to cocaine, which is known to impair cognitive flexibility, did increase perseverative errors, but, surprisingly, also improved overall set-shifting task performance by reducing lapse rates. We reconcile these results with a state-switching model in which cocaine decreases exploration by deepening attractor basins corresponding to rule states. These results support the idea that exploratory noise contributes to lapses, affecting rule-based decision-making even when it has no strategic value, and suggest that one key mechanism for regulating exploration may be the depth of rule states.
... Our analysis revealed that explicit knowledge of the rotation in the EL task was associated with greater functional connectivity between somatomotor cortices and, most consistently, regions in the limbic and control B/C networks. The latter control network comprises the posterior cingulate cortex, which has been implicated in the top-down selection of response strategies spanning multiple trials (Pearson et al., 2009, 2011); and the precuneus, which has been implicated in the voluntary deployment of spatial attention (Ogiso et al., 2000; Wenderoth et al., 2005; Krumbholz et al., 2009). Control network B prominently includes the dorsolateral prefrontal cortex, which in particular has been implicated in the use of spatial information for action selection (see Hoshi, 2006, for a review). ...
Preprint
Full-text available
Motor learning is supported by multiple systems adapted to processing different forms of sensory information (e.g., reward versus error feedback), and by higher-order systems supporting strategic processes. Yet, the extent to which these systems recruit shared versus separate neural pathways is poorly understood. To elucidate these pathways, we separately studied error-based (EL) and reinforcement-based (RL) motor learning in two functional MRI experiments in the same human subjects. We find that EL and RL occupy opposite ends of a neural axis broadly separating cerebellar and striatal connectivity, respectively, with somatomotor cortex, and that alignment of this axis to each task is related to performance. Further, we identify a separate neural axis that is associated with strategy use during EL, and show that the expression of this same axis during RL predicts better performance. Together, these results offer a macroscale view of the common versus distinct neural architectures supporting different learning systems.
... This region, located in the posteromedial cortex, has not received the same amount of scholarly scrutiny from decision neuroscientists as cOFC. Nevertheless, the PCC has a confirmed spatial repertoire [28][29][30][31][32] and plays a fundamental economic role [29][33][34][35][36][37]. That is, while PCC has consistent responses to outcomes, those responses are spatially selective, perhaps due to the strong interactions between this region and the parietal cortex and medial temporal lobes [38][39][40]. ...
Article
Full-text available
Economic choice requires many cognitive subprocesses, including stimulus detection, valuation, motor output, and outcome monitoring; many of these subprocesses are associated with the central orbitofrontal cortex (cOFC). Prior work has largely assumed that the cOFC is a single region with a single function. Here, we challenge that unified view with convergent anatomical and physiological results from rhesus macaques. Anatomically, we show that the cOFC can be subdivided according to its much stronger (medial) or weaker (lateral) bidirectional anatomical connectivity with the posterior cingulate cortex (PCC). We call these subregions cOFCm and cOFCl, respectively. These two subregions have notable functional differences. Specifically, cOFCm shows enhanced functional connectivity with PCC, as indicated by both spike-field coherence and mutual information. The cOFCm-PCC circuit, but not the cOFCl-PCC circuit, shows signatures of relaying choice signals from a non-spatial comparison framework to a spatially framed organization and shows a putative bidirectional mutually excitatory pattern.
... Although our approach identified VIPACC with clear stimulus preferences, the function of this interneuron type during anxiogenic environments and social interactions may be more complex than simply encoding specific stimuli. For example, recordings in the rodent PFC and ACC during various tasks and learning paradigms have demonstrated that the activity of some neurons correlates to higher-order processes related to cognitive function, including learning rules, generalization, effort, and goal-directed behavior [2,5,6,13,14][67][68][69][70][71][72][73][74][75][76][77]. Additionally, an open-selective cell, for example, may not necessarily encode the aversiveness of the environment, but other sensorimotor or affective aspects that are associated with the behavior. ...
Article
Full-text available
A hallmark of the anterior cingulate cortex (ACC) is its functional heterogeneity. Functional and imaging studies revealed its importance in the encoding of anxiety-related and social stimuli, but it is unknown how microcircuits within the ACC encode these distinct stimuli. One type of inhibitory interneuron, which is positive for vasoactive intestinal peptide (VIP), is known to modulate the activity of pyramidal cells in local microcircuits, but it is unknown whether VIP cells in the ACC (VIPACC) are engaged by particular contexts or stimuli. Additionally, recent studies demonstrated that neuronal representations in other cortical areas can change over time at the level of the individual neuron. However, it is not known whether stimulus representations in the ACC remain stable over time. Using in vivo Ca²⁺ imaging and miniscopes in freely behaving mice to monitor neuronal activity with cellular resolution, we identified individual VIPACC that preferentially activated to distinct stimuli across diverse tasks. Importantly, although the population-level activity of the VIPACC remained stable across trials, the stimulus-selectivity of individual interneurons changed rapidly. These findings demonstrate marked functional heterogeneity and instability within interneuron populations in the ACC. This work contributes to our understanding of how the cortex encodes information across diverse contexts and provides insight into the complexity of neural processes involved in anxiety and social behavior.
... The basic principle of this model is that choices arise from an accumulation process determined by the difference between the values of options A and B, with a rate μ = ΔV (where ΔV = VA − VB is the difference between the values) (Ratcliff, 1978; Ratcliff & McKoon, 2008). This model can only be implemented with two alternative options (A and B) and has been widely applied to study choices in perceptual and memory-based decision making (Kinchla & Smyzer, 1967; Link & Heath, 1975; Pearson, Hayden, Raghavachari, & Platt, 2009). ...
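The accumulation process described in this excerpt can be illustrated with a minimal simulation. The sketch below assumes a standard two-bound drift-diffusion process in which evidence drifts at rate μ = VA − VB and is perturbed by Gaussian noise until it reaches a bound. The function names, the threshold, the noise level, and the time step are all hypothetical choices made for illustration and are not taken from the cited works.

```python
import random


def simulate_ddm(value_a, value_b, threshold=1.0, noise_sd=1.0,
                 dt=0.005, max_time=10.0, rng=None):
    """Simulate one two-option trial; returns (choice, decision_time).

    All parameter values are illustrative assumptions.
    """
    rng = rng or random.Random()
    drift = value_a - value_b          # mu = delta V = VA - VB
    step_sd = noise_sd * dt ** 0.5     # noise scales with sqrt(dt)
    evidence, t = 0.0, 0.0
    while abs(evidence) < threshold and t < max_time:
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
    if evidence >= threshold:
        return "A", t
    if evidence <= -threshold:
        return "B", t
    return None, t                     # no bound reached before max_time


def fraction_choosing_a(value_a, value_b, n_trials=200, seed=0):
    """Fraction of decided trials on which option A was chosen."""
    rng = random.Random(seed)
    choices = [simulate_ddm(value_a, value_b, rng=rng)[0]
               for _ in range(n_trials)]
    decided = [c for c in choices if c is not None]
    return sum(c == "A" for c in decided) / len(decided)
```

With a large value difference the simulated agent chooses A on most trials, whereas equal values give roughly even choices. Because each bound corresponds to one option, the sketch also makes visible why this formulation is limited to two alternatives.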
... We produce multiple versions of each class of model, in order to assess how they could be reliably distinguished in behavioral or electrophysiological data [4,21,22]. Our simulations suggest that any observation of strongly reduced sampling-bout durations at one stimulus following the sampling of a highly hedonic stimulus would indicate animals operating with the "entice to stay" class of network. ...
Article
Full-text available
Decisions as to whether to continue with an ongoing activity or to switch to an alternative are a constant in an animal's natural world, and in particular underlie foraging behavior and performance in food preference tests. Stimuli experienced by the animal both impact the choice and are themselves impacted by the choice, in a dynamic back and forth. Here, we present model neural circuits, based on spiking neurons, in which the choice to switch away from ongoing behavior instantiates this back and forth, arising as a state transition in neural activity. We analyze two classes of circuit, which differ in whether state transitions result from a loss of hedonic input from the stimulus (an "entice to stay" model) or from aversive stimulus-input (a "repel to leave" model). In both classes of model, we find that the mean time spent sampling a stimulus decreases with increasing value of the alternative stimulus, a fact that we linked to the inclusion of depressing synapses in our model. The competitive interaction is much greater in "entice to stay" model networks, which has qualitative features of the marginal value theorem, and thereby provides a framework for optimal foraging behavior. We offer suggestions as to how our models could be discriminatively tested through the analysis of electrophysiological and behavioral data.
... humans, monkeys can seek information for its inherent value. For example, macaques will sacrifice some liquid reward in exchange for information with no strategic benefit [20,21] and engage in directed exploration [22,23]. These data raise the possibility that strategic ...
Preprint
Normative learning theories dictate that we should preferentially attend to informative sources, but only up to the point that our limited learning systems can process their content. Humans, including infants, show this predicted strategic deployment of attention. Here we demonstrate that rhesus monkeys, much like humans, attend to events of moderate surprisingness over both more and less surprising events. They do this in the absence of any specific goal or contingent reward, indicating that the behavioral pattern is spontaneous. We suggest this U-shaped attentional preference represents an evolutionarily preserved strategy for guiding intelligent organisms toward material that is maximally useful for learning.
... In this way, neural patterns across the DMN may provide information regarding the degree to which specific brain contexts are predictable, a metric that would be useful, for example, in shifting between exploratory and exploitative modes of foraging behaviour [111]. Consistent with this view, studies in non-human primates suggest that neurons in the PMC help map the exploration-exploitation trade-off [112]. In addition, studies of reinforcement learning, which can be readily characterized by prediction error models [113], identify activity within medial prefrontal regions of the DMN [114]. ...
Article
Full-text available
The default mode network (DMN) is a set of widely distributed brain regions in the parietal, temporal and frontal cortex. These regions often show reductions in activity during attention-demanding tasks but increase their activity across multiple forms of complex cognition, many of which are linked to memory or abstract thought. Within the cortex, the DMN has been shown to be located in regions furthest away from those contributing to sensory and motor systems. Here, we consider how our knowledge of the topographic characteristics of the DMN can be leveraged to better understand how this network contributes to cognition and behaviour. Regions of the default mode network (DMN) are distributed across the brain and show patterns of activity that have linked them to various different functional domains. In this Perspective, Smallwood and colleagues consider how an examination of the topographic characteristics of the DMN can shed light on its contribution to cognition.
... Moreover, enhanced striatal activity was demonstrated in unmedicated patients (Liu et al., 2017), while no differences in striatum activity were found between medicated (Chase et al., 2013) or unmedicated (Rothkirch et al., 2017) (Berman et al., 2011) and schizophrenia (Holt et al., 2011), respectively. Multiple functional roles of the PCC have been attributed to internally-directed cognition (Buckner, Andrews-Hanna, & Schacter, 2008;Raichle et al., 2001), conscious awareness (Cavanna, 2007;Vogt & Laureys, 2005), mediation between internal and external states (Mesulam, 1998), as well as change detection (Hayden, Nair, McCoy, & Platt, 2008;Pearson, Hayden, Raghavachari, & Platt, 2009;Pearson, Heilbronner, Barack, Hayden, & Platt, 2011). More recently a dynamic systems approach was proposed (Leech & Sharp, 2014), which suggests that depending on how broad or narrowly focused and how internally or externally driven the attentional state, the activation or deactivation of the PCC may signal connecting regions associated with other networks. ...
Article
Full-text available
To make adaptive decisions under uncertainty, individuals need to actively monitor the discrepancy between expected outcomes and actual outcomes, known as prediction errors. Reward‐based learning deficits have been shown in both depression and schizophrenia patients. For this study, we compiled studies that investigated prediction error processing in depression and schizophrenia patients and performed a series of meta‐analyses. In both groups, positive t‐maps of prediction error tend to yield striatum activity across studies. The analysis of negative t‐maps of prediction error revealed two large clusters within the right superior and inferior frontal lobes in schizophrenia and the medial prefrontal cortex and bilateral insula in depression. The concordant posterior cingulate activity was observed in both patient groups, more prominent in the depression group and absent in the healthy control group. These findings suggest a possible role in dopamine‐rich areas associated with the encoding of prediction errors in depression and schizophrenia. Positive and negative PE in depression, schizophrenia, and healthy controls.
... observed in the current study is associated with the dynamic modulation of the width of the attentional focus and the adjustment of behavioural strategies after trial-based feedback (Hayden et al. 2008; Pearson et al. 2009; Leech and Sharp 2014). For the SFG with its motor-associated areas, previous studies have already suggested that these areas are not only involved in response selection and motor response preparation, but are also relevant for stimulus-related processes. ...
Article
Full-text available
To respond as quickly as possible in a given task is a widely used instruction in cognitive neuroscience; however, the neural processes modulated by this common experimental procedure remain largely elusive. We investigated the underlying neurophysiological processes combining EEG signal decomposition (residue iteration decomposition, RIDE) and source localization. We show that trial-based response speed instructions enhance behavioural performance in conflicting trials, but slightly impair performance in non-conflicting trials. The modulation seen in conflicting trials was found at several coding levels in EEG data using RIDE. In the S-cluster N2 time window, this modulation was associated with modulated activation in the posterior cingulate cortex and the superior frontal gyrus. Further, in the C-cluster P3 time window, this modulation was associated with modulated activation in the middle frontal gyrus. Interestingly, in the R-cluster P3 time window this modulation was strongest according to statistical effect sizes, associated with modulated activity in the primary motor cortex. Reaction-time feedback mainly modulates response motor execution processes, while attentional and response selection processes are less affected. The study underlines the importance of being aware of how experimental instructions influence the behaviour and neurophysiological processes.
... However, it is not clear that we would have expected these neural results from what we know about exploration. It is true that some previous studies have reported higher net activity during exploration compared to exploitation [82,83], but these studies conflated exploration with errors of task performance. Other studies, using different methods, do not report a net change in neural activity with exploration in decision-making regions, much less the kinds of protracted effects we show here [84][85][86][87]. ...
Article
Full-text available
We have the capacity to follow arbitrary stimulus–response rules, meaning simple policies that guide our behavior. Rule identity is broadly encoded across decision-making circuits, but there are less data on how rules shape the computations that lead to choices. One idea is that rules could simplify these computations. When we follow a rule, there is no need to encode or compute information that is irrelevant to the current rule, which could reduce the metabolic or energetic demands of decision-making. However, it is not clear if the brain can actually take advantage of this computational simplicity. To test this idea, we recorded from neurons in 3 regions linked to decision-making, the orbitofrontal cortex (OFC), ventral striatum (VS), and dorsal striatum (DS), while macaques performed a rule-based decision-making task. Rule-based decisions were identified via modeling rules as the latent causes of decisions. This left us with a set of physically identical choices that maximized reward and information, but could not be explained by simple stimulus–response rules. Contrasting rule-based choices with these residual choices revealed that following rules (1) decreased the energetic cost of decision-making; and (2) expanded rule-relevant coding dimensions and compressed rule-irrelevant ones. Together, these results suggest that we use rules, in part, because they reduce the costs of decision-making through a distributed representational warping in decision-making circuits.
... guess" for the range of tonic dopamine. Recordings of primate midbrain dopaminergic neurons show a firing rate increase of ~40% when reward delivery is random (Fiorillo, Tobler, and Schultz 2003). Dopamine efflux into the NAc, as measured by microdialysis in rats making forced high-risk choices, increases to very similar levels (St Onge et al. 2012). Pearson et al. (2009) also identified neurons in posterior cingulate cortex which increase their background firing rate from ~4 to 6 Hz during explorative decision making. On the basis of these data we assumed that any increase in tonic firing rate and corresponding striatal dopamine was likely to be relatively modest and chose an initial value for ...
Preprint
Full-text available
To make optimal decisions in uncertain circumstances, flexible adaptation of behaviour is required: exploring alternatives when the best choice is unknown, and exploiting what is known when that is best. Using a detailed computational model of the basal ganglia, we propose that switches between exploratory and exploitative decisions can be mediated by the interaction between tonic dopamine and cortical input to the basal ganglia. We show that a biologically detailed action selection circuit model of the basal ganglia, endowed with dopamine-dependent striatal plasticity, can optimally solve the explore-exploit problem, estimating the true underlying state of a noisy Gaussian diffusion process. Critical to the model’s performance was a fluctuating level of tonic dopamine which increased under conditions of uncertainty. With an optimal range of tonic dopamine, explore-exploit decision making was mediated by the effects of tonic dopamine on the precision of the model action selection mechanism. Under conditions of uncertain reward pay-out, the model’s reduced selectivity allowed disinhibition of multiple alternative actions to be explored at random. Conversely, when uncertainty about reward pay-out was low, the selectivity of the action selection circuit was enhanced, facilitating exploitation of the high value choice. When integrated with phasic dopamine-dependent influences on cortico-striatal plasticity, the model’s performance was at the level of the Kalman filter, which provides an optimal solution for the task. Our model provides an integrative account of the relationship between phasic and tonic dopamine and the action selection function of the basal ganglia and supports the idea that this subcortical neural circuit may have evolved to facilitate decision making in non-stationary reward environments, allowing a number of experimental predictions with relevance to abnormal decision making in neuropsychiatric and neurological disease.
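The Kalman filter benchmark mentioned in this abstract can be made concrete with a short sketch. It assumes the textbook scalar case: a hidden reward mean that follows a Gaussian random walk and is observed with Gaussian noise. The function name and the parameter names (`diffusion_var`, `obs_noise_var`) are hypothetical, and this is not the basal ganglia model itself, only the reference solution it is compared against.

```python
def kalman_update(mean, var, observation, obs_noise_var, diffusion_var):
    """One scalar Kalman step for a random-walk state observed with noise.

    Returns the posterior mean, posterior variance, and the gain used.
    """
    # Prediction: the hidden state diffuses, so uncertainty grows.
    prior_var = var + diffusion_var
    # The gain weighs the new observation against the prior estimate.
    gain = prior_var / (prior_var + obs_noise_var)
    # Correction: move the estimate toward the observation.
    new_mean = mean + gain * (observation - mean)
    new_var = (1.0 - gain) * prior_var
    return new_mean, new_var, gain


def track_rewards(observations, obs_noise_var=1.0, diffusion_var=0.5,
                  initial_mean=0.0, initial_var=10.0):
    """Run the filter over a sequence of observed pay-outs for one option."""
    mean, var = initial_mean, initial_var
    estimates = []
    for obs in observations:
        mean, var, _ = kalman_update(mean, var, obs,
                                     obs_noise_var, diffusion_var)
        estimates.append((mean, var))
    return estimates
```

In this formulation the posterior variance is an explicit measure of uncertainty about an option's pay-out, which is the quantity the abstract links to the exploration side of the explore-exploit balance.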
... Another related possibility is that DMN activity is more closely associated with an exploratory mode of cognition, whereas default suppression is more closely associated with an exploitative mode. In turn, the PCC might mediate shifts between these two modes (Pearson, Hayden, Raghavachari, & Platt, 2009;Pearson, Heilbronner, Barack, Hayden, & Platt, 2011). ...
Preprint
Full-text available
Neuroeconomics is the study of the neurobiological bases of subjective preferences and choices. We present a novel framework that synthesizes findings from the literatures on neuroeconomics and creativity to provide a neurobiological description of creative cognition. It proposes that value-based decision-making processes and activity in the locus coeruleus-norepinephrine (LC-NE) neuromodulatory system underlie creative cognition, as well as the large-scale brain network dynamics shown to be associated with creativity. This framework allows us to re-conceptualize creative cognition as driven by value-based decision making, in the process providing several falsifiable hypotheses that can further our understanding of creativity, decision making, and brain network dynamics.
... The role of uncertainty-guided exploration has come to occupy an increasingly important place in theories of reinforcement learning (Gershman & Niv, 2015; Knox, Otto, Stone, & Love, 2011; Navarro, Newell, & Schulze, 2016; Payzan-LeNestour & Bossaerts, 2011; Pearson, Hayden, Raghavachari, & Platt, 2009; Schulz et al., 2015; Zhang & Yu, 2013), superseding earlier models of exploratory choice based on a fixed source of decision noise, as in ε-greedy and softmax policies (e.g., Daw et al., 2006). This shift has been accompanied by a deeper understanding of how reinforcement learning circuits in the basal ganglia compute, represent, and transmit uncertainty to downstream decision making circuits (Gershman, 2017; Lak, Nomoto, Keramati, Sakagami, & Kepecs, 2017; Starkweather, Babayan, Uchida, & Gershman, 2017). ...
Preprint
Full-text available
In order to discover the most rewarding actions, agents must collect information about their environment, potentially foregoing reward. The optimal solution to this “explore-exploit” dilemma is often computationally challenging, but principled algorithmic approximations exist. These approximations utilize uncertainty about action values in different ways. Some random exploration algorithms scale the level of choice stochasticity with the level of uncertainty. Other directed exploration algorithms add a “bonus” to action values with high uncertainty. Random exploration algorithms are sensitive to total uncertainty across actions, whereas directed exploration algorithms are sensitive to relative uncertainty. This paper reports a multi-armed bandit experiment in which total and relative uncertainty were orthogonally manipulated. We found that humans employ both exploration strategies, and that these strategies are independently controlled by different uncertainty computations.
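The contrast between the two families of algorithms in this abstract can be sketched for a two-armed bandit. The example assumes Gaussian beliefs about each arm (a mean and a standard deviation) and uses Thompson sampling as a stand-in for random exploration and an upper-confidence-bound rule as a stand-in for directed exploration. These are illustrative choices, and the function names and the `bonus_weight` parameter are hypothetical rather than taken from the paper.

```python
import math
import random


def ucb_choice(means, sds, bonus_weight=1.0):
    """Directed exploration: add an uncertainty bonus to each value estimate."""
    scores = [m + bonus_weight * s for m, s in zip(means, sds)]
    return max(range(len(scores)), key=scores.__getitem__)


def thompson_choice(means, sds, rng):
    """Random exploration: sample each arm's value from its belief, pick the max."""
    samples = [rng.gauss(m, s) for m, s in zip(means, sds)]
    return max(range(len(samples)), key=samples.__getitem__)


def thompson_prob_first_arm(means, sds):
    """Closed-form probability that Thompson sampling picks arm 0 of two arms.

    The value difference is scaled by total uncertainty, sqrt(s0^2 + s1^2).
    """
    total_sd = math.sqrt(sds[0] ** 2 + sds[1] ** 2)
    z = (means[0] - means[1]) / total_sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

Under these assumptions, raising both standard deviations equally pushes the Thompson choice probability toward 0.5 but leaves the upper-confidence-bound choice unchanged, whereas raising only one arm's standard deviation shifts the upper-confidence-bound choice toward that arm. This mirrors the distinction the abstract draws between sensitivity to total and to relative uncertainty.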
... Importantly, the fronto-parietal control network is situated in-between the fronto-parietal attention and hippocampal-cortical memory networks, making it an important system for integrating information from these networks (Vincent et al., 2008). The mid-cingulate cortex, a structure in the network, has been shown to play a key role in regulating the focus of attention, and the posterior cingulate cortex more generally appears to be involved during the anticipation of an external event (Hayden, Nair, McCoy, & Platt, 2008;Hayden, Smith, & Platt, 2009;Leech & Sharp, 2014;Pearson, Hayden, Raghavachari, & Platt, 2009). Anticipation of an environmental change occurs when the long-term associations that are formed create expectation for perception. ...
Prior knowledge and long‐term memory can guide our attention to facilitate search for and detection of subtle targets embedded in a complex scene. A number of neuropsychological and experimental studies have investigated this effect, yet results in the field remain mixed, as there is a lack of consensus regarding the neural correlates thought to support memory‐guided attention. The purpose of this systematic review and meta‐analysis was to identify a common set of brain structures involved in memory‐guided attention. Statistical analyses were computed on functional magnetic resonance imaging (fMRI) studies that presented participants with a task that required them to detect a target or a change embedded in repeated and novel complex visual displays. After a systematic search, 10 fMRI studies met the selection criteria and were included in the analysis. The results yielded four significant clusters. Activity in right inferior parietal (Brodmann area [BA] 9) and right superior parietal (BA 7) lobes suggests involvement of a fronto‐parietal attention network, while activity in left mid‐cingulate cortex (BA 23) and right middle frontal gyrus (BA 10) suggests involvement of a fronto‐parietal control network. These findings are consistent with the notion that fronto‐parietal circuits are important for interfacing retrieved memories with attentional systems to guide search. This article is categorized under: Psychology > Memory Psychology > Learning Psychology > Attention
... Generators of this effect in the patient sample included the bilateral PCC (BA 33; including left parahippocampal gyrus, BA 12), the right inferior frontal gyrus (rIFG, BA 46), the right inferior parietal gyrus (rIPG, BA 40), and the left superior temporal gyrus (lSTG, BA 22). The PCC has a key function in the default mode network (Raichle et al., 2001) and has generally been related to enhanced recruitment of additional resources during difficult and emotional tasks (Maddock, 1999;Pearson et al., 2009), especially when adaptations to the current model of the world are necessary (i.e., congruent to incongruent condition; Pearson et al., 2011). The left PCC has further been proposed to be an interface between episodic encoding and semantic retrieval (Binder et al., 2009), while the lSTG (BA 22) is known to react to semantic violations (Friederici et al., 2003). ...
Article
Full-text available
Background Neuroscientific models of alcohol use disorders (AUD) postulate an imbalance between automatic, implicit and controlled (conscious) processes. Implicit associations towards alcohol indicate the automatically attributed appeal of alcohol‐related stimuli. First behavioral studies indicate that negative alcohol associations are less pronounced in patients compared to controls, but potential neurophysiological differences remain unexplored. This study investigates neurophysiological correlates of implicit alcohol associations in recently abstinent patients with AUD for the first time, including possible gender effects. Methods A total of 62 patients (40 males, 22 females) and 21 controls performed an alcohol valence Implicit Association Test (IAT), combining alcohol‐related pictures with positive (incongruent condition) or negative (congruent condition) words, while brain activity was recorded using 64‐channel electroencephalography. Event‐related potentials (ERP) for alcohol‐negative and alcohol‐positive trials were computed. Microstate analyses investigated the effects of group (patients, controls) and condition (incongruent, congruent), furthermore, possible gender effects in patients were analyzed. Significant effects were localized with standardized low‐resolution brain electromagnetic topography analysis (sLORETA). Results Although no behavioral group differences were found, ERPs of patients and controls were characterized by distinct microstates from 320ms onwards. ERPs between conditions differed only in patients, with higher signal strength during incongruent trials. Around 600ms controls displayed higher signal strength than patients. A gender effect mirrored this pattern with enhanced signal strength in females as opposed to male patients. Around 690ms, a group by valence interaction indicated enhanced signal strength in congruent compared to incongruent trials, which was more pronounced in controls. 
Conclusion For patients with AUD, the pattern, timing, and source localization of effects suggest greater effort regarding semantic and self‐relevant integration around 400ms during incongruent trials and attenuated emotional processing during the late positive potential (LPP) timeframe. Interestingly, this emotional attenuation seemed reduced in female patients, thus corroborating the importance of gender‐sensitive research and potentially treatment of AUD.
... In general, the anterior and posterior regions of the cingulate cortex are associated with the detection and monitoring of change or unexpected stimuli (Pearson et al., 2009;Pearson et al., 2011;Apps et al., 2012). Within the context of reward, while the anterior cingulate cortex is involved in the experience of pleasure or happiness (Lindgren et al., 2012;Matsunaga et al., 2016;Rolls et al., 2003;Suardi, Sotgiu, Costa, Cauda, & Rusconi, 2016), and value-guided decision-making (Holroyd & Coles, 2002;Kolling et al., 2016;Shenhav, Cohen, & Botvinick, 2016), the posterior cingulate cortex involves the monitoring of action-reward outcome associations (Hayden, Nair, McCoy, & Platt, 2008;Tabuchi et al., 2005). ...
Article
Full-text available
Functional magnetic resonance imaging (fMRI) studies have shown notable age‐dependent differences in reward processing. We analyzed data from a total of 554 children, 1,059 adolescents, and 1,831 adults from 70 articles. Quantitative meta‐analyses results show that adults engage an extended set of regions that include anterior and posterior cingulate gyri, insula, basal ganglia, and thalamus. Adolescents engage the posterior cingulate and middle frontal gyri as well as the insula and amygdala, whereas children show concordance in right insula and striatal regions almost exclusively. Our data support the notion of reorganization of function over childhood and adolescence and may inform current hypotheses relating to decision‐making across age. For reward processing, adults engage an extended set of regions that include anterior and posterior cingulate gyri, insula, basal ganglia, and thalamus. Adolescents engage the posterior cingulate and middle frontal gyri as well as the insula and amygdala. Children show concordance in right insula and striatal regions almost exclusively.
... The angular gyrus is also heavily implicated in number monitoring (Göbel et al., 2001) and thus may monitor reward values during exploitation (see Addicott et al., 2014). The PCC is considered to be part of the brain's valuation system and may encode reward-related information during exploitation (Lebreton et al., 2009; Bartra et al., 2013; Grueschow et al., 2015), although in primates PCC neurons were shown to signal exploratory decisions (Pearson et al., 2009). A further characterization of the hypothesized functions of specific subregions of the exploitation and exploration networks naturally requires direct experimental tests in the future. ...
Article
Full-text available
Involvement of dopamine in regulating exploration during decision-making has long been hypothesized, but direct causal evidence in humans is still lacking. Here, we use a combination of computational modeling, pharmacological intervention and functional magnetic resonance imaging to address this issue. Thirty-one healthy male participants performed a restless four-armed bandit task in a within-subjects design under three drug conditions: 150 mg of the dopamine precursor L-dopa, 2 mg of the D2 receptor antagonist haloperidol, and placebo. Choices were best explained by an extension of an established Bayesian learning model accounting for perseveration, directed exploration and random exploration. Modeling revealed attenuated directed exploration under L-dopa, while neural signatures of exploration, exploitation and prediction error were unaffected. Instead, L-dopa attenuated neural representations of overall uncertainty in insula and dorsal anterior cingulate cortex. Our results highlight the computational role of these regions in exploration and suggest that dopamine modulates how this circuit tracks accumulating uncertainty during decision-making.
... longer residence times in safe patches are due to a data censoring effect: perhaps they leave when any individual outcome is lower than some threshold. That is, they may obey a win-stay lose-shift heuristic [72][73][74][75]. To determine if subjects used this heuristic, we examined the likelihood of leaving a risky patch given the recent history of wins and losses. ...
Article
Full-text available
Rhesus macaques (Macaca mulatta) appear to be robustly risk-seeking in computerized gambling tasks typically used for electrophysiology. This behavior distinguishes them from many other animals, which are risk-averse, albeit measured in more naturalistic contexts. We wondered whether macaques’ risk preferences reflect their evolutionary history or derive from the less naturalistic elements of task design associated with the demands of physiological recording. We assessed macaques’ risk attitudes in a task that is somewhat more naturalistic than many that have previously been used: subjects foraged at four feeding stations in a large enclosure. Patches (i.e., stations) provided either stochastically or non-stochastically depleting rewards. Subjects’ patch residence times were longer at safe than at risky stations, indicating a preference for safe options. This preference was not attributable to a win-stay-lose-shift heuristic and reversed as the environmental richness increased. These findings highlight the lability of risk attitudes in macaques and support the hypothesis that the ecological validity of a task can influence the expression of risk preference.
Preprint
The catecholamines dopamine (DA) and norepinephrine (NE) have been repeatedly implicated in neuropsychiatric vulnerability, in part via their roles in mediating the decision making processes. Although the two neuromodulators share a synthesis pathway and are co-activated under states of arousal, they engage in distinct circuits and roles in modulating neural activity across the brain. However, in the computational neuroscience literature, they have been assigned similar roles in modulating the latent cognitive processes of decision making, in particular the exploration-exploitation tradeoff. Revealing how each neuromodulator contributes to this explore-exploit process will be important in guiding mechanistic hypotheses emerging from computational psychiatric approaches. To understand the differences and overlaps of the roles of these two catecholamine systems in regulating exploration and exploitation, a direct comparison using the same dynamic decision making task is needed. Here, we ran mice in a restless two-armed bandit task, which encourages both exploration and exploitation. We systemically administered a nonselective DA receptor antagonist (flupenthixol), a nonselective DA receptor agonist (apomorphine), a NE beta-receptor antagonist (propranolol), and a NE beta-receptor agonist (isoproterenol), and examined changes in exploration within subjects across sessions. We found a bidirectional modulatory effect of dopamine receptor activity on the level of exploration. Increasing dopamine activity decreased exploration and decreasing dopamine activity increased exploration. Beta-noradrenergic receptor activity also modulated exploration, but the modulatory effect was mediated by sex. Reinforcement learning model parameters suggested that dopamine modulation affected exploration via decision noise and norepinephrine modulation affected exploration via outcome sensitivity. 
Together, these findings suggested that the mechanisms that govern the transition between exploration and exploitation are sensitive to changes in both catecholamine functions and revealed differential roles for NE and DA in mediating exploration.
Article
Sensorimotor learning is a dynamic, systems-level process that involves the combined action of multiple neural systems distributed across the brain. Although much is known about the specialized cortical systems that support specific components of action (such as reaching), we know less about how cortical systems function in a coordinated manner to facilitate adaptive behavior. To address this gap, our study measured human brain activity using functional MRI (fMRI) while participants performed a classic sensorimotor adaptation task and used a manifold learning approach to describe how behavioral changes during adaptation relate to changes in the landscape of cortical activity. During early adaptation, areas in the parietal and premotor cortices exhibited significant contraction along the cortical manifold, which was associated with their increased covariance with regions in the higher-order association cortex, including both the default mode and fronto-parietal networks. By contrast, during late adaptation, when visuomotor errors had been largely reduced, a significant expansion of the visual cortex along the cortical manifold was associated with its reduced covariance with the association cortex and its increased intraconnectivity. Lastly, individuals who learned more rapidly exhibited greater covariance between regions in the sensorimotor and association cortices during early adaptation. These findings are consistent with a view that sensorimotor adaptation depends on changes in the integration and segregation of neural activity across more specialized regions of the unimodal cortex with regions in the association cortex implicated in higher-order processes. More generally, they lend support to an emerging line of evidence implicating regions of the default mode network (DMN) in task-based performance.
Article
Many challenges in life come without explicit instructions. Instead, humans need to test, select, and adapt their behavioral responses based on feedback from the environment. While reward-centric accounts of feedback processing primarily stress the reinforcing aspect of positive feedback, feedback's central function from an information-processing perspective is to offer an opportunity to correct errors, thus putting a greater emphasis on the informational content of negative feedback. Independent of its potential rewarding value, the informational value of performance feedback has recently been suggested to be neurophysiologically encoded in the dorsal portion of the posterior cingulate cortex (dPCC). To further test this association, we investigated multidimensional categorization and reversal learning by comparing negative and positive feedback in an event-related functional magnetic resonance imaging experiment. Negative feedback, compared with positive feedback, increased activation in the dPCC as well as in brain regions typically involved in error processing. Only in the dPCC, subarea d23, this effect was significantly enhanced in relearning, where negative feedback signaled the need to shift away from a previously established response policy. Together with previous findings, this result contributes to a more fine-grained functional parcellation of PCC subregions and supports the dPCC's involvement in the adaptation to behaviorally relevant information from the environment.
Article
The posterior cingulate cortex (PCC) is one of the least understood regions of the cerebral cortex. By contrast, the anterior cingulate cortex has been the subject of intensive investigation in humans and model animal systems, leading to detailed behavioural and computational theoretical accounts of its function. The time is right for similar progress to be made in the PCC given its unique anatomical and physiological properties and demonstrably important contributions to higher cognitive functions and brain diseases. Here, we describe recent progress in understanding the PCC, with a focus on convergent findings across species and techniques that lay a foundation for establishing a formal theoretical account of its functions. Based on this converging evidence, we propose that the broader PCC region contains three major subregions — the dorsal PCC, ventral PCC and retrosplenial cortex — that respectively support the integration of executive, mnemonic and spatial processing systems. This tripartite subregional view reconciles inconsistencies in prior unitary theories of PCC function and offers promising new avenues for progress. In this Perspective article, Foster and colleagues describe converging evidence supporting an anatomical and functional division of the posterior cingulate cortex into three subregions that contribute to different cognitive tasks.
Thesis
Full-text available
This thesis focuses on the Multisensory Storytelling Project, developed by the research group made by the Department of Cognitive Communication and Action Sciences of the University of Roma Tre, led by Professor Francesco Ferretti. The structure is characterized by three chapters. In the first, the foundations for the study of Cognitive Sciences are recalled, with a particular regard to the idea that stories are considered the human communication specialty, as well as taking into account that they were born to persuade our fellowmen. Narrative turns out to be the first useful brick to construct our communication system. Homo Ergaster, who was prior to Homo Sapiens, was already able to represent reality through stories. Stories require three properties: time, plot and character, each of which is associated with a specific processing system, respectively Mental Time Travel, the Executive Functions and Mindreading. The second chapter deepens spatial navigation, that is closely correlated to the time factor, together with the construction of mental scenarios. Starting from space and time relationship analysis, this thesis presents the Default Mode Network model, presented by Randy L. Buckner and Daniel C. Carroll in an article named «Self-projection and the brain» (2007). It identifies, in self-projection, an element capable of uniting and linking the projective systems. Importantly, a crucial role is played by the retrosplenial complex, that eventually activates while observing and imagining scenarios; in cases of damage, some difficulties could develop in terms of navigation. Considering that sight favors the development of a reality spatial representation, we wonder how blind people transform the tactile experience into the visual one. Ferretti (2008) proposes the idea that blind people could simulate some vision features, by replacing the missing sensory stimulus through the use of some superior processing devices, that deal with visual stimulus analysis. 
If we assume that mental scenarios construction is involved in the narrative elaboration, a blind person would be deprived of a fundamental part for reconstructing stories. This would complicate the task of storytelling analysis. Starting from this idea, in the third chapter, the existing literature on this matter is briefly reviewed, with a focus on the pragmatic skills of subjects with visual impairment. In some cases they are compared with children with Autism Spectrum Disorder. Additionally, it is shown how important the parental figure is, as it must be able to help children avoid serious long-term consequences, including social ones. Next, we describe the Multisensory Storytelling Project. Here, the Research Group's objective is to analyse, in congenital blind subjects aged between 8 and 12 years, whether visual deprivation may be detrimental to narrative and retelling comprehension skills, then re-export, and, subsequently, understand whether a multisensory approach during storytelling may be able to enhance their narrative understanding. We will use the Tangible User Interface, a tool which is based on the multisensory method. It could increase memory capacity and foster learning, bringing benefits in the field of pedagogy, as well as in cases of disability and in children with typical development. This Tangible User Interface prototype was presented by Federica Somma, Lavinia Lattanzio, Raffaele Di Fuccio and Francesco Ferretti (2020) in order to be a supplement for learning and increasing storytelling comprehension. In our Multisensory Storytelling Project, 40 children between the ages of 8 and 12 years will be analysed. Twenty of them will be affected by congenital blindness, guests at the Centro Regionale Sant'Alessio - "Margherita di Savoia" for the Blind, while the other twenty will be typically developing children, who will act as a control group. This control group will be recruited from the Comprehensive Institutes of the territory. 
In addition, children must have adequate verbal skills and cognitive skills (such as attention, short-term memory) and the absence of organic or functional disorders in the brain and psychiatric comorbidities. Each child will have in front of him some objects, related to the story that he will hear, on each of which will be affixed an RFID/NFC tag to which will be associated digital information, recorded within the software, which will process the results, developing an output response. Cognitive skills will be evaluated through IQ, Theory of Mind, Mental Time Travel and Mental Space Travel and typhoid indicators, based on the behaviours possessed during interaction with objects, through their manipulation and possible displacements and placements. Our expectations could show that, due to the unintended deprivation of the visual experience, a perceptually different imagination could be developed thanks to a greater sensorial use, such as to try to make up for the lack of sight. Through the verification made by the comprehension test and the task of retelling, the idea is to observe the alterations of the cognitive processes in blind's oral storytelling. Moreover, the aim will be to investigate, using the Tangible User Interface, whether it can be a cognitive "prosthesis" useful to make a better understanding of storytelling in blind people, through the implication of perceptive memories other than visual, such as tactile, olfactory and auditory and, therefore, being useful to improve in them the construction of stories. The research is still under development, waiting for the research protocol to be approved by the ethics committee of the Centro Regionale Sant'Alessio.
Article
Full-text available
Sex-based modulation of cognitive processes could set the stage for individual differences in vulnerability to neuropsychiatric disorders. While value-based decision making processes in particular have been proposed to be influenced by sex differences, the overall correct performance in decision making tasks often shows variable or minimal differences across sexes. Computational tools allow us to uncover latent variables that define different decision making approaches, even in animals with similar correct performance. Here, we quantify sex differences in mice in the latent variables underlying behavior in a classic value-based decision making task: a restless 2-armed bandit. While male and female mice had similar accuracy, they achieved this performance via different patterns of exploration. Male mice tended to make more exploratory choices overall, largely because they appeared to get 'stuck' in exploration once they had started. Female mice tended to explore less but learned more quickly during exploration. Together, these results suggest that sex exerts stronger influences on decision making during periods of learning and exploration than during stable choices. Exploration during decision making is altered in people diagnosed with addictions, depression, and neurodevelopmental disabilities, pinpointing the neural mechanisms of exploration as a highly translational avenue for conferring sex-modulated vulnerability to neuropsychiatric diagnoses.
Article
To make optimal decisions in uncertain circumstances, flexible adaptation of behaviour is required: exploring alternatives when the best choice is unknown, exploiting what is known when that is best. Using a computational model of the basal ganglia, we propose that switches between exploratory and exploitative decisions are mediated by the interaction between tonic dopamine and cortical input to the basal ganglia. We show that a biologically detailed action selection circuit model, endowed with dopamine-dependent striatal plasticity, can optimally solve the explore-exploit problem, estimating the true underlying state of a noisy Gaussian diffusion process. Critical to the model’s performance was a fluctuating level of tonic dopamine which increased under conditions of uncertainty. With an optimal range of tonic dopamine, explore-exploit decisions were mediated by the effects of tonic dopamine on the precision of the model action selection mechanism. Under conditions of uncertain reward pay-out, the model’s reduced selectivity allowed disinhibition of multiple alternative actions to be explored at random. Conversely, when uncertainty about reward pay-out was low, enhanced selectivity of the action selection circuit facilitated exploitation of the high value choice. Model performance was at the level of a Kalman filter which provides an optimal solution for the task. These simulations support the idea that this subcortical neural circuit may have evolved to facilitate decision making in non-stationary reward environments. The model generates several experimental predictions with relevance to abnormal decision making in neuropsychiatric and neurological disease.
Article
People with human immunodeficiency virus (HIV) often have neurocognitive impairment. People with HIV make riskier decisions when the outcome probabilities are known, and have abnormal neural architecture underlying risky decision making. However, ambiguous decision making, when the outcome probabilities are unknown, is more common in daily life, but the neural architecture underlying ambiguous decision making in people with HIV is unknown. Eighteen people with HIV and 20 controls completed a decision making task while undergoing functional magnetic resonance imaging scanning. Participants chose between a certain reward and uncertain reward with a known (risky) or unknown (ambiguous) probability of winning. There were three levels of risk: high, medium, and low. Ambiguous > risky brain activity was compared between groups. Ambiguous > risky brain activity was correlated with emotional/psychiatric functioning in people with HIV. Both groups were similarly ambiguity-averse. People with HIV were more risk-averse than controls and chose the high-risk uncertain option less often. People with HIV had hypoactivity in the precuneus, posterior cingulate cortex (PCC), and fusiform gyrus during ambiguous > medium risk decision making. Ambiguous > medium risk brain activity was negatively correlated with emotional/psychiatric functioning in individuals with HIV. To make ambiguous decisions, people with HIV underrecruit key regions of the default mode network, which are thought to integrate internally and externally derived information to come to a decision. These regions and related cognitive processes may be candidates for interventions to improve decision-making outcomes in people with HIV.
Preprint
Full-text available
Decisions as to whether to continue with an ongoing activity or to switch to an alternative are a constant in an animal’s natural world, and in particular underlie foraging behavior and performance in food preference tests. Stimuli experienced by the animal both impact the choice and are themselves impacted by the choice, in a dynamic back and forth. Here, we present model neural circuits, based on spiking neurons, in which the choice to switch away from ongoing behavior instantiates this back and forth, arising as a state transition in neural activity. We analyze two classes of circuit, which differ in whether state transitions result from a loss of hedonic input from the stimulus (an “entice to stay” model) or from aversive stimulus input (a “repel to leave” model). In both classes of model, we find that the mean time spent sampling a stimulus decreases with increasing value of the alternative stimulus, a fact that we linked to the inclusion of depressing synapses in our model. The competitive interaction is much greater in “entice to stay” model networks, which has qualitative features of the marginal value theorem, and thereby provides a framework for optimal foraging behavior. We offer suggestions as to how our models could be discriminatively tested through the analysis of electrophysiological and behavioral data. Author summary: Many decisions are of the ilk of whether to continue sampling a stimulus or to switch to an alternative, a key feature of foraging behavior. We produce two classes of model for such stay-switch decisions, which differ in how decisions to switch stimuli can arise. In an “entice-to-stay” model, a reduction in the necessary positive stimulus input causes switching decisions. In a “repel-to-leave” model, a rise in aversive stimulus input produces a switch decision. 
We find that in tasks where the sampling of one stimulus follows another, adaptive biological processes arising from a highly hedonic stimulus can reduce the time spent at the following stimulus, by up to ten-fold in the “entice-to-stay” models. Along with potentially observable behavioral differences that could distinguish the classes of networks, we also found signatures in neural activity, such as oscillation of neural firing rates and a rapid change in rates preceding the time of choice to leave a stimulus. In summary, our model findings lead to testable predictions and suggest a neural circuit-based framework for explaining foraging choices.
Article
Animals engage in routine behavior in order to efficiently navigate their environments. This routine behavior may be influenced by the state of the environment, such as the location and size of rewards. The neural circuits tracking environmental information and how that information impacts decisions to deviate from routines remain unexplored. To investigate the representation of environmental information during routine foraging, we recorded the activity of single neurons in posterior cingulate cortex (PCC) in two male monkeys searching through an array of targets in which the location of rewards was unknown. Outside the laboratory, people and animals solve such traveling salesman problems by following routine traplines that connect nearest-neighbor locations. In our task, monkeys also deployed traplining routines, but as the environment became better known, they deviated from them despite the reduction in foraging efficiency. While foraging, PCC neurons tracked environmental information but not reward and predicted variability in the pattern of choices. Together, these findings suggest that PCC may mediate the influence of information on variability in choice behavior. Significance statement: Many animals seek information to better guide their decisions and update behavioral routines. In our study, subjects visually searched through a set of targets on every trial to gather two rewards. Greater amounts of information about the distribution of rewards predicted less variability in choice patterns, whereas smaller amounts predicted greater variability. We recorded from the posterior cingulate cortex, an area implicated in the coding of reward and uncertainty, and discovered that these neurons signaled the expected information about the distribution of rewards instead of signaling expected rewards. The activity in these cells also predicted the amount of variability in choice behavior. 
These findings suggest that the posterior cingulate helps direct the search for information in order to augment routines.
Preprint
Sex differences in cognitive processes could set the stage for sex-modulated vulnerability to neuropsychiatric disorders. While value-based decision making processes in particular have been proposed to be influenced by sex differences, the overall correct performance across sexes often shows minimal differences. Computational tools allow us to uncover latent variables in reinforcement learning that define different decision making approaches, even in animals with similar correct performance. Here, we quantify sex differences in latent variables underlying behavior in a classic value-based decision-making task: a restless 2-armed bandit. While males and females had similar accuracy, they achieved this performance via different patterns of exploration. Males made more exploratory choices overall, largely because they appeared to get stuck in exploration once they had started. Females explored less, but learned more quickly when they did so. Together, these results suggest that sex exerts stronger influences on learning and decision making during periods of self-initiated exploration than during stable choices. These findings pinpoint the neural mechanisms of exploration as potentially conferring sex-biased vulnerability to addictions, neurodevelopmental disabilities, and other neuropsychiatric disorders.
Preprint
Full-text available
While many non-human animals show basic exploratory behaviors, it remains unclear whether any animals possess human-like curiosity. We propose that human-like curiosity satisfies three formal criteria: (1) willingness to pay (or to sacrifice reward) to obtain information, (2) that the information provides no instrumental or strategic benefit (and the subject understands this), and (3) the amount the subject is willing to pay scales with the amount of information available. Although previous work, including our own, demonstrates that some animals will sacrifice juice rewards for information, that information normally predicts upcoming rewards and their ostensible curiosity may therefore be a byproduct of reinforcement processes. Here we get around this potential confound by showing that macaques sacrifice juice to obtain information about counterfactual outcomes (outcomes that could have occurred had the subject chosen differently). Moreover, willingness-to-pay scales with the information (Shannon entropy) offered by the counterfactual option. These results demonstrate human-like curiosity in non-human animals according to our strict criteria, which circumvent several confounds associated with less stringent criteria.
Article
Study objectives: Reduced gray matter volume in the posterior cingulate cortex (PCC) has recently been found in patients with NREM parasomnia, providing a neuroanatomical substrate for the arousal state dissociation. It remains unclear whether PCC changes in NREM parasomnias might also play a role for cognitive or affective dysfunction in these patients. Aim of this exploratory study was to investigate neurobehavioral correlates of PCC abnormalities in patients with NREM parasomnia. Methods: The Reinforcement Sensitivity Theory of Personality Questionnaire (RST-PQ) and the Stress Coping Questionnaire (SVF-120) were used to assess personality and stress coping in 15 patients with NREM parasomnia and 15 age and sex-matched healthy controls. Patients' left PCC gray matter volume was quantified with voxel-based morphometry on 3 Tesla T1-weighted magnetic resonance imaging (MRI) data. Results: In the RST-PQ, increased trait reactivity of the behavioral inhibition system and goal-drive persistence contributed most to the discrimination of patients and controls. In the SVF-120, patients showed an increased negative coping trait (i.e. anxious rumination) related to an increase of adjusted left PCC volume. Conclusions: The results suggest subclinical behavioral abnormalities in patients with NREM parasomnias. Such traits might trigger maladaptive emotion regulation processes related to a relative PCC volume increase. The findings encourage further longitudinal studies on this topic, which can provide insights into the causal relations underlying the PCC volume - behavior correlation. Such future studies will have a more direct implication on the clinical management of patients with NREM parasomnias.
Article
Full-text available
Decision-making under conditions of the lack of sufficient information is associated with hypotheses construction, verification and refinement. In a novel environment subjects encounter high uncertainty; thus their behavior needs to be variable and aimed at testing the range of multiple options available; such variability allows acquiring information about the environment and finding the most beneficial options. This type of behavior is referred to as exploration. As soon as the internal model of the environment has been formed, the other strategy known as exploitation becomes preferential; exploitation presupposes using profitable options that have already been discovered by the subject. In a changing or complex (probabilistic) environment, it is important to combine these two strategies: research strategies to detect changes in the environment and utilization strategies to benefit from the familiar options. The exploration-exploitation balance is a hot topic in psychology, neurobiology, and neuroeconomics. In this review, we discuss factors that influence exploration-exploitation balance and its neurophysiological basis, decision-making mechanisms under uncertainty, and switching between them. We address the roles of major brain areas involved in these processes such as locus coeruleus, anterior cingulate cortex, frontopolar cortex, and we describe functions of some important neurotransmitters involved in these processes – dopamine, norepinephrine, and acetylcholine.
Article
Foragers often systematically deviate from rate-maximizing choices in two ways: accuracy and precision. That is, they use suboptimal threshold values and also show variability in their application of those thresholds. We hypothesized that these biases are related and, more specifically, that foragers' widely known accuracy bias, over-staying, could be explained, at least in part, by their imprecision. To test this hypothesis, we analysed choices made by three rhesus macaques in a computerized patch foraging task. Confirming previously observed findings, we found high levels of variability. We then showed, through simulations, that this variability changed optimal thresholds, meaning that a forager aware of its own variability should increase its leaving threshold (i.e. over-stay) to increase performance. All subjects showed thresholds that were biased in the predicted direction. These results indicate that over-staying in patches may reflect, in part, an adaptation to behavioural variability.
Preprint
Foragers often systematically deviate from rate-maximizing choices in two ways: in accuracy and precision. That is, they both use suboptimal threshold values and show variability in their application of those thresholds. We hypothesized that these biases are related and, more specifically, that foragers' widely known accuracy bias, known as over-staying, could be explained, at least in part, by their precision bias. To test this hypothesis, we analyzed choices made by three rhesus macaques in a computerized patch foraging task. Confirming previously observed findings, we find high levels of variability. We then show, through simulations, that this variability changes optimal thresholds, meaning that a forager aware of its own variability should increase its leaving threshold (i.e., over-stay) to increase performance. All subjects showed thresholds that were biased in the predicted direction. These results indicate that over-staying in patches may reflect, in part, an adaptation to behavioral variability.
Preprint
Full-text available
Reported sex differences in decision-making and learning can be inconsistent across studies. One interpretation is that these sex differences are not driven by differences in ability, but by differences in strategy, which interact with task design. Here, we examined the strategies adopted by female and male mice as they learned the value of stimuli that varied across two dimensions. Female mice mastered image-value associations more quickly than male mice and used a fundamentally different strategy to do so. Female mice constrained their decision-space early in learning. Conversely, male strategies changed frequently and were more influenced by the stochastic rewards. Individual strategies were related to sex-gated changes in neuronal activation in early learning. Together, we find that sex drives divergent strategies for learning, revealing substantial unrecognized variability in reward-guided decision-making and learning.
Article
Full-text available
Chronic pain is common in people with Parkinson's disease, and is often considered to be caused by the motor impairments associated with the disease. Altered top‐down processing of pain characterises several chronic pain conditions and occurs when the cortex modifies nociceptive processing in the brain and spinal cord. This contrasts with bottom‐up modulation of pain whereby nociceptive processing is modified on its way up to the brain. Although several studies have demonstrated altered bottom‐up pain processing in Parkinson's, the contribution of enhanced anticipation of pain and atypical top‐down processing of pain has not been fully explored. During the anticipation of noxious stimuli, EEG source localisation showed increased activation in the mid‐cingulate cortex and supplementary motor area in the Parkinson's disease group compared to the healthy control group during Mid [‐1500 ‐1000] and Late anticipation [‐500 0], indicating enhanced cortical activity before noxious stimulation. The Parkinson's disease group was also more sensitive to the laser and required a lower voltage level to induce pain. This study provides evidence supporting the hypothesis that enhanced top‐down processing of pain may contribute to the development of chronic pain in Parkinson's. Additional research to establish whether the altered anticipatory response is unique to noxious stimuli is required as no control stimulus was used within the current study. With further research to confirm these findings, our results inform a scientific rationale for novel treatment strategies of pain in Parkinson's disease, including mindfulness, cognitive therapies and other approaches targeted at improving top‐down processing of pain.
Article
Full-text available
Training-induced neuronal activity develops in the mammalian limbic system during discriminative avoidance conditioning. This study explores behaviorally relevant changes in muscarinic ACh receptor binding in 52 rabbits that were trained to one of five stages of conditioned response acquisition. Sixteen naive and 10 animals yoked to criterion performance served as control cases. Upon reaching a particular stage of training, the brains were removed and autoradiographically assayed for 3H-oxotremorine-M binding with 50 nM pirenzepine (OXO-M/PZ) or for 3H-pirenzepine binding in nine limbic thalamic nuclei and cingulate cortex. Specific OXO-M/PZ binding increased in the parvocellular division of the anterodorsal nucleus early in training when the animals were first exposed to pairing of the conditional and unconditional stimuli. Elevated binding in this nucleus was maintained throughout subsequent training. In the parvocellular division of the anteroventral nucleus (AVp), OXO-M/PZ binding progressively increased throughout training, reached a peak at the criterion stage of performance, and returned to control values during extinction sessions. Peak OXO-M/PZ binding in AVp was significantly elevated over that for cases yoked to criterion performance. In the magnocellular division of the anteroventral nucleus (AVm), OXO-M/PZ binding was elevated only during criterion performance of the task, and it was unaltered in any other limbic thalamic nuclei. Specific OXO-M/PZ binding was also elevated in most layers in rostral area 29c when subjects first performed a significant behavioral discrimination. Training-induced alterations in OXO-M/PZ binding in AVp and layer Ia of area 29c were similar and highly correlated.(ABSTRACT TRUNCATED AT 250 WORDS)
Article
Full-text available
Most natural actions are chosen voluntarily from many possible choices. An action is often chosen based on the reward that it is expected to produce. What kind of cellular activity in which area of the cerebral cortex is involved in selecting an action according to the expected reward value? Results of an analysis in monkeys of cellular activity during the performance of reward-based motor selection and the effects of chemical inactivation are presented. We suggest that cells in the rostral cingulate motor area, one of the higher order motor areas in the cortex, play a part in processing the reward information for motor selection.
Article
Full-text available
The Eyelink Toolbox software supports the measurement of eye movements. The toolbox provides an interface between a high-level interpreted language (MATLAB), a visual display programming toolbox (Psychophysics Toolbox), and a video-based eyetracker (Eyelink). The Eyelink Toolbox enables experimenters to measure eye movements while simultaneously executing the stimulus presentation routines provided by the Psychophysics Toolbox. Example programs are included with the toolbox distribution. Information on the Eyelink Toolbox can be found at http://psychtoolbox.org/.
Article
Full-text available
Previous neurophysiological studies have reported that neurons in posterior cingulate cortex (PCC) respond after eye movements, and that these responses may vary with ambient illumination. In monkeys, PCC neurons also respond after the illumination of large visual patterns but not after the illumination of small visual targets on either reflexive saccade tasks or peripheral attention tasks. These observations suggest that neuronal activity in PCC is modulated by behavioral context, which varies with the timing and spatial distribution of visual and oculomotor events. To test this hypothesis, we measured the spatial and temporal response properties of single PCC neurons in monkeys performing saccades in which target location and movement timing varied unpredictably. Specifically, an unsignaled delay between target onset and movement onset permitted us to temporally dissociate changes in PCC activity associated with either event. Response fields constructed from these data demonstrated that many PCC neurons were activated after the illumination of small contralateral visual targets, as well as after the onset of contraversive saccades guided by those targets. In addition, the PCC population maintained selectivity for small contralateral targets during delays of up to 600 ms. Overall, PCC activation was highly variable trial to trial and selective for a broad range of directions and amplitudes. Planar functions described response fields nearly as well as broadly tuned 2-dimensional Gaussian functions. Additionally, the overall responsiveness of PCC neurons decreased during delays when both a fixation stimulus and a saccade target were visible, suggesting a modulation by divided attention. Finally, the strength of the neuronal response after target onset was correlated with saccade accuracy on delayed-saccade trials. Thus PCC neurons may signal salient visual and oculomotor events, consistent with a role in visual orienting and attention.
Article
Full-text available
People and animals often demonstrate strong attraction or aversion to options with uncertain or risky rewards, yet the neural substrate of subjective risk preferences has rarely been investigated. Here we show that monkeys systematically preferred the risky target in a visual gambling task in which they chose between two targets offering the same mean reward but differing in reward uncertainty. Neuronal activity in posterior cingulate cortex (CGp), a brain area linked to visual orienting and reward processing, increased when monkeys made risky choices and scaled with the degree of risk. CGp activation was better predicted by the subjective salience of a chosen target than by its actual value. These data suggest that CGp signals the subjective preferences that guide visual orienting.
Article
Full-text available
Decision making in an uncertain environment poses a conflict between the opposing demands of gathering and exploiting information. In a classic illustration of this 'exploration-exploitation' dilemma, a gambler choosing between multiple slot machines balances the desire to select what seems, on the basis of accumulated experience, the richest option, against the desire to choose a less familiar option that might turn out more advantageous (and thereby provide information for improving future decisions). Far from representing idle curiosity, such exploration is often critical for organisms to discover how best to harvest resources such as food and water. In appetitive choice, substantial experimental evidence, underpinned by computational reinforcement learning (RL) theory, indicates that a dopaminergic, striatal and medial prefrontal network mediates learning to exploit. In contrast, although exploration has been well studied from both theoretical and ethological perspectives, its neural substrates are much less clear. Here we show, in a gambling task, that human subjects' choices can be characterized by a computationally well-regarded strategy for addressing the explore/exploit dilemma. Furthermore, using this characterization to classify decisions as exploratory or exploitative, we employ functional magnetic resonance imaging to show that the frontopolar cortex and intraparietal sulcus are preferentially active during exploratory decisions. In contrast, regions of striatum and ventromedial prefrontal cortex exhibit activity characteristic of an involvement in value-based exploitative decision making. The results suggest a model of action selection under uncertainty that involves switching between exploratory and exploitative behavioural modes, and provide a computationally precise characterization of the contribution of key decision-related brain systems to each of these functions.
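The explore/exploit classification described in this abstract can be illustrated with a minimal sketch. The abstract does not name the choice rule, so the softmax rule, the delta-rule value update, and all parameter values (`beta`, `alpha`, the slot-machine means) below are assumptions chosen for illustration only, not the authors' fitted model. A choice is labeled exploratory when it is not a currently highest-valued option.

```python
import math
import random


def softmax_probs(values, beta):
    """Choice probabilities under a softmax rule (assumed here, not from the source).

    beta is the inverse temperature: larger beta means greedier choices.
    """
    m = max(values)  # subtract the max for numerical stability
    weights = [math.exp(beta * (v - m)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def is_exploratory(values, choice):
    """Label a choice exploratory if a strictly higher-valued option existed."""
    return values[choice] < max(values)


def run_bandit(means, n_trials=200, beta=3.0, alpha=0.2, seed=0):
    """Simulate a multi-armed bandit (hypothetical parameters throughout).

    Returns the final value estimates and one exploratory/exploitative
    label per trial.
    """
    rng = random.Random(seed)
    values = [0.0] * len(means)
    labels = []
    for _ in range(n_trials):
        probs = softmax_probs(values, beta)
        choice = rng.choices(range(len(values)), weights=probs, k=1)[0]
        labels.append(is_exploratory(values, choice))
        reward = rng.gauss(means[choice], 1.0)  # noisy payoff
        values[choice] += alpha * (reward - values[choice])  # delta-rule update
    return values, labels


if __name__ == "__main__":
    values, labels = run_bandit([1.0, 2.0, 3.0, 4.0])
    print("fraction of exploratory choices:", sum(labels) / len(labels))
```

Lowering `beta` in this sketch makes choices more random and so raises the fraction of trials labeled exploratory; raising it makes the simulated agent exploit its current best estimate more often.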
Article
Full-text available
The desire to seek new and unfamiliar experiences is a fundamental behavioral tendency in humans and other species. In economic decision making, novelty seeking is often rational, insofar as uncertain options may prove valuable and advantageous in the long run. Here, we show that, even when the degree of perceptual familiarity of an option is unrelated to choice outcome, novelty nevertheless drives choice behavior. Using functional magnetic resonance imaging (fMRI), we show that this behavior is specifically associated with striatal activity, in a manner consistent with computational accounts of decision making under uncertainty. Furthermore, this activity predicts interindividual differences in susceptibility to novelty. These data indicate that the brain uses perceptual novelty to approximate choice uncertainty in decision making, which in certain contexts gives rise to a newly identified and quantifiable source of human irrationality.
Article
We consider a population of n projects which in general continue to evolve whether in operation or not (although by different rules). It is desired to choose the projects in operation at each instant of time so as to maximise the expected rate of reward, under a constraint upon the expected number of projects in operation. The Lagrange multiplier associated with this constraint defines an index which reduces to the Gittins index when projects not being operated are static. If one is constrained to operate m projects exactly then arguments are advanced to support the conjecture that, for m and n large in constant ratio, the policy of operating the m projects of largest current index is nearly optimal. The index is evaluated for some particular projects.
Article
The orbitofrontal cortex (OFC) has been implicated in reinforcement-guided decision making, error monitoring, and the reversal of behavior in response to changing circumstances. The anterior cingulate cortex sulcus (ACC(S)), however, has also been implicated in similar aspects of behavior. Dissociating the unique functions of these areas would improve our understanding of the decision-making process. The effect of selective OFC lesions on how monkeys used the history of reinforcement to guide choices of either particular actions or particular stimuli was studied and compared with the effects of ACC(S) lesions. Both lesions disrupted decision making, but their effects were differentially modulated by the dependence on action- or stimulus-value contingencies. OFC lesions caused a deficit in stimulus but not action selection, whereas ACC(S) lesions had the opposite effect, disrupting action but not stimulus selection. Furthermore, OFC lesions that have previously been found to impair decision making when deterministic stimulus-reward contingencies are switched were found to cause a more general learning impairment in more naturalistic situations in which reward was stochastic. Both OFC and ACC(S) are essential for reinforcement-guided decision making rather than just error monitoring or behavioral reversal. The OFC and ACC(S) are both, however, more concerned with learning and making decisions, but their roles in selecting between stimulus and action values are distinct.
Article
Adaptive decision making requires selecting an action and then monitoring its consequences to improve future decisions. The neuronal mechanisms supporting action evaluation and subsequent behavioral modification, however, remain poorly understood. To investigate the contribution of posterior cingulate cortex (CGp) to these processes, we recorded activity of single neurons in monkeys performing a gambling task in which the reward outcome of each choice strongly influenced subsequent choices. We found that CGp neurons signaled reward outcomes in a nonlinear fashion and that outcome-contingent modulations in firing rate persisted into subsequent trials. Moreover, firing rate on any one trial predicted switching to the alternative option on the next trial. Finally, microstimulation in CGp following risky choices promoted a preference reversal for the safe option on the following trial. Collectively, these results demonstrate that CGp directly contributes to the evaluative processes that support dynamic changes in decision making in volatile environments.
Article
The cingulate gyrus is a major part of the "anatomical limbic system" and, according to classic accounts, is involved in emotion. This view is oversimplified in light of recent clinical and experimental findings that cingulate cortex participates not only in emotion but also in sensory, motor, and cognitive processes. Anterior cingulate cortex, consisting of areas 25 and 24, has been implicated in visceromotor, skeletomotor, and endocrine outflow. These processes include responses to painful stimuli, maternal behavior, vocalization, and attention to action. Since all of these activities have an affective component, it is likely that connections with the amygdala are critical for them. In contrast, posterior cingulate cortex, consisting of areas 29, 30, 23, and 31, contains neurons that monitor eye movements and respond to sensory stimuli. Ablation studies suggest that this region is involved in spatial orientation and memory. It is likely that connections between posterior cingulate and parahippocampal cortices contribute to these processes. We conclude that there is a fundamental dichotomy between the functions of anterior and posterior cingulate cortices. The anterior cortex subserves primarily executive functions related to the emotional control of visceral, skeletal, and endocrine outflow. The posterior cortex subserves evaluative functions such as monitoring sensory events and the organism's own behavior in the service of spatial orientation and memory.
Article
This study extends an ongoing analysis of the neural mediation of discriminative avoidance learning in rabbits. Electrolytic lesions encompassing anterior and posterior cingulate cortex (areas 24 and 29) or ibotenic acid lesions in area 24 only were made prior to avoidance conditioning wherein rabbits learned to step in response to a tone conditional stimulus (CS+) in order to avoid a brief, response-terminated 1.5 mA foot-shock unconditional stimulus (US). The US was presented 5 s after CS+ onset, in the absence of a prior stepping response. The rabbits also learned to ignore a different tone (CS-) not followed by the US. Multi-unit activity of the caudate and medial dorsal (MD) thalamic nuclei, projection targets of the cingulate cortex, was recorded during learning in all rabbits. Activity was also recorded in area 29 in the rabbits with area 24 lesions. Learning in rabbits with combined lesions was severely impaired and it was moderately retarded after lesions in area 24. MD thalamic and caudate training-induced neuronal discharge increments elicited by the CS+ were enhanced in rabbits with lesions, suggesting a suppressive influence of cingulate cortical projections on this activity. Early-, but not late-developing training-induced unit activity in area 29c/d was absent in rabbits with area 24 lesions, indicating that area 24 is a source of early-developing area 29 plasticity. These results are consistent with hypotheses of a theoretical working model, stating that: a) learning depends on the integrity of two functional systems, a mnemonic recency system comprising circuitry involving area 24 and the MD nucleus and a mnemonic primacy system comprising circuitry involving area 29 and the anterior thalamic nuclei; b) corticothalamic information flow in these systems suppresses thalamic CS-elicited activity in trained rabbits; c) corticostriatal information flow is involved in avoidance response initiation.
An absence of rhythmic theta-like neuronal bursts in area 29b in rabbits with area 24 lesions is attributable to passing fiber damage.
Article
Past studies of the neural determinants of discriminative avoidance conditioning in rabbits have fostered a theoretical model that describes the interactive functioning of the cingulate cortex (Brodmann's Areas 24 and 29), the anterior ventral and medial dorsal thalamic nuclei (AVN and MDN) and the hippocampus. Here we test hypotheses of the model concerning the influence of the hippocampus on cortical and thalamic information processing. The rabbits learned to perform locomotory conditioned responses (CRs) in an activity wheel in response to an acoustic (pure tone) positive conditional stimulus (CS+). A shock unconditional stimulus (US) was given 5 s after CS+ onset, but locomotion during the CS+ - US interval prevented the US. The rabbits also learned to ignore a second tone (a negative conditional stimulus, CS-) of different auditory frequency than the CS+, that did not predict the US. Multi-unit activity and intracranial macropotentials were recorded in the cingulate cortex and the AVN during acquisition, overtraining, extinction, reacquisition and reversal training. Data were obtained in intact rabbits and in rabbits with bilateral lesions of the subicular complex, the origin of projections of the hippocampal formation to the cingulate cortex and AVN. In addition, the activity in the AVN was recorded in a separate group of rabbits with posterior cingulate cortical (Area 29) lesions. Subicular and Area 29 lesions were associated with an enhancement of the training-induced CS+ elicited neuronal response in the AVN. The frequency of CRs was enhanced in animals with subicular lesions. CS elicited unit responses in the cingulate cortices were attenuated in rabbits with subicular lesions. Both of the lesions were associated with significantly increased amplitudes of the CS elicited average cortical and thalamic macropotentials. 
These results suggested the following conclusions: subiculocortical afferents provide an enabling influence that is essential for CS elicited excitation in the cingulate cortex; the cingulate cortical excitatory response in intact animals exerts a limiting influence on the activity in the AVN; the enhanced AVN neuronal response in rabbits with lesions is due to the absence of the limiting influence and it contributes to the increased CR frequency in those animals. It is hypothesized that the hippocampus via subiculocortical projections, governs the flow of CR-inducing thalamocortical excitatory volleys. This governance determines the timing of CR output. The results of hippocampal processing of contextual information acting through the subiculocortical projection determines the moment most appropriate for the CR.
Article
Neurons in deep laminae of the rabbit cingulate cortex develop discriminative activity at an early stage of behavioral discrimination learning, whereas neurons in the anteroventral nucleus of thalamus and neurons in the superficial cortical laminae develop such activity in a late stage of behavioral learning. It is hypothesized that early-forming discriminative neuronal activity, relayed to anteroventral neurons via the corticothalamic pathway, contributes to the construction of changes underlying the late-forming neuronal discrimination in the anteroventral nucleus. The resultant late discriminative activity in the anteroventral nucleus is then relayed via the thalamocortical pathway back to the superficial cortical laminae, promoting disengagement of cortex from further task-processing.
Article
The Psychophysics Toolbox is a software package that supports visual psychophysics. Its routines provide an interface between a high-level interpreted language (MATLAB on the Macintosh) and the video display hardware. A set of example programs is included with the Toolbox distribution.
Article
We investigated the cortical afferents of the retrosplenial cortex and the adjacent posterior cingulate cortex (area 23) in the macaque monkey by using the retrograde tracers Fast blue and Diamidino yellow. We quantitatively analyzed the distribution of labeled neurons throughout the cortical mantle. Injections involving the retrosplenial cortex resulted in labeled neurons within the retrosplenial cortex and in areas 23 and 31 (approximately 78% of the total labeled cells). In the remainder of the cortex, the heaviest projections originated in the hippocampal formation, including the entorhinal cortex, subiculum, presubiculum, and parasubiculum. The parahippocampal and perirhinal cortices also contained many labeled neurons, as did the prefrontal cortex, mainly in areas 46, 9, 10, and 11, and the occipital cortex, mainly area V2. Injections in area 23 also resulted in numerous labeled cells in the posterior cingulate and retrosplenial regions (approximately 67% of total labeled cells). As in the retrosplenial cortex, injections of area 23 led to many labeled neurons in the frontal cortex, although most of these cells were in areas 9 and 46. Larger numbers of retrogradely labeled cells were also distributed more widely in the posterior parietal cortex, including areas 7a, 7m, LIP, and DP. There were some labeled cells in the parahippocampal cortex. These connections are consistent with the retrosplenial cortex acting as an interface between the working memory functions in the prefrontal areas and the long-term memory encoding in the medial temporal lobe. The posterior cingulate cortex, in contrast, may be more highly associated with visuospatial functions.
Article
Movement selection depends on the outcome of prior behavior. Posterior cingulate cortex (CGp) is strongly connected with both limbic and oculomotor circuitry, and CGp neurons respond following saccades, suggesting a role in signaling the motivational outcome of gaze shifts. To test this hypothesis, single CGp neurons were studied in monkeys while they shifted gaze to visual targets for liquid rewards that varied in size or were delivered probabilistically. CGp neurons responded following saccades as well as following reward delivery, and these responses were correlated with reward size. CGp neurons also responded following the omission of predicted rewards. The timing of CGp activation and its modulation by reward could provide signals useful for updating representations of expected saccade value.
Article
Our ability to judge the consequences of our actions is central to rational decision making. A large body of evidence implicates primate prefrontal regions in the regulation of this ability. It has proven extremely difficult, however, to separate functional areas in the frontal lobes. Using functional magnetic resonance imaging, we demonstrate complementary and reciprocal roles for the human orbitofrontal (OFC) and dorsal anterior cingulate cortices (ACd) in monitoring the outcome of behavior. Activation levels in these regions were negatively correlated, with activation increasing in the ACd and decreasing in the OFC when the selected response was the result of the participant's own decision. The pattern was reversed when the selected response was guided by the experimenter rather than the participant. These results indicate that the neural mechanisms underlying the way we assess the consequences of choices differ depending on whether we are told what to do or are able to exercise our volition.
Article
Neuronal activity in posterior cingulate cortex (CGp) is modulated by visual stimulation, saccades, and eye position, suggesting a role for this area in visuospatial transformations. The goal of this study was to determine whether neuronal responses in CGp are anchored to the eyes, head, or outside the body (allocentrically). To discriminate retinocentric from nonretinocentric spatial referencing, the activity of single CGp neurons was recorded while monkeys (Macaca mulatta) performed delayed-saccade trials initiated randomly from three different starting positions to a linear array of targets passing through the neuronal response field. For most neurons, tuning curves, segregated by fixation point, aligned more closely when plotted with respect to the display than when plotted with respect to the eye, suggesting a nonretinocentric frame of reference. A second experiment differentiated between spatial referencing in coordinates anchored to the head or body and allocentric spatial referencing. Monkeys shifted gaze from a central fixation point to the array of previously used targets both before and after whole-body rotation with respect to the display. For most neurons, tuning curves, segregated by fixation position, aligned more closely when plotted as a function of target position in the room than when plotted as a function of target position with respect to the monkey. These data indicate that a population of CGp neurons encodes visuospatial events in allocentric coordinates.
Article
Making optimal decisions in the face of uncertain or incomplete information arises as a common problem in everyday behavior, but the neural processes underlying this ability remain poorly understood. A typical case is navigation, in which a subject has to search for a known goal from an unknown location. Navigating under uncertain conditions requires making decisions on the basis of the current belief about location and updating that belief based on incoming information. Here, we use functional magnetic resonance imaging during a maze navigation task to study neural activity relating to the resolution of uncertainty as subjects make sequential decisions to reach a goal. We show that distinct regions of prefrontal cortex are engaged in specific computational functions that are well described by a Bayesian model of decision making. This permits efficient goal-oriented navigation and provides new insights into decision making by humans.
Article
Learning the value of options in an uncertain environment is central to optimal decision making. The anterior cingulate cortex (ACC) has been implicated in using reinforcement information to control behavior. Here we demonstrate that the ACC's critical role in reinforcement-guided behavior is neither in detecting nor in correcting errors, but in guiding voluntary choices based on the history of actions and outcomes. ACC lesions did not impair the performance of monkeys (Macaca mulatta) immediately after errors, but made them unable to sustain rewarded responses in a reinforcement-guided choice task and to integrate risk and payoff in a dynamic foraging task. These data suggest that the ACC is essential for learning the value of actions.
Article
In a multi-agent environment, where the outcomes of one's actions change dynamically because they are related to the behavior of other beings, it becomes difficult to make an optimal decision about how to act. Although game theory provides normative solutions for decision making in groups, how such decision-making strategies are altered by experience is poorly understood. These adaptive processes might resemble reinforcement learning algorithms, which provide a general framework for finding optimal strategies in a dynamic environment. Here we investigated the role of prefrontal cortex (PFC) in dynamic decision making in monkeys. As in reinforcement learning, the animal's choice during a competitive game was biased by its choice and reward history, as well as by the strategies of its opponent. Furthermore, neurons in the dorsolateral prefrontal cortex (DLPFC) encoded the animal's past decisions and payoffs, as well as the conjunction between the two, providing signals necessary to update the estimates of expected reward. Thus, PFC might have a key role in optimizing decision-making strategies.
Article
Rapid optimization of behavior requires decisions about when to explore and when to exploit discovered resources. The mechanisms that lead to fast adaptations and their interaction with action valuation are a central issue. We show here that the anterior cingulate cortex (ACC) encodes multiple feedback signals devoted to exploration and its immediate termination. In a task that alternates exploration and exploitation periods, the ACC monitored negative and positive outcomes relevant for different adaptations. In particular, it produced signals specific to the first reward, i.e., the end of exploration. Those signals disappeared in exploitation periods but immediately transferred to the initiation of trials, a transfer comparable to learning phenomena observed for dopaminergic neurons. Importantly, these were also observed for high gamma oscillations of local field potentials shown to correlate with brain imaging signal. Thus, mechanisms of action valuation and monitoring of events/actions are combined for rapid behavioral regulation.
Neurobiology of Cingulate Cortex and Limbic Thalamus: A Comprehensive Handbook
  • B A Vogt
  • M Gabriel
Vogt, B.A., and Gabriel, M. (1993). Neurobiology of Cingulate Cortex and Limbic Thalamus: A Comprehensive Handbook (Boston: Birkhauser).
Muscarinic receptor binding increases in anterior thalamus and cingulate cortex during discriminative avoidance learning
  • B A Vogt
  • M Gabriel
  • L J Vogt
  • A Poremba
  • E L Jensen
  • Y Kubota
  • E Kang
Vogt, B.A., Gabriel, M., Vogt, L.J., Poremba, A., Jensen, E.L., Kubota, Y., and Kang, E. (1991). Muscarinic receptor binding increases in anterior thalamus and cingulate cortex during discriminative avoidance learning. J. Neurosci. 11, 1508–1514.
Effects of cingulate cortical lesions on avoidance learning and training-induced unit activity in rabbits
  • Gabriel