Overlapping Prediction Errors in Dorsal Striatum During Instrumental Learning With Juice and Money Reward in the Human Brain

Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA.
Journal of Neurophysiology (Impact Factor: 3.04). 09/2009; 102(6):3384-91. DOI: 10.1152/jn.91195.2008
Source: PubMed

ABSTRACT: Prediction error signals have been reported in human imaging studies in target areas of dopamine neurons, such as ventral and dorsal striatum, during learning with many different types of reinforcers. However, a key question that has yet to be addressed is whether prediction error signals recruit distinct or overlapping regions of striatum and elsewhere during learning with different types of reward. To address this, we scanned 17 healthy subjects with functional magnetic resonance imaging while they chose actions to obtain either a pleasant juice reward (1 ml apple juice) or a monetary gain (5 cents), and applied a computational reinforcement learning model to subjects' behavioral and imaging data. Evidence for an overlapping prediction error signal during learning with juice and money rewards was found in a region of dorsal striatum (caudate nucleus), whereas prediction error signals in a subregion of ventral striatum were significantly stronger during learning with money than with juice reward. These results provide evidence for partially overlapping reward prediction error signals for different types of appetitive reinforcers within the striatum, a finding with important implications for understanding the nature of associative encoding in the striatum as a function of reinforcer type.
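The computational reinforcement learning model mentioned in the abstract is typically of the Rescorla-Wagner/Q-learning family, in which a prediction error (obtained minus expected reward) updates the value of the chosen action, and a softmax rule maps values to choice probabilities. The sketch below illustrates that class of model; it is not the paper's fitted implementation, and the learning rate, inverse temperature, and 80%/20% payoff probabilities are illustrative assumptions.

```python
import math
import random

def softmax_choice(q_values, beta):
    """Choose an action with probability proportional to exp(beta * Q)."""
    exps = [math.exp(beta * q) for q in q_values]
    total = sum(exps)
    r = random.random()
    cumulative = 0.0
    for action, e in enumerate(exps):
        cumulative += e / total
        if r < cumulative:
            return action
    return len(exps) - 1

def rescorla_wagner_update(q, reward, alpha):
    """Move the chosen action's value toward the outcome by the prediction error."""
    prediction_error = reward - q          # delta = r - Q(a)
    return q + alpha * prediction_error, prediction_error

# Hypothetical two-armed task: action 0 is rewarded 80% of the time, action 1 20%.
alpha, beta = 0.2, 3.0                     # learning rate and inverse temperature (illustrative)
q_values = [0.0, 0.0]
random.seed(0)
for trial in range(200):
    a = softmax_choice(q_values, beta)
    reward = 1.0 if random.random() < (0.8 if a == 0 else 0.2) else 0.0
    q_values[a], pe = rescorla_wagner_update(q_values[a], reward, alpha)

print(q_values)  # Q(0) should approach the ~0.8 payoff rate of the better action
```

In fMRI studies of this kind, the trial-by-trial prediction error series (`pe` above) is what gets regressed against the BOLD signal to localize prediction error coding in striatum.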


Available from: Vivian Virag Valentin, Aug 21, 2014
    ABSTRACT: Reinforcement learning describes motivated behavior in terms of two abstract signals. The representation of discrepancies between expected and actual rewards/punishments-prediction error-is thought to update the expected value of actions and predictive stimuli. Electrophysiological and lesion studies have suggested that mesostriatal prediction error signals control behavior through synaptic modification of cortico-striato-thalamic networks. Signals in the ventromedial prefrontal and orbitofrontal cortex are implicated in representing expected value. To obtain unbiased maps of these representations in the human brain, we performed a meta-analysis of functional magnetic resonance imaging studies that had employed algorithmic reinforcement learning models across a variety of experimental paradigms. We found that the ventral striatum (medial and lateral) and midbrain/thalamus represented reward prediction errors, consistent with animal studies. Prediction error signals were also seen in the frontal operculum/insula, particularly for social rewards. In Pavlovian studies, striatal prediction error signals extended into the amygdala, whereas instrumental tasks engaged the caudate. Prediction error maps were sensitive to the model-fitting procedure (fixed or individually estimated) and to the extent of spatial smoothing. A correlate of expected value was found in a posterior region of the ventromedial prefrontal cortex, caudal and medial to the orbitofrontal regions identified in animal studies. These findings highlight a reproducible motif of reinforcement learning in the cortico-striatal loops and identify methodological dimensions that may influence the reproducibility of activation patterns across studies.
    Cognitive Affective & Behavioral Neuroscience 02/2015; DOI:10.3758/s13415-015-0338-7 · 3.21 Impact Factor
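The "fixed or individually estimated" model-fitting dimension flagged in the meta-analysis above refers to whether learning parameters are shared across subjects or fit per subject, usually by maximizing the likelihood of each subject's choice sequence. The following is a minimal sketch of per-subject fitting; the choice data, inverse temperature, and grid of learning rates are made up for illustration.

```python
import math

def neg_log_likelihood(choices, rewards, alpha, beta, n_actions=2):
    """Negative log-likelihood of a choice sequence under a
    Rescorla-Wagner value update with softmax action selection."""
    q = [0.0] * n_actions
    nll = 0.0
    for a, r in zip(choices, rewards):
        exps = [math.exp(beta * v) for v in q]
        p_chosen = exps[a] / sum(exps)
        nll -= math.log(p_chosen)
        q[a] += alpha * (r - q[a])        # prediction-error update
    return nll

# Hypothetical single-subject data: mostly chose action 0 and was rewarded for it.
choices = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
rewards = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]

# "Individually estimated": grid-search the learning rate for this subject.
best = min((neg_log_likelihood(choices, rewards, a, beta=2.0), a)
           for a in [i / 20 for i in range(1, 20)])
print(best)  # (NLL at optimum, best-fitting learning rate)
```

In practice a continuous optimizer (e.g. `scipy.optimize.minimize`) would replace the grid search, and the fitted parameters would then generate the prediction error regressor for that subject's imaging analysis.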
    ABSTRACT: Drugs of abuse elicit dopamine release in the ventral striatum, possibly biasing dopamine-driven reinforcement learning towards drug-related reward at the expense of non-drug-related reward. Indeed, in alcohol-dependent patients, reactivity in dopaminergic target areas is shifted from non-drug-related stimuli towards drug-related stimuli. Such ‘hijacked’ dopamine signals may impair flexible learning from non-drug-related rewards, and thus promote craving for the drug of abuse. Here, we used functional magnetic resonance imaging to measure ventral striatal activation by reward prediction errors (RPEs) during a probabilistic reversal learning task in recently detoxified alcohol-dependent patients and healthy controls (N = 27). All participants also underwent 6-[18F]fluoro-DOPA positron emission tomography to assess ventral striatal dopamine synthesis capacity. Neither ventral striatal activation by RPEs nor striatal dopamine synthesis capacity differed between groups. However, ventral striatal coding of RPEs correlated inversely with craving in patients. Furthermore, we found a negative correlation between ventral striatal coding of RPEs and dopamine synthesis capacity in healthy controls, but not in alcohol-dependent patients. Moderator analyses showed that the magnitude of the association between dopamine synthesis capacity and RPE coding depended on the amount of chronic, habitual alcohol intake. Despite the relatively small sample size, a power analysis supports the reported results. Using a multimodal imaging approach, this study suggests that dopaminergic modulation of neural learning signals is disrupted in alcohol dependence in proportion to long-term alcohol intake of patients. Alcohol intake may perpetuate itself by interfering with dopaminergic modulation of neural learning signals in the ventral striatum, thus increasing craving for habitual drug intake.
    European Journal of Neuroscience 12/2014; 41(4). DOI:10.1111/ejn.12802 · 3.67 Impact Factor
    ABSTRACT: The firing pattern of midbrain dopamine (DA) neurons is well known to reflect reward prediction errors (PEs), the difference between obtained and expected rewards. The PE is thought to be a crucial signal for instrumental learning, and interference with DA transmission impairs learning. Phasic increases of DA neuron firing during positive PEs are driven by activation of NMDA receptors, whereas phasic suppression of firing during negative PEs is likely mediated by inputs from the lateral habenula. We aimed to determine the contribution of DA D2-class and NMDA receptors to appetitively and aversively motivated reinforcement learning. Healthy human volunteers were scanned with functional magnetic resonance imaging while they performed an instrumental learning task under the influence of either the DA D2 receptor antagonist amisulpride (400 mg), the NMDA receptor antagonist memantine (20 mg), or placebo. Participants quickly learned to select ("approach") rewarding and to reject ("avoid") punishing options. Amisulpride impaired both approach and avoidance learning, while memantine mildly attenuated approach learning but had no effect on avoidance learning. These behavioral effects of the antagonists were paralleled by their modulation of striatal PEs. Amisulpride reduced both appetitive and aversive PEs, while memantine diminished appetitive, but not aversive, PEs. These data suggest that striatal D2-class receptors contribute to both approach and avoidance learning by detecting both the phasic DA increases and decreases during appetitive and aversive PEs. NMDA receptors, by contrast, appear to be required only for approach learning, because phasic DA increases during positive PEs are NMDA dependent whereas phasic decreases during negative PEs are not.
    The Journal of Neuroscience : The Official Journal of the Society for Neuroscience 09/2014; 34(39):13151-62. DOI:10.1523/JNEUROSCI.0757-14.2014 · 6.75 Impact Factor
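A common way to model the dissociation reported above, where a drug attenuates learning from positive but not negative prediction errors, is to give the value update separate learning rates for positive and negative PEs. The sketch below illustrates that idea under assumed parameter values; it is not the study's actual fitted model.

```python
def asymmetric_update(q, outcome, alpha_pos, alpha_neg):
    """Value update with separate learning rates for positive and
    negative prediction errors."""
    pe = outcome - q                      # prediction error
    alpha = alpha_pos if pe > 0 else alpha_neg
    return q + alpha * pe, pe

# Learning the value of a consistently rewarded option (outcome = 1) from q = 0:
q = 0.0
for _ in range(20):
    q, pe = asymmetric_update(q, 1.0, alpha_pos=0.3, alpha_neg=0.3)
print(round(q, 3))       # prints 0.999

# With the positive-PE learning rate selectively reduced (as a stand-in for an
# attenuated appetitive PE signal), the same reward sequence is learned far slower:
q_slow = 0.0
for _ in range(20):
    q_slow, pe = asymmetric_update(q_slow, 1.0, alpha_pos=0.05, alpha_neg=0.3)
print(round(q_slow, 3))  # prints 0.642
```

Because only positive PEs occur in this rewarded sequence, lowering `alpha_pos` slows approach learning while leaving avoidance learning (driven by negative PEs) untouched, mirroring the selective behavioral effect described for the NMDA antagonist.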