A predictive reinforcement model of dopamine neurons for learning approach behavior.

Motor Control Laboratory, Arizona State University, Tempe 85287-0404, USA.
Journal of Computational Neuroscience (Impact Factor: 2.09). 01/1999; 6(3):191-214. DOI: 10.1023/A:1008862904946
Source: PubMed

ABSTRACT: A neural network model is proposed of how dopamine and prefrontal cortex activity guide short- and long-term information processing within the cortico-striatal circuits during reward-related learning of approach behavior. The model predicts two types of reward-related neuronal responses generated during learning: (1) cell activity signaling errors in the prediction of the expected time of reward delivery and (2) neural activations coding for errors in the prediction of the amount and type of reward or of stimulus expectancies. The former signal is consistent with the responses of dopaminergic neurons, while the latter is consistent with reward-expectancy responses reported in the prefrontal cortex. It is shown that a neural network architecture satisfying the design principles of the adaptive resonance theory of Carpenter and Grossberg (1987) can account for the dopamine responses to novelty, generalization, and discrimination of appetitive and aversive stimuli. These hypotheses are examined via simulations of the model in relation to the delivery of free food outside a task, the timed contingent delivery of appetitive and aversive stimuli, and an asymmetric, instructed delay response task.
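The timing-error signal in (1) belongs to the family of temporal-difference (TD) prediction errors. A minimal tapped-delay-line TD simulation (a generic textbook sketch, not the paper's implementation; the trial length, cue and reward times, learning rate, and discount factor below are illustrative assumptions) reproduces the qualitative dopamine-like transfer of the error from the time of reward delivery to cue onset:

```python
import numpy as np

def simulate_td(n_trials=500, n_steps=20, cue_t=5, reward_t=15,
                alpha=0.3, gamma=0.98):
    """Tapped-delay-line TD model: one reward-prediction weight per time step.

    The prediction is zero before the cue appears, because no stimulus
    trace is present yet; only cue-driven predictions are learned.
    Returns the TD error per time step on the final (trained) trial.
    """
    V = np.zeros(n_steps + 1)                 # prediction carried at each step
    for _ in range(n_trials):
        delta = np.zeros(n_steps + 1)
        for t in range(1, n_steps + 1):
            r = 1.0 if t == reward_t else 0.0
            pred_now = V[t] if t >= cue_t else 0.0
            pred_prev = V[t - 1] if t - 1 >= cue_t else 0.0
            delta[t] = r + gamma * pred_now - pred_prev   # TD error
            if t - 1 >= cue_t:                # adjust only cue-driven weights
                V[t - 1] += alpha * delta[t]
    return delta

delta = simulate_td()
print(int(np.argmax(delta)))   # → 5: the error now peaks at cue onset
```

Early in training the error peaks at reward delivery; after training, the fully predicted reward elicits essentially no error and the error instead appears at cue onset, mirroring the reported dopamine responses.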

  • ABSTRACT: Background: Parkinson's disease (PD) is characterized by degeneration of nigrostriatal dopaminergic cells, resulting in dopamine depletion, which is counteracted through dopamine replacement therapy (DRT). Dopamine has been suggested to affect novelty processing and memory, suggesting that these processes are also implicated in PD and that DRT could affect them. Objective: To investigate word learning and novelty processing in patients with PD, as indexed by the P2 and P3 event-related potential components, and the role of DRT in these processes. Methods: 21 patients with PD and 21 matched healthy controls were included. Patients with PD were tested on and off DRT in two sessions in a counterbalanced design; healthy controls were tested twice without intervention. The electroencephalogram (EEG) was recorded while participants performed a word-learning Von Restorff task. Results: Healthy controls showed the typical Von Restorff effect, with better memory for words presented in novel fonts than for words presented in a standard font. Surprisingly, this effect was reversed in the patients with PD. In line with the behavioral findings, the P3 was larger for novel-font than for standard-font words in healthy controls, but not in patients with PD. For both groups, the P2 and P3 components were larger for recalled than for forgotten words. DRT did not affect these processes. Conclusions: Learning of novel information is compromised in patients with PD, and the P2 and P3 components that predict successful memory encoding are reduced. This was true both on and off DRT, suggesting abnormalities in learning and memory in PD that are not resolved by dopaminergic medication.
    Neuropsychologia 09/2014; 62. DOI:10.1016/j.neuropsychologia.2014.07.016 · 3.45 Impact Factor
  • ABSTRACT: Recent research evidence shows that the dopamine (DA) system in the brain is involved in functions such as reward-related learning, exploration, preparation, and execution in goal-directed behavior. It has been suggested that dopaminergic neurons provide a prediction error akin to the error computed in temporal difference learning (TDL) models of reinforcement learning (RL). Houk et al. (1995) [26] proposed a biochemical model, located in the spine heads of medium spiny neurons in the striatum of the basal ganglia, that generates and uses neural signals to predict reinforcement. The model explains how DA neurons are able to predict reinforcement and how their output might then be used to reinforce the behaviors that lead to primary reinforcement, drawing a parallel between the actor–critic architecture and dopamine activity in the basal ganglia. Houk et al. (1995) [26] also proposed a biochemical model of interactions between protein molecules that supports the learning of earlier predictions of reinforcement in the spine heads of medium spiny neurons in the striatum. However, Houk's cellular model fails to account for the time delay between the dopaminergic and glutamatergic activity required for reward-related learning, and it fails to explain the 'eligibility trace' condition needed in delayed tasks of associative conditioning, in which a memory trace of the antecedent signal must still be available at the time of a succeeding reward. In this article, we review models of RL with an emphasis on cellular, and in particular biochemical, models of RL, and point out future directions.
    Neurocomputing 08/2014; 138:27–40. DOI:10.1016/j.neucom.2013.02.061 · 2.01 Impact Factor
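The 'eligibility trace' condition discussed in the review above can be illustrated with tabular TD(λ): a decaying per-state trace keeps antecedent states eligible for credit when reward arrives only later. This is a generic algorithmic sketch, not one of the reviewed cellular models; the state layout and parameters are assumed for illustration:

```python
import numpy as np

def td_lambda_episode(V, states, rewards, alpha=0.1, gamma=0.95, lam=0.9):
    """One episode of tabular TD(lambda) with accumulating eligibility traces."""
    e = np.zeros_like(V)                     # eligibility trace per state
    for t in range(len(rewards)):
        s, s_next = states[t], states[t + 1]
        delta = rewards[t] + gamma * V[s_next] - V[s]   # TD error
        e[s] += 1.0                          # mark the visited state as eligible
        V += alpha * delta * e               # credit all recently visited states
        e *= gamma * lam                     # traces decay between updates
    return V

# Delayed-reward task: states 0..4 visited in order, reward only at the end.
V = np.zeros(6)                              # state 5 is terminal
for _ in range(100):
    V = td_lambda_episode(V, states=[0, 1, 2, 3, 4, 5],
                          rewards=[0.0, 0.0, 0.0, 0.0, 1.0])
```

Because the trace of each antecedent state is still nonzero when the delayed reward produces a TD error, early states acquire value in a single pass, which is exactly the memory-trace requirement the review notes is missing from Houk's cellular model.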

Full-text (2 Sources) available from Jun 1, 2014.