Wolfram Schultz

University of Cambridge, Cambridge, England, United Kingdom

Publications (143) · 943.16 Total impact

  • Martin O'Neill, Wolfram Schultz
    ABSTRACT: Risk is a ubiquitous feature of the environment for all organisms. Very few things in life are achieved with absolute certainty. Therefore, it is essential that organisms process risky information efficiently to promote adaptive behaviour and enhance survival. Here we outline a clear definition of economic risk derived from economic theory and focus on two experiments in which we have shown subpopulations of single neurons in the orbitofrontal cortex of rhesus macaques that code either economic risk per se or an error-related risk signal, namely a risk prediction error. These biological risk signals are essential for processing and updating risky information in the environment to contribute to efficient decision making and adaptive behaviour.
    Journal of Physiology-Paris 06/2014; · 0.82 Impact Factor
  • Armin Lak, William R Stauffer, Wolfram Schultz
    ABSTRACT: Prediction error signals enable us to learn through experience. These experiences include economic choices between different rewards that vary along multiple dimensions. Therefore, an ideal way to reinforce economic choice is to encode a prediction error that reflects the subjective value integrated across these reward dimensions. Previous studies demonstrated that dopamine prediction error responses reflect the value of singular reward attributes that include magnitude, probability, and delay. Obviously, preferences between rewards that vary along one dimension are completely determined by the manipulated variable. However, it is unknown whether dopamine prediction error responses reflect the subjective value integrated from different reward dimensions. Here, we measured the preferences between rewards that varied along multiple dimensions, and as such could not be ranked according to objective metrics. Monkeys chose between rewards that differed in amount, risk, and type. Because their choices were complete and transitive, the monkeys chose "as if" they integrated different rewards and attributes into a common scale of value. The prediction error responses of single dopamine neurons reflected the integrated subjective value inferred from the choices, rather than the singular reward attributes. Specifically, amount, risk, and reward type modulated dopamine responses exactly to the extent that they influenced economic choices, even when rewards were vastly different, such as liquid and food. This prediction error response could provide a direct updating signal for economic values.
    Proceedings of the National Academy of Sciences 01/2014; · 9.81 Impact Factor
  • Maria A Bermudez, Wolfram Schultz
    ABSTRACT: Sensitivity to time, including the time of reward, guides the behaviour of all organisms. Recent research suggests that all major reward structures of the brain process the time of reward occurrence, including midbrain dopamine neurons, striatum, frontal cortex and amygdala. Neuronal reward responses in dopamine neurons, striatum and frontal cortex show temporal discounting of reward value. The prediction error signal of dopamine neurons includes the predicted time of rewards. Neurons in the striatum, frontal cortex and amygdala show responses to reward delivery and activities anticipating rewards that are sensitive to the predicted time of reward and the instantaneous reward probability. Together these data suggest that internal timing processes have several well characterized effects on neuronal reward processing.
    Philosophical Transactions of The Royal Society B Biological Sciences 01/2014; 369(1637):20120468. · 6.23 Impact Factor
  • Martin O’Neill, Wolfram Schultz
    Journal of Physiology-Paris 01/2014; · 0.82 Impact Factor
  • Shunsuke Kobayashi, Wolfram Schultz
    ABSTRACT: Basic tenets of sensory processing emphasize the importance of accurate identification and discrimination of environmental objects [1]. Although this principle holds also for reward, the crucial acquisition of reward for survival would be aided by the capacity to detect objects whose rewarding properties may not be immediately apparent. Animal learning theory conceptualizes how unrewarded stimuli induce behavioral reactions in rewarded contexts due to pseudoconditioning and higher-order context conditioning [2-6]. We hypothesized that the underlying mechanisms may involve context-sensitive reward neurons. We studied short-latency activations of dopamine neurons to unrewarded, physically salient stimuli while systematically changing reward context. Dopamine neurons showed substantial activations to unrewarded stimuli and their conditioned stimuli in highly rewarded contexts. The activations decreased and often disappeared entirely with stepwise separation from rewarded contexts. The influence of reward context suggests that dopamine neurons respond to real and potential reward. The influence of reward context is compatible with the reward nature of phasic dopamine responses. The responses may facilitate rapid, default initiation of behavioral reactions in environments usually containing reward. Agents would encounter more and miss less reward, resulting in survival advantage and enhanced evolutionary fitness.
    Current biology: CB 12/2013; · 10.99 Impact Factor
  • Martin O'Neill, Wolfram Schultz
    ABSTRACT: Risk is a ubiquitous feature of life. It plays an important role in economic decisions by affecting subjective reward value. Informed decisions require accurate risk information for each choice option. However, risk is often not constant but changes dynamically in the environment. Therefore, risk information should be updated to the current risk level. Potential mechanisms involve error-driven updating, whereby differences between current and predicted risk levels (risk prediction errors) are used to obtain currently accurate risk predictions. As a major reward structure, the orbitofrontal cortex is involved in coding key reward parameters such as reward value and risk. In this study, monkeys viewed different visual stimuli indicating specific levels of risk that deviated from the overall risk predicted by a common earlier stimulus. A group of orbitofrontal neurons displayed a risk signal that tracked the discrepancy between current and predicted risk. Such neuronal signals may be involved in the updating of risk information.
    Journal of Neuroscience 10/2013; 33(40):15810-15814. · 6.91 Impact Factor
  • ABSTRACT: Social interactions provide agents with the opportunity to earn higher benefits than when acting alone and contribute to evolutionary stable strategies. A basic requirement for engaging in beneficial social interactions is to recognize the actor whose movement results in reward. Despite the recent interest in the neural basis of social interactions, the neurophysiological mechanisms identifying the actor in social reward situations are unknown. A brain structure well suited for exploring this issue is the striatum, which plays a role in movement, reward, and goal-directed behavior. In humans, the striatum is involved in social processes related to reward inequity, donations to charity, and observational learning. We studied the neurophysiology of social action for reward in rhesus monkeys performing a reward-giving task. The behavioral data showed that the animals distinguished between their own and the conspecific's reward and knew which individual acted. Striatal neurons coded primarily own reward but rarely other's reward. Importantly, the activations occurred preferentially, and in approximately similar fractions, when either the own or the conspecific's action was followed by own reward. Other striatal neurons showed social action coding without reward. Some of the social action coding disappeared when the conspecific's role was simulated by a computer, confirming a social rather than observational relationship. These findings demonstrate a role of striatal neurons in identifying the social actor and own reward in a social setting. These processes may provide basic building blocks underlying the brain's function in social interactions.
    Proceedings of the National Academy of Sciences 09/2013; · 9.81 Impact Factor
  • ABSTRACT: To make adaptive choices, humans need to estimate the probability of future events. Based on a Bayesian approach, it is assumed that probabilities are inferred by combining a priori, potentially subjective, knowledge with factual observations, but the precise neurobiological mechanism remains unknown. Here, we study whether neural encoding centers on subjective posterior probabilities, and data merely lead to updates of posteriors, or whether objective data are encoded separately alongside subjective knowledge. During fMRI, young adults acquired prior knowledge regarding uncertain events, repeatedly observed evidence in the form of stimuli, and estimated event probabilities. Participants combined prior knowledge with factual evidence using Bayesian principles. Expected reward inferred from prior knowledge was encoded in striatum. BOLD response in specific nodes of the default mode network (angular gyri, posterior cingulate, and medial prefrontal cortex) encoded the actual frequency of stimuli, unaffected by prior knowledge. In this network, activity increased with frequencies and thus reflected the accumulation of evidence. In contrast, Bayesian posterior probabilities, computed from prior knowledge and stimulus frequencies, were encoded in bilateral inferior frontal gyrus. Here activity increased for improbable events and thus signaled the violation of Bayesian predictions. Thus, subjective beliefs and stimulus frequencies were encoded in separate cortical regions. The advantage of such a separation is that objective evidence can be recombined with newly acquired knowledge when a reinterpretation of the evidence is called for. Overall this study reveals the coexistence in the brain of an experience-based system of inference and a knowledge-based system of inference.
    Journal of Neuroscience 06/2013; 33(26):10887-10897. · 6.91 Impact Factor
  • Raymundo Báez-Mendoza, Wolfram Schultz
    ABSTRACT: Where and how does the brain code reward during social behavior? Almost all elements of the brain's reward circuit are modulated during social behavior. The striatum in particular is activated by rewards in social situations. However, its role in social behavior is still poorly understood. Here, we attempt to review its participation in social behaviors of different species ranging from voles to humans. Human fMRI experiments show that the striatum is reliably active in relation to others' rewards, to reward inequity and also while learning about social agents. Social contact and rearing conditions have long-lasting effects on behavior, striatal anatomy and physiology in rodents and primates. The striatum also plays a critical role in pair-bond formation and maintenance in monogamous voles. We review recent findings from single neuron recordings showing that the striatum contains cells that link own reward to self or others' actions. These signals might be used to solve the agency-credit assignment problem: the question of whose action was responsible for the reward. Activity in the striatum has been hypothesized to integrate actions with rewards. The picture that emerges from this review is that the striatum is a general-purpose subcortical region capable of integrating social information into coding of social action and reward.
    Frontiers in Neuroscience 01/2013; 7:233.
  • Wolfram Schultz
    ABSTRACT: Recent work has advanced our knowledge of phasic dopamine reward prediction error signals. The error signal is bidirectional, reflects well the higher order prediction error described by temporal difference learning models, is compatible with model-free and model-based reinforcement learning, reports the subjective rather than physical reward value during temporal discounting and reflects subjective stimulus perception rather than physical stimulus aspects. Dopamine activations are primarily driven by reward, and to some extent risk, whereas punishment and salience have only limited activating effects when appropriate controls are respected. The signal is homogeneous in terms of time course but heterogeneous in many other aspects. It is essential for synaptic plasticity and a range of behavioural learning situations.
    Current opinion in neurobiology 12/2012; · 7.21 Impact Factor
  • ABSTRACT: The amygdala is a key structure of the brain's reward system. Existing theories view its role in decision-making as restricted to an early valuation stage that provides input to decision mechanisms in downstream brain structures. However, the extent to which the amygdala itself codes information about economic choices is unclear. Here, we report that individual neurons in the primate amygdala predict behavioral choices in an economic decision task. We recorded the activity of amygdala neurons while monkeys chose between saving liquid reward with interest and spending the accumulated reward. In addition to known value-related responses, we found that activity in a group of amygdala neurons predicted the monkeys' upcoming save-spend choices with an average accuracy of 78%. This choice-predictive activity occurred early in trials, even before information about specific actions associated with save-spend choices was available. For a substantial number of neurons, choice-differential activity was specific for free, internally generated economic choices and not observed in a control task involving forced imperative choices. A subgroup of choice-predictive neurons did not show relationships to value, movement direction, or visual stimulus features. Choice-predictive activity in some amygdala neurons was preceded by transient periods of value coding, suggesting value-to-choice transitions and resembling decision processes in other brain systems. These findings suggest that the amygdala might play an active role in economic decisions. Current views of amygdala function should be extended to incorporate a role in decision-making beyond valuation.
    Proceedings of the National Academy of Sciences 10/2012; · 9.81 Impact Factor
  • William Stauffer, Armin Lak, Wolfram Schultz
    ABSTRACT: Rewards are instrumental to everyday life; they reinforce approach behavior and provide motivation. Rewarding events are encoded by midbrain dopamine (DA) neurons, and animals will work for direct stimulation of these neurons. The phasic DA responses to rewards and reward predicting events are consistent with a temporal-difference (TD) prediction error (Schultz et al., Science, 1997). In a TD model, prediction errors indicate deviations from an underlying value function. According to economic theory, individuals’ value (or utility) function is a nonlinear transformation of objective payoffs. Here we show that DA neurons encode prediction errors derived from a nonlinear value function that reflected economic utility. Two monkeys were trained to choose between visual cues that predicted gambles involving differently sized juice rewards. The choices of both animals were complete and transitive across the choice set, which spanned a broad range of expected values. The utility curves for this range were estimated using an adaptive choice paradigm that measured certainty equivalents (CE) for the gambles. Both animals displayed a sigmoid-shaped utility function with risk seeking tendencies at lower expected values and risk aversion when the expected value increased. This behavior mirrors the trend often observed in human subjects (Holt and Laury, Amer. Econ. Rev., 2002), and was stable over multiple weeks of testing. The activity of DA neurons was recorded during trials in which cues conveyed possible outcome magnitudes, and were followed by the delivery of a juice reward. The response to juice delivery was used to determine the shape of the neural value function. In gambles with the same risk but different expected values, outcome prediction errors computed from a linear value function should be of equivalent size. However, the observed prediction errors followed a pattern consistent with the first derivative of the behavioral utility function, small when the change in utility was small, and larger when the change in utility increased. This indicated that DA neurons are not signaling objective prediction errors. Then, to determine whether the DA neuron activity encoded the economic utility, phasic responses to cues predicting gambles with the same expected value, but different risks, were analyzed. In this condition, DA neurons displayed a larger response to cues that had a significantly larger CE. This suggests that DA neurons are encoding the expected utility of the CE, rather than the utility of the expected value. Together, current data suggest that DA neurons signal prediction errors derived from an economic utility function.
    Society for Neuroscience; 10/2012
  • Maria A Bermudez, Carl Göbel, Wolfram Schultz
    ABSTRACT: The time of reward and the temporal structure of reward occurrence fundamentally influence behavioral reinforcement and decision processes [1-11]. However, despite knowledge about timing in sensory and motor systems [12-17], we know little about temporal mechanisms of neuronal reward processing. In this experiment, visual stimuli predicted different instantaneous probabilities of reward occurrence that resulted in specific temporal reward structures. Licking behavior demonstrated that the animals had developed expectations for the time of reward that reflected the instantaneous reward probabilities. Neurons in the amygdala, a major component of the brain's reward system [18-29], showed two types of reward signal, both of which were sensitive to the expected time of reward. First, the time courses of anticipatory activity preceding reward delivery followed the specific instantaneous reward probabilities and thus paralleled the temporal reward structures. Second, the magnitudes of responses following reward delivery covaried with the instantaneous reward probabilities, reflecting the influence of temporal reward structures at the moment of reward delivery. In being sensitive to temporal reward structure, the reward signals of amygdala neurons reflected the temporally specific expectations of reward. The data demonstrate an active involvement of amygdala neurons in timing processes that are crucial for reward function.
    Current biology: CB 09/2012; 22(19):1839-44. · 10.99 Impact Factor
  • ABSTRACT: In this short communication, we respond to Westlund's critique of the NC3Rs Working Group report on refinement of the use of food and fluid control as motivational tools for macaques used in behavioural neuroscience research. The suggestions Westlund makes – in particular, the use of conditioned reinforcers and variable ratio schedules – were considered by the Working Group but were not included in the report as specific recommendations. We outline the reasons for this and also address some misunderstandings of our position.
    Journal of neuroscience methods 02/2012; 204(1):206–209. · 2.30 Impact Factor
  • Wolfram Schultz
    Biological psychiatry 02/2012; 71(3):180-1. · 8.93 Impact Factor
  • ABSTRACT: Monetary rewards are uniquely human. Because money is easy to quantify and present visually, it is the reward of choice for most fMRI studies, even though it cannot be handed over to participants inside the scanner. A typical fMRI study requires hundreds of trials and thus small amounts of monetary rewards per trial (e.g. 5p) if all trials are to be treated equally. However, small payoffs can have detrimental effects on performance due to their limited buying power. Hypothetical monetary rewards can overcome the limitations of smaller monetary rewards but it is less well known whether predictors of hypothetical rewards activate reward regions. In two experiments, visual stimuli were associated with hypothetical monetary rewards. In Experiment 1, we used stimuli predicting either visually presented or imagined hypothetical monetary rewards, together with non-rewarding control pictures. Activations to reward predictive stimuli occurred in reward regions, namely the medial orbitofrontal cortex and midbrain. In Experiment 2, we parametrically varied the amount of visually presented hypothetical monetary reward keeping constant the amount of actually received reward. Graded activation in midbrain was observed to stimuli predicting increasing hypothetical rewards. The results demonstrate the efficacy of using hypothetical monetary rewards in fMRI studies.
    NeuroImage 01/2012; 59(2):1692-9. · 6.25 Impact Factor
  • ABSTRACT: Dopamine projections that extend from the ventral tegmental area to the striatum have been implicated in the biological basis for behaviors associated with reward and addiction. Until recently, it has been difficult to evaluate the complex balance of energy utilization and neural activity in the striatum. Many techniques such as electrophysiology, functional magnetic resonance imaging (fMRI), and fast-scan cyclic voltammetry have been employed to monitor these neurochemical and neurophysiological changes. In this brain region, physiological responses to cues and rewards cause local, transient pH changes. Oxygen and pH are coupled in the brain through a complex system of blood flow and metabolism as a result of transient neural activity. Indeed, this balance is at the heart of imaging studies such as fMRI. To this end, we measured pH and O2 changes with fast-scan cyclic voltammetry in the striatum as indices of changes in metabolism and blood flow in vivo in three Macaca mulatta monkeys during reward-based behaviors. Specifically, the animals were presented with Pavlovian conditioned cues that predicted different probabilities of liquid reward. They also received free reward without predictive cues. The primary detected change consisted of pH shifts in the striatal extracellular environment following the reward predicting cues or the free reward. We observed three types of cue responses that consisted of purely basic pH shifts, basic pH shifts followed by acidic pH shifts, and purely acidic pH shifts. These responses increased with reward probability, but were not significantly different from each other. The pH changes were accompanied by increases in extracellular O2. The changes in pH and extracellular O2 are consistent with current theories of metabolism and blood flow. However, they were of sufficient magnitude that they masked dopamine changes in the majority of cases. The findings suggest a role of these chemical responses in neuronal reward processing.
    Frontiers in Behavioral Neuroscience 01/2012; 6:36. · 4.76 Impact Factor
  • ABSTRACT: Rewards can be viewed as probability distributions of reward values. Besides expected (mean) value, a key parameter of such distributions is variance (or standard deviation), which constitutes a measure of risk. Single neurons in orbitofrontal cortex signal risk mostly separately from value. Comparable risk signals in human frontal cortex reflect risk attitudes of individual participants. Subjective outcome value constitutes the primary economic decision variable. The terms risk avoidance and risk taking suggest that risk affects subjective outcome value, a basic tenet of economic decision theories. Correspondingly, risk reduces neuronal value signals in frontal cortex of human risk avoiders and enhances value signals in risk takers. Behavioral contrast effects and reference-dependent valuation demonstrate flexible reward valuation. As a potential correlate, value signals in orbitofrontal neurons adjust reward discrimination to variance (risk). These neurophysiological mechanisms of reward risk on economic decisions inform and validate theories of economic decision making under uncertainty.
    Annals of the New York Academy of Sciences 12/2011; 1239:109-17. · 4.38 Impact Factor
  • ABSTRACT: Decision-making can be broken down into several component processes: assigning values to stimuli under consideration, selecting an option by comparing those values, and initiating motor responses to obtain the reward. Although much is known about the neural encoding of stimulus values and motor commands, little is known about the mechanisms through which stimulus values are compared, and the resulting decision is transmitted to motor systems. We investigated this process using human fMRI in a task where choices were indicated using the left or right hand. We found evidence consistent with the hypothesis that value signals are computed in the ventral medial prefrontal cortex, they are passed to regions of dorsomedial prefrontal cortex and intraparietal sulcus, implementing a comparison process, and the output of the comparator regions modulates activity in motor cortex to implement the choice. These results describe the network through which stimulus values are transformed into actions during a simple choice task.
    Proceedings of the National Academy of Sciences 11/2011; 108(44):18120-5. · 9.81 Impact Factor
  • Wolfram Schultz
    ABSTRACT: How do addictive drugs hijack the brain's reward system? This review speculates how normal, physiological reward processes may be affected by addictive drugs. Addictive drugs affect acute responses and plasticity in dopamine neurons and postsynaptic structures. These effects reduce reward discrimination, increase the effects of reward prediction error signals, and enhance neuronal responses to reward-predicting stimuli, which may contribute to compulsion. Addictive drugs steepen neuronal temporal reward discounting and create temporal myopia that impairs the control of drug taking. Tonically enhanced dopamine levels may disturb working memory mechanisms necessary for assessing background rewards and thus may generate inaccurate neuronal reward predictions. Drug-induced working memory deficits may impair neuronal risk signaling, promote risky behaviors, and facilitate preaddictive drug use. Malfunctioning adaptive reward coding may lead to overvaluation of drug rewards. Many of these malfunctions may result in inadequate neuronal decision mechanisms and lead to choices biased toward drug rewards.
    Neuron 02/2011; 69(4):603-17. · 15.77 Impact Factor

Publication Stats

20k Citations
943.16 Total Impact Points


  • 2003–2014
    • University of Cambridge
      • Department of Physiology, Development and Neuroscience
      Cambridge, England, United Kingdom
  • 2006–2010
    • Tamagawa University
      • Tamagawa University Brain Science Institute
      • Basic Brain Science Research Center
      Tokyo, Tokyo-to, Japan
  • 2005–2008
    • Stanford University
      • Department of Neurobiology
      Stanford, CA, United States
  • 2003–2006
    • Universität Basel
      Bâle, Basel-City, Switzerland
  • 1982–2005
    • Université de Fribourg
      • Institut de la Pathologie
      Freiburg, Fribourg, Switzerland
  • 2001
    • Tokyo Metropolitan Institute
      Edo, Tōkyō, Japan
  • 1997–2001
    • Paul Scherrer Institut
      • Center for Radiopharmaceutical Sciences (CRS)
      Villigen, AG, Switzerland
  • 1999
    • Arizona State University
      Phoenix, Arizona, United States
  • 1992
    • Universidad Nacional Autónoma de México
      • Institute of Cellular Physiology
      Mexico City, The Federal District, Mexico
  • 1978
    • Karolinska Institutet
      Solna, Stockholm, Sweden