Wolfram Schultz

University of Cambridge, Cambridge, England, United Kingdom

Publications (159) · 1051.74 total impact

  • Source
    Wolfram Schultz · Regina M Carelli · R Mark Wightman
    ABSTRACT: Although rewards are physical stimuli and objects, their value for survival and reproduction is subjective. The phasic dopamine reward prediction error response, measured neurophysiologically and voltammetrically, signals subjective reward value. The signal incorporates crucial reward aspects such as amount, probability, type, risk, delay and effort. Differences in dopamine release dynamics with temporal delay and effort in rodents may derive from methodological issues and require further study. Recent designs using concepts and behavioral tools from experimental economics make it possible to formally characterize the subjective value signal as economic utility and thus to establish a neuronal value function. With these properties, the dopamine response constitutes a utility prediction error signal.
    Preview · Article · Oct 2015
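    For readers unfamiliar with the prediction error formalism referenced in this abstract, the sketch below shows a minimal error-driven value update computed on a utility scale rather than on physical reward amount; the power-law utility function, learning rate and reward size are illustrative assumptions, not parameters from the article.
    ```python
    def utility(amount, rho=0.7):
        """Illustrative concave utility over reward amount (assumed power form)."""
        return amount ** rho

    def update_value(predicted_utility, reward_amount, alpha=0.1):
        """One error-driven learning step on a utility scale.

        The utility prediction error is U(reward) - predicted utility.
        """
        delta = utility(reward_amount) - predicted_utility
        return predicted_utility + alpha * delta, delta

    # Example: the value estimate converges toward the utility of a 0.4 ml reward.
    v = 0.0
    for trial in range(100):
        v, delta = update_value(v, reward_amount=0.4)
    print(round(v, 3), round(utility(0.4), 3))
    ```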
  • Source
    ABSTRACT: Primates are social animals, and their survival depends on social interactions with others. Especially important for social interactions and welfare is the observation of rewards obtained by other individuals and the comparison with one's own reward. The fundamental social decision variable for the comparison process is reward inequity, defined by an asymmetric reward distribution among individuals. An important brain structure for coding reward inequity may be the striatum, a component of the basal ganglia involved in goal-directed behavior. Two rhesus monkeys were seated opposite each other and contacted a touch-sensitive table placed between them to obtain specific magnitudes of reward that were equally or unequally distributed among them. Response times in one of the animals demonstrated differential behavioral sensitivity to reward inequity. A group of neurons in the striatum showed distinct signals reflecting disadvantageous and advantageous reward inequity. These neuronal signals occurred irrespective of, or in conjunction with, own-reward coding. These data demonstrate that striatal neurons of macaque monkeys sense the differences between others' and their own reward. The neuronal activities are likely to contribute crucial reward information to neuronal mechanisms involved in social interactions.
    Full-text · Article · Sep 2015 · Journal of Neurophysiology
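    The abstract does not specify a formal model, but disadvantageous and advantageous reward inequity are commonly formalized with a Fehr-Schmidt-style utility; the sketch below is such an illustration, with arbitrary weighting parameters.
    ```python
    def inequity_utility(own, other, alpha=0.5, beta=0.25):
        """Fehr-Schmidt-style utility: own reward discounted by inequity terms.

        alpha scales disadvantageous inequity (other > own); beta scales
        advantageous inequity (own > other). Parameter values are arbitrary.
        """
        disadvantageous = max(other - own, 0.0)
        advantageous = max(own - other, 0.0)
        return own - alpha * disadvantageous - beta * advantageous

    # Equal, disadvantageous and advantageous reward distributions (ml of juice).
    for own, other in [(0.4, 0.4), (0.2, 0.6), (0.6, 0.2)]:
        print(own, other, round(inequity_utility(own, other), 3))
    ```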
  • Source
    ABSTRACT: Rewards are defined by their behavioral functions in learning (positive reinforcement), approach behavior, economic choices and emotions. Dopamine neurons respond to rewards with two components, similar to higher order sensory and cognitive neurons. The initial, rapid, unselective dopamine detection component reports all salient environmental events irrespective of their reward association. It is highly sensitive to factors related to reward and thus detects a maximal number of potential rewards. It also senses aversive stimuli but reports their physical impact rather than their aversiveness. The second response component processes reward value accurately and starts early enough to prevent confusion with unrewarded stimuli and objects. It codes reward value as a numeric, quantitative utility prediction error, consistent with formal concepts of economic decision theory. Thus, the dopamine reward signal is fast, highly sensitive and appropriate for driving and updating economic decisions.
    Full-text · Article · Aug 2015 · The Journal of Comparative Neurology
  • Source
    Martin D Vestergaard · Wolfram Schultz
    ABSTRACT: Accurate retrospection is critical in many decision scenarios ranging from investment banking to hedonic psychology. A notoriously difficult case is to integrate previously perceived values over the duration of an experience. Failure in retrospective evaluation leads to suboptimal outcomes when previous experiences are considered for a revisit. A biologically plausible mechanism underlying evaluation of temporally extended outcomes is leaky integration of evidence. The leaky integrator favours positive temporal contrasts, in turn leading to undue emphasis on recency. To investigate choice mechanisms underlying suboptimal outcomes based on retrospective evaluation, we used computational and behavioural techniques to model choice between perceived extended outcomes with different temporal profiles. Second-price auctions served to establish the perceived values of virtual coins offered sequentially to humans in a rapid monetary gambling task. Results show that lesser-valued options involving successive growth were systematically preferred to better options with declining temporal profiles. The disadvantageous inclination towards persistent growth was mitigated in some individuals in whom a longer time constant of the leaky integrator resulted in fewer violations of dominance. These results demonstrate how focusing on immediate gains is less beneficial than considering longer perspectives.
    Preview · Article · Jul 2015 · Proceedings of the Royal Society B: Biological Sciences
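    As a rough illustration of the leaky-integration account described above, the sketch below integrates two value sequences with different temporal profiles; with a short time constant the rising (but lesser-valued) sequence wins, whereas a longer time constant removes the dominance violation. The sequences, time constants and discretization are illustrative assumptions.
    ```python
    def leaky_integration(values, tau):
        """Leaky integration of a sequence of perceived values.

        Discretized dV/dt = -V/tau + input, one step per item.
        A shorter tau means a stronger leak and hence stronger recency weighting.
        """
        v = 0.0
        for x in values:
            v += -v / tau + x
        return v

    rising = [1, 2, 3, 4]    # lower total value, persistent growth
    falling = [5, 4, 3, 2]   # higher total value, declining profile

    for tau in (1.5, 10.0):
        print(tau, round(leaky_integration(rising, tau), 2),
              round(leaky_integration(falling, tau), 2))
    ```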
  • Wolfram Schultz
    ABSTRACT: Rewards are crucial objects that induce learning, approach behavior, choices, and emotions. Whereas emotions are difficult to investigate in animals, the learning function is mediated by neuronal reward prediction error signals which implement basic constructs of reinforcement learning theory. These signals are found in dopamine neurons, which emit a global reward signal to striatum and frontal cortex, and in specific neurons in striatum, amygdala, and frontal cortex projecting to select neuronal populations. The approach and choice functions involve subjective value, which is objectively assessed by behavioral choices eliciting internal, subjective reward preferences. Utility is the formal mathematical characterization of subjective value and a prime decision variable in economic choice theory. It is coded as utility prediction error by phasic dopamine responses. Utility can incorporate various influences, including risk, delay, effort, and social interaction. Appropriate for formal decision mechanisms, rewards are coded as object value, action value, difference value, and chosen value by specific neurons. Although all reward, reinforcement, and decision variables are theoretical constructs, their neuronal signals constitute measurable physical implementations and as such confirm the validity of these concepts. The neuronal reward signals provide guidance for behavior while constraining the free will to act.
    No preview · Article · Jul 2015 · Physiological Reviews
  • Source
    William R. Stauffer · Armin Lak · Peter Bossaerts · Wolfram Schultz
    ABSTRACT: Economic choices are largely determined by two principal elements, reward value (utility) and probability. Although nonlinear utility functions have been acknowledged for centuries, nonlinear probability weighting (probability distortion) was only recently recognized as a ubiquitous aspect of real-world choice behavior. Even when outcome probabilities are known and acknowledged, human decision makers often overweight low probability outcomes and underweight high probability outcomes. Whereas recent studies measured utility functions and their corresponding neural correlates in monkeys, it is not known whether monkeys distort probability in a manner similar to humans. Therefore, we investigated economic choices in macaque monkeys for evidence of probability distortion. We trained two monkeys to predict reward from probabilistic gambles with constant outcome values (0.5 ml or nothing). The probability of winning was conveyed using explicit visual cues (sector stimuli). Choices between the gambles revealed that the monkeys used the explicit probability information to make meaningful decisions. Using these cues, we measured probability distortion from choices between the gambles and safe rewards. Parametric modeling of the choices revealed classic probability weighting functions with inverted-S shape. Therefore, the animals overweighted low probability rewards and underweighted high probability rewards. Empirical investigation of the behavior verified that the choices were best explained by a combination of nonlinear value and nonlinear probability distortion. Together, these results suggest that probability distortion may reflect evolutionarily preserved neuronal processing.
    Full-text · Article · Feb 2015 · The Journal of Neuroscience : The Official Journal of the Society for Neuroscience
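    The inverted-S probability weighting reported here is often parameterized with the one-parameter Tversky-Kahneman function; the sketch below is such a generic illustration (the gamma value is arbitrary, not an estimate from the monkeys' choices).
    ```python
    import numpy as np

    def tk_weight(p, gamma=0.6):
        """One-parameter Tversky-Kahneman probability weighting function.

        w(p) = p^gamma / (p^gamma + (1 - p)^gamma)^(1 / gamma)
        gamma < 1 yields the inverted-S shape: low probabilities are
        overweighted and high probabilities underweighted.
        """
        p = np.asarray(p, dtype=float)
        return p**gamma / (p**gamma + (1 - p)**gamma) ** (1 / gamma)

    probs = np.array([0.05, 0.25, 0.5, 0.75, 0.95])
    print(np.round(tk_weight(probs), 3))  # w(0.05) > 0.05 and w(0.95) < 0.95
    ```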
  • Source
    István Hernádi · Fabian Grabenhorst · Wolfram Schultz
    ABSTRACT: The best rewards are often distant and can only be achieved by planning and decision-making over several steps. We designed a multi-step choice task in which monkeys followed internal plans to save rewards toward self-defined goals. During this self-controlled behavior, amygdala neurons showed future-oriented activity that reflected the animal's plan to obtain specific rewards several trials ahead. This prospective activity encoded crucial components of the animal's plan, including value and length of the planned choice sequence. It began on initial trials when a plan would be formed, reappeared step by step until reward receipt, and readily updated with a new sequence. It predicted performance, including errors, and typically disappeared during instructed behavior. Such prospective activity could underlie the formation and pursuit of internal plans characteristic of goal-directed behavior. The existence of neuronal planning activity in the amygdala suggests that this structure is important in guiding behavior toward internally generated, distant goals.
    Preview · Article · Jan 2015 · Nature Neuroscience
  • ABSTRACT: Although there is a rich literature on the role of dopamine in value learning, much less is known about its role in using established value estimations to shape decision-making. Here we investigated the effect of dopaminergic modulation on value-based decision-making for food items in fasted healthy human participants. The Becker–DeGroot–Marschak auction, which assesses subjective value, was examined in conjunction with pharmacological fMRI using a dopaminergic agonist and an antagonist. We found that dopamine enhanced the neural response to value in the inferior parietal gyrus/intraparietal sulcus, and that this effect predominated toward the end of the valuation process when an action was needed to record the value. Our results suggest that dopamine is involved in acting upon the decision, providing additional insight into the mechanisms underlying impaired decision-making in healthy individuals and clinical populations with reduced dopamine levels.
    No preview · Article · Dec 2014 · The Journal of Neuroscience : The Official Journal of the Society for Neuroscience
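    The Becker-DeGroot-Marschak auction mentioned in this abstract elicits subjective value because bidding one's true value is the payoff-maximizing strategy; the simulation below is a minimal sketch of that incentive structure, with an assumed price range and true value.
    ```python
    import random

    def bdm_trial(bid, value, price_range=(0.0, 10.0), rng=random):
        """One Becker-DeGroot-Marschak auction trial.

        A random price is drawn; if the bid meets or exceeds it, the bidder pays
        the random price (not the bid) and receives the item worth `value`.
        """
        price = rng.uniform(*price_range)
        return value - price if bid >= price else 0.0

    # Average payoff for under-, truthful and over-bidding, given a true value of 6.
    random.seed(0)
    for bid in (3.0, 6.0, 9.0):
        payoff = sum(bdm_trial(bid, value=6.0) for _ in range(100_000)) / 100_000
        print(bid, round(payoff, 3))  # truthful bidding yields the highest payoff
    ```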
  • Source
    William R Stauffer · Armin Lak · Wolfram Schultz
    ABSTRACT: Background: Optimal choices require an accurate neuronal representation of economic value. In economics, utility functions are mathematical representations of subjective value that can be constructed from choices under risk. Utility usually exhibits a nonlinear relationship to physical reward value that corresponds to risk attitudes and reflects the increasing or decreasing marginal utility obtained with each additional unit of reward. Accordingly, neuronal reward responses coding utility should robustly reflect this nonlinearity. Results: In two monkeys, we measured utility as a function of physical reward value from meaningful choices under risk (that adhered to first- and second-order stochastic dominance). The resulting nonlinear utility functions predicted the certainty equivalents for new gambles, indicating that the functions' shapes were meaningful. The monkeys were risk seeking (convex utility function) for low reward amounts and risk avoiding (concave utility function) for higher amounts. Critically, the dopamine prediction error responses at the time of reward itself reflected the nonlinear utility functions measured at the time of choices. In particular, the reward response magnitude depended on the first derivative of the utility function and thus reflected the marginal utility. Furthermore, dopamine responses recorded outside of the task reflected the marginal utility of unpredicted reward. Accordingly, these responses were sufficient to train reinforcement learning models to predict the behaviorally defined expected utility of gambles. Conclusions: These data suggest a neuronal manifestation of marginal utility in dopamine neurons and indicate a common neuronal basis for fundamental explanatory constructs in animal learning theory (prediction error) and economic decision theory (marginal utility).
    Full-text · Article · Nov 2014 · Current Biology
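    To make the marginal-utility point concrete, the sketch below uses an assumed S-shaped (convex-then-concave) utility over juice volume, computes its first derivative numerically, and evaluates a utility prediction error at reward delivery; the logistic form and its parameters are illustrative, not the functions estimated in the paper.
    ```python
    import math

    def utility(ml, k=12.0, x0=0.5):
        """Illustrative S-shaped utility over juice volume (convex, then concave)."""
        return 1.0 / (1.0 + math.exp(-k * (ml - x0)))

    def marginal_utility(ml, eps=1e-4):
        """Numerical first derivative of the utility function."""
        return (utility(ml + eps) - utility(ml - eps)) / (2 * eps)

    def utility_prediction_error(outcome_ml, outcomes, probs):
        """Prediction error at reward: utility of the outcome minus expected utility."""
        expected_utility = sum(p * utility(x) for x, p in zip(outcomes, probs))
        return utility(outcome_ml) - expected_utility

    # Winning the large outcome of an equiprobable 0.2 / 0.8 ml gamble.
    print(round(utility_prediction_error(0.8, [0.2, 0.8], [0.5, 0.5]), 3))
    # Marginal utility is largest where the utility function is steepest.
    print(round(marginal_utility(0.5), 3), round(marginal_utility(0.9), 3))
    ```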
  • Source
    Martin O'Neill · Wolfram Schultz
    ABSTRACT: Risk is a ubiquitous feature of the environment for all organisms. Very few things in life are achieved with absolute certainty. Therefore, it is essential that organisms process risky information efficiently to promote adaptive behaviour and enhance survival. Here we outline a clear definition of economic risk derived from economic theory and focus on two experiments in which we have shown that subpopulations of single neurons in the orbitofrontal cortex of rhesus macaques code either economic risk per se or an error-related risk signal, namely a risk prediction error. These biological risk signals are essential for processing and updating risky information in the environment to contribute to efficient decision making and adaptive behaviour.
    Preview · Article · Jun 2014 · Journal of Physiology-Paris
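    The economic definition of risk used in this line of work treats risk as the spread of a gamble's outcomes, most simply its variance; the sketch below computes that quantity for two gambles that share the same expected value, with illustrative reward amounts.
    ```python
    def expected_value(outcomes, probs):
        return sum(x * p for x, p in zip(outcomes, probs))

    def economic_risk(outcomes, probs):
        """Economic risk as outcome variance (second central moment of the gamble)."""
        ev = expected_value(outcomes, probs)
        return sum(p * (x - ev) ** 2 for x, p in zip(outcomes, probs))

    # Two gambles with identical expected value (0.4) but different risk.
    print(economic_risk([0.2, 0.6], [0.5, 0.5]))  # narrow spread -> low risk
    print(economic_risk([0.0, 0.8], [0.5, 0.5]))  # wide spread -> high risk
    ```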
  • Source
    Maria A Bermudez · Wolfram Schultz
    ABSTRACT: Sensitivity to time, including the time of reward, guides the behaviour of all organisms. Recent research suggests that all major reward structures of the brain process the time of reward occurrence, including midbrain dopamine neurons, striatum, frontal cortex and amygdala. Neuronal reward responses in dopamine neurons, striatum and frontal cortex show temporal discounting of reward value. The prediction error signal of dopamine neurons includes the predicted time of rewards. Neurons in the striatum, frontal cortex and amygdala show responses to reward delivery and activities anticipating rewards that are sensitive to the predicted time of reward and the instantaneous reward probability. Together these data suggest that internal timing processes have several well characterized effects on neuronal reward processing.
    Preview · Article · Mar 2014 · Philosophical Transactions of The Royal Society B Biological Sciences
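    Temporal discounting of reward value, as described in this review, is often summarized with a hyperbolic discount function; the sketch below applies that standard form with an assumed discount parameter and reward size.
    ```python
    def hyperbolic_discount(amount, delay, k=0.2):
        """Hyperbolically discounted value: V = A / (1 + k * delay).

        k is an assumed discount parameter; larger k means steeper discounting.
        """
        return amount / (1.0 + k * delay)

    # The same 0.5 ml reward loses subjective value with increasing delay (s).
    for delay in (0, 2, 4, 8, 16):
        print(delay, round(hyperbolic_discount(0.5, delay), 3))
    ```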
  • Source
    Armin Lak · William R Stauffer · Wolfram Schultz
    ABSTRACT: Prediction error signals enable us to learn through experience. These experiences include economic choices between different rewards that vary along multiple dimensions. Therefore, an ideal way to reinforce economic choice is to encode a prediction error that reflects the subjective value integrated across these reward dimensions. Previous studies demonstrated that dopamine prediction error responses reflect the value of singular reward attributes that include magnitude, probability, and delay. Obviously, preferences between rewards that vary along one dimension are completely determined by the manipulated variable. However, it is unknown whether dopamine prediction error responses reflect the subjective value integrated from different reward dimensions. Here, we measured the preferences between rewards that varied along multiple dimensions, and as such could not be ranked according to objective metrics. Monkeys chose between rewards that differed in amount, risk, and type. Because their choices were complete and transitive, the monkeys chose "as if" they integrated different rewards and attributes into a common scale of value. The prediction error responses of single dopamine neurons reflected the integrated subjective value inferred from the choices, rather than the singular reward attributes. Specifically, amount, risk, and reward type modulated dopamine responses exactly to the extent that they influenced economic choices, even when rewards were vastly different, such as liquid and food. This prediction error response could provide a direct updating signal for economic values.
    Full-text · Article · Jan 2014 · Proceedings of the National Academy of Sciences
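    One common way to link integrated subjective values to the choices from which they are inferred is a softmax (logistic) choice rule; the sketch below illustrates that mapping with assumed values and a noise temperature, not quantities estimated from the monkeys' behavior.
    ```python
    import math

    def choice_probability(value_a, value_b, temperature=0.2):
        """Softmax choice rule over integrated subjective values.

        P(choose A) = 1 / (1 + exp(-(V_A - V_B) / temperature)).
        In model fitting, the values would be adjusted so that predicted choice
        probabilities reproduce the observed choice frequencies across reward
        amounts, risks and types.
        """
        return 1.0 / (1.0 + math.exp(-(value_a - value_b) / temperature))

    # Once juice and food options are mapped onto a common value scale,
    # the same rule predicts choices across reward types.
    print(round(choice_probability(0.55, 0.40), 3))
    ```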
  • Martin O’Neill · Wolfram Schultz

    No preview · Article · Jan 2014 · Journal of Physiology-Paris
  • Source
    Shunsuke Kobayashi · Wolfram Schultz
    ABSTRACT: Basic tenets of sensory processing emphasize the importance of accurate identification and discrimination of environmental objects [1]. Although this principle holds also for reward, the crucial acquisition of reward for survival would be aided by the capacity to detect objects whose rewarding properties may not be immediately apparent. Animal learning theory conceptualizes how unrewarded stimuli induce behavioral reactions in rewarded contexts due to pseudoconditioning and higher-order context conditioning [2-6]. We hypothesized that the underlying mechanisms may involve context-sensitive reward neurons. We studied short-latency activations of dopamine neurons to unrewarded, physically salient stimuli while systematically changing reward context. Dopamine neurons showed substantial activations to unrewarded stimuli and their conditioned stimuli in highly rewarded contexts. The activations decreased and often disappeared entirely with stepwise separation from rewarded contexts. The influence of reward context suggests that dopamine neurons respond to real and potential reward. The influence of reward context is compatible with the reward nature of phasic dopamine responses. The responses may facilitate rapid, default initiation of behavioral reactions in environments usually containing reward. Agents would encounter more and miss less reward, resulting in survival advantage and enhanced evolutionary fitness.
    Preview · Article · Dec 2013 · Current biology: CB
  • Source
    Raymundo Báez-Mendoza · Wolfram Schultz
    ABSTRACT: Where and how does the brain code reward during social behavior? Almost all elements of the brain's reward circuit are modulated during social behavior. The striatum in particular is activated by rewards in social situations. However, its role in social behavior is still poorly understood. Here, we attempt to review its participation in social behaviors of different species ranging from voles to humans. Human fMRI experiments show that the striatum is reliably active in relation to others' rewards, to reward inequity and also while learning about social agents. Social contact and rearing conditions have long-lasting effects on behavior, striatal anatomy and physiology in rodents and primates. The striatum also plays a critical role in pair-bond formation and maintenance in monogamous voles. We review recent findings from single neuron recordings showing that the striatum contains cells that link own reward to self or others' actions. These signals might be used to solve the agency-credit assignment problem: the question of whose action was responsible for the reward. Activity in the striatum has been hypothesized to integrate actions with rewards. The picture that emerges from this review is that the striatum is a general-purpose subcortical region capable of integrating social information into coding of social action and reward.
    Full-text · Article · Dec 2013 · Frontiers in Neuroscience
  • ABSTRACT: Mental imagery refers to percept-like experiences in the absence of sensory input. Brain imaging studies suggest common, modality-specific neural correlates of imagery and perception. We associated abstract visual stimuli with either visually presented or imagined monetary rewards and scrambled pictures. Brain images for a group of 12 participants were collected using functional magnetic resonance imaging. Statistical analysis showed that human midbrain regions were activated irrespective of whether the monetary rewards were imagined or visually presented. A support vector machine trained on the midbrain activation patterns to the visually presented rewards predicted with 75% accuracy whether the participants imagined the monetary reward or the scrambled picture during imagination trials. Training samples were drawn from visually presented trials and classification accuracy was assessed for imagination trials. These results suggest that machine learning techniques can classify underlying cognitive states from brain imaging data.
    No preview · Conference Paper · Dec 2013
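    The classification scheme described above (train on visually presented trials, test on imagination trials) can be sketched with a linear support vector machine; the code below uses synthetic stand-in data in place of the midbrain voxel patterns, so the numbers of trials, voxels and effect size are assumptions.
    ```python
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)

    # Stand-in "midbrain voxel patterns": 40 visually presented trials for training
    # and 20 imagination trials for testing, 50 voxels each.
    # Labels: 1 = monetary reward, 0 = scrambled picture.
    train_X = rng.normal(size=(40, 50)) + np.repeat([[0.0], [0.5]], 20, axis=0)
    train_y = np.repeat([0, 1], 20)
    test_X = rng.normal(size=(20, 50)) + np.repeat([[0.0], [0.5]], 10, axis=0)
    test_y = np.repeat([0, 1], 10)

    # Train on visually presented trials, test on imagination trials (cross-decoding).
    clf = SVC(kernel="linear").fit(train_X, train_y)
    print("cross-decoding accuracy:", clf.score(test_X, test_y))
    ```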
  • Martin O'Neill · Wolfram Schultz
    ABSTRACT: Risk is a ubiquitous feature of life. It plays an important role in economic decisions by affecting subjective reward value. Informed decisions require accurate risk information for each choice option. However, risk is often not constant but changes dynamically in the environment. Therefore, risk information should be updated to the current risk level. Potential mechanisms involve error-driven updating, whereby differences between current and predicted risk levels (risk prediction errors) are used to obtain currently accurate risk predictions. As a major reward structure, the orbitofrontal cortex is involved in coding key reward parameters such as reward value and risk. In this study, monkeys viewed different visual stimuli indicating specific levels of risk that deviated from the overall risk predicted by a common earlier stimulus. A group of orbitofrontal neurons displayed a risk signal that tracked the discrepancy between current and predicted risk. Such neuronal signals may be involved in the updating of risk information.
    No preview · Article · Oct 2013 · The Journal of Neuroscience : The Official Journal of the Society for Neuroscience
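    An error-driven update of the kind invoked here can be sketched with a simple delta rule applied to risk rather than value; the learning rate and risk levels below are illustrative assumptions, not the model fitted to the neuronal data.
    ```python
    def update_risk_estimate(predicted_risk, current_risk, learning_rate=0.2):
        """Delta-rule updating of a risk prediction.

        The risk prediction error is the current risk level minus the predicted one.
        """
        risk_prediction_error = current_risk - predicted_risk
        return predicted_risk + learning_rate * risk_prediction_error, risk_prediction_error

    # A common early stimulus predicts intermediate risk; later stimuli reveal the
    # actual risk level, generating positive or negative risk prediction errors.
    predicted = 0.5
    for current in (0.8, 0.8, 0.2, 0.5):
        predicted, rpe = update_risk_estimate(predicted, current)
        print(round(predicted, 3), round(rpe, 3))
    ```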
  • Source
    ABSTRACT: Social interactions provide agents with the opportunity to earn higher benefits than when acting alone and contribute to evolutionary stable strategies. A basic requirement for engaging in beneficial social interactions is to recognize the actor whose movement results in reward. Despite the recent interest in the neural basis of social interactions, the neurophysiological mechanisms identifying the actor in social reward situations are unknown. A brain structure well suited for exploring this issue is the striatum, which plays a role in movement, reward, and goal-directed behavior. In humans, the striatum is involved in social processes related to reward inequity, donations to charity, and observational learning. We studied the neurophysiology of social action for reward in rhesus monkeys performing a reward-giving task. The behavioral data showed that the animals distinguished between their own and the conspecific's reward and knew which individual acted. Striatal neurons coded primarily own reward but rarely other's reward. Importantly, the activations occurred preferentially, and in approximately similar fractions, when either the own or the conspecific's action was followed by own reward. Other striatal neurons showed social action coding without reward. Some of the social action coding disappeared when the conspecific's role was simulated by a computer, confirming a social rather than observational relationship. These findings demonstrate a role of striatal neurons in identifying the social actor and own reward in a social setting. These processes may provide basic building blocks underlying the brain's function in social interactions.
    Full-text · Article · Sep 2013 · Proceedings of the National Academy of Sciences