Fabian Grabenhorst’s research while affiliated with University of Oxford and other places


Publications (70)


Mechanisms of adjustments to different types of uncertainty in the reward environment across mice and monkeys
  • Preprint

October 2022 · 32 Reads

Jae Hyung Woo · Bilal A Bari · [...]

Despite being unpredictable and uncertain, reward environments often exhibit certain regularities, and animals navigating these environments try to detect and utilize such regularities to adapt their behavior. However, successful learning requires that animals also adjust to uncertainty associated with those regularities. Here, we analyzed choice data from two comparable dynamic foraging tasks in mice and monkeys to investigate mechanisms underlying adjustments to different types of uncertainty. In these tasks, animals selected between two choice options that delivered reward probabilistically, while baseline reward probabilities changed after a variable number (block) of trials without any cues to the animals. To measure adjustments in behavior, we applied a set of metrics based on information theory that quantify consistency in behavior, and fit choice data using reinforcement learning models. We found that in both species, learning and choice were affected by uncertainty about reward outcomes (in terms of determining the better option) and by expectation about when the environment may change. However, these effects were mediated through different mechanisms. First, more uncertainty about the better option resulted in slower learning and forgetting in mice, whereas it had no significant effect in monkeys. Second, expectation of block switches accompanied slower learning, faster forgetting, and increased stochasticity in choice in mice, whereas it only reduced learning rates in monkeys. Overall, while demonstrating the usefulness of entropy-based metrics in studying adaptive behavior, our study provides evidence for multiple types of adjustments in learning and choice behavior according to uncertainty in the reward environment.
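One way to picture an information-theoretic measure of behavioral consistency is the conditional entropy of the animal's stay/switch strategy given the previous trial's reward outcome. The sketch below is illustrative only: the abstract does not specify the metrics used, and the function name `conditional_entropy` and the trial encoding are assumptions made for this example.

```python
import math
from collections import Counter

def conditional_entropy(pairs):
    """Return H(Y | X) in bits for a list of (x, y) observations.

    Hypothetical encoding: x is the previous reward outcome (1 or 0),
    y is the strategy on the next trial ("stay" or "switch").
    """
    n = len(pairs)
    joint = Counter(pairs)                 # counts of each (x, y) pair
    marginal = Counter(x for x, _ in pairs)  # counts of each x
    h = 0.0
    for (x, _y), count in joint.items():
        # p(x, y) * log2 p(y | x), summed over observed pairs
        h -= (count / n) * math.log2(count / marginal[x])
    return h

# Perfect win-stay/lose-switch behavior is fully predictable from reward.
consistent = [(1, "stay"), (0, "switch")] * 10
# Strategy independent of reward and equally likely either way.
random_like = [(1, "stay"), (1, "switch"), (0, "stay"), (0, "switch")] * 5
```

Under this encoding, lower values indicate that the strategy is more predictable from the previous outcome, and higher values indicate less consistent behavior.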


Fig. 1. Choice task and reward design for studying nutrient influences on monkeys' choices. (A) Nutrient-choice task. Monkeys chose from two sequentially presented options. Conditioned stimuli predicted different liquid rewards; magnitude bars predicted randomly varying reward amounts. (B) Nutrient-reward design. Liquid rewards differed in sugar and fat concentration. LFLS: low-fat, low-sugar; HFLS: high-fat, low-sugar; LFHS: low-fat, high-sugar; and HFHS: high-fat, high-sugar. Rewards were matched in flavor (peach or blackcurrant) and other ingredients (protein, salt, etc.); HFLS and LFHS were matched in energy content (isocaloric); HFHS had a higher energy content; and LFLS was lowest in energy content. (C) Completed choice trials per testing session in each animal (N: number of sessions). (D) Choice frequencies for each nutrient reward (± SEM), across sessions and animals (N = 55,205 trials). (E) Choice biases for fat and sugar in single sessions. Trial-by-trial choice records of two representative sessions from monkey Ya choosing between a low-nutrient option (yellow) and rewards with added fat (HFLS, green, Top) or sugar (LFHS, blue, Bottom). Upward/downward bars represent choices for high-/low-nutrient rewards; bar height indicates repeated choice counts. Gray curve shows choice frequency for high-nutrient rewards (seven-trial running average). (F) Nutrient-value functions. Choice frequencies for the low-nutrient reference as a function of offered magnitude ratio (LFLS/high-nutrient rewards ± SEM). Indifference points, estimated by inflection points of fitted sigmoid curves, identify relative values of the high-nutrient rewards, measured on the common scale of the low-nutrient reference. (Inset) Relative values of high-fat and high-sugar rewards and their 95% CIs.
Fig. 2. Nutrient-value functions explain monkeys' fat and sugar preferences, predict choices across flavors, and account for individual differences. (A) Nutrient-value functions in three animals across sessions. Choice frequencies for the low-nutrient reference as a function of offered magnitude ratios (error bars: SEM). (Top) Indifference points identify animal-specific subjective values for high-nutrient rewards (relative values and 95% CIs). (B) Out-of-sample validation of subjective values. Relationships between choice probabilities and subjective value differences. Subjective values were derived from animal-specific indifference points in A by transforming offered magnitudes of high-nutrient rewards into value equivalents of the low-nutrient option (10-fold cross validation). (Insets) Adjusted R² values from sigmoid fits. (C) Nutrient values across all choice conditions. Mixed-effects logistic regression of monkeys' choices on reward magnitudes (RM) and fat and sugar contents ("nutrient model"), calculated over all sessions. All three animals showed significant coefficients (± SEM) for fat and sugar on choices. (D) Stable fat and sugar effects across testing sessions, suggested by chronological session-wise regression coefficients from mixed-effects nutrient model. (E) Cross-flavor choice prediction. Confusion matrices show cross-validated pseudo-R² values obtained by fitting the nutrient regression on choices for one flavor (P: peach, B: blackcurrant) to predict choices for the other flavor. (F) Cross-animal choice prediction. For each animal pair, we used one monkey's nutrient-value function to predict the other monkey's choices. We defined a PDI based on the average log-likelihood ratio of mutual cross-animal predictions (shown as numbers in the triangle plot). Longer triangle side lengths represent larger discrepancies between animals' nutrient preferences. (G) Fat-sugar contributions in explaining individual differences. Percentage increases in PDIs after independently including fat or sugar regressors into the basic regression for pair-wise cross-animal predictions.
Fig. 3. An oral texture-sensing mechanism mediates the influence of fat on choices. (A) Rheology and tribology measurements. Rheology examines liquid flow to characterize stimulus viscosity; tribology quantifies lubrication and friction between moving surfaces as the CSF. We measured viscosity by testing the liquids in a rotational rheometer (Left). We measured CSF using a custom-designed tribometer (Right) with biological tissue (fresh pig tongues) mimicking oral surfaces. We loaded 30 mL of testing liquid between pig tongues pulled by a slider controlled by an Instron Testing System from tongue base (posterior) toward tongue tip (anterior) at constant velocity (v = 16 mm/s). (B) Sliding friction measurements. Curves show CSFs for liquids with increasing fat content, measured from a single tongue (three repetitions per stimulus). CSF was normalized to the coefficients measured with water. (C) Viscosity and CSF as a function of fat content in our stimulus set for the two flavors (orange: peach and purple: blackcurrant; linear regressions). Measurements are shown for 13 stimuli: the four factorial stimuli with each of the two flavors, diluted cream with each flavor, diluted fruit juices, and water. (D) Mediation analysis. The aggregated influence of nutrient content on choices (total effects, path c) was decomposed into indirect effects mediated by texture (path a, path b) and direct effects (path c'). Mediation effects were significant if texture parameters replaced regression coefficients of the total effects (c' = c − b). (E) Mediation effects of viscosity and CSF on the influences of fat and sugar on choices. Logistic regressions included fat and sugar regressors ("nutrient") or additional viscosity ("+ Viscosity," brown) or CSF ("+ Friction," orange) regressors. (F) Path diagrams describing correlational relationships between nutrient content, texture parameters, and choices. Because the effect of fat on choices was fully accounted for by CSF (all animals) and viscosity (animal Ya), we included only the direct sugar effect. Significance of path coefficients derived from bootstrap (1,000 iterations).
Fig. 4. Energy maximization does not explain nutrient preferences. (A) Schematic of cumulative choice trajectory between isocaloric high-sugar (LFHS) and high-fat (HFLS) rewards in reward space. (Inset) Proportions of fat and sugar to matched energy content in the two rewards. (B) Cumulative choices between isocaloric high-sugar (LFHS) and high-fat (HFLS) rewards for the three animals (black: mean trajectory of actual choices, gray: single-session trajectories; colors: simulated choice trajectories based on reference strategies maximizing calories, fat, or sugar). All three monkeys' choice trajectories were biased toward high-sugar reward. (C) Schematic of cumulative choices in nutrient space polar coordinates, showing dynamic fat-sugar trade-off (slope: relative fat-sugar intake ratio and radius: trial progression). (D) Choice trajectories transformed from reward space into nutrient space (same sessions as in C; black: actual choices; colors: reference simulated choices). (E) Model comparison based on Akaike Information Criteria (AIC) favored the nutrient model with separate fat and sugar regressors over the energy model in explaining reward choices. (F) Higher sensitivities to sugar compared to fat despite matched energy content suggested by differences in logistic-regression coefficients for isocaloric sugar and fat levels (P < 0.001, Wilcoxon signed-rank test).
Fig. 5. Monkeys' preferences for fat and sugar shift their daily nutrient balance away from dietary reference points. (A) Schematic of a mixture triangle (49) that plots nutrient rewards, monkeys' choices, and ecologically relevant reference points in a common space, defined by percentage proportions of fat, sugar, and protein to total energy. (Protein content was constant by design; therefore, the protein axis was similar across stimuli and unlabeled.) Colored circles: offered rewards; red triangle: nutrient balance resulting from monkey's aggregated choices between LFLS and HFLS options (determined by the relative energy intake from each reward); and black markers: reference points. (B) Nutrient balances from monkeys' aggregate choices occupied distinct and stable regions in nutrient space, depending on offered rewards. For choices between low-nutrient reward and high-fat or high-sugar options, nutrient balances were dominated by preferences for fat and sugar. Ellipses: 95% CIs. Inset shows how nutrient balance shifts away from LFLS reference toward high-nutrient stimuli. (C) Comparison with reference points: nutritionally optimal (low-fat, high-protein, and intermediate-carbohydrate) diet composition for adult macaques based on dietary guidelines (black diamond) and macaque milk (black star). Inset shows reference points projected onto the line connecting offered rewards (corresponding to the closest achievable approximations of reference points). (D) Monkeys deviated from optimal diet when choosing between high-fat, high-sugar rewards and a low-nutrient option. Histograms of nutrient balances in all three animals deviated from the projected reference point toward higher fat or sugar content (resulting in about 20% more fat or sugar intake than the optimal diet composition).
Preferences for nutrients and sensory food qualities identify biological sources of economic values in monkeys
  • Article

June 2021 · 236 Reads · 16 Citations

Proceedings of the National Academy of Sciences

Significance: Preferences for foods high in sugar and fat are near universal and major contributors to obesity. Additionally, human food choices are sophisticated and individualistic: we choose by evaluating a food’s nutrients and sensory features and trading them against quantity and cost. To understand the mechanisms behind human-like food choices, we developed an experimental paradigm in which monkeys chose nutrient rewards offered in varying quantities. Resembling human suboptimal eating, the monkeys’ fat and sugar preferences shifted their nutrient balance away from dietary reference points. Formally defined economic values for specific nutrients and food textures explained the monkeys’ preferences and individual differences. Our findings show how human-like preferences derive from biologically critical food components and open up investigations of underlying neural mechanisms.


Fig. 2. Nutrient-sensitive learning and choice in monkeys. A) Choices and reward outcomes in a single session for monkey Ya (top)
Fig. 3. Nutrient-specific reward and choice histories influence monkeys' choices. A) Choice probabilities for different rewards
Fig. 4. Nutrient-sensitive reinforcement learning models. A) The nutrient-value RL model (NV-RL model). Reward values were
Fig. 5. Dynamics of sugar and fat value components in nutrient-sensitive reinforcement learning. A) Nutrient-specific value
Fig. 6. Neuronal mechanisms for nutrient-sensitive reinforcement learning and choice. A) Nutrient-sensitive reinforcement
Nutrient-sensitive reinforcement learning in monkeys

June 2021 · 81 Reads

Animals make adaptive food choices to acquire nutrients that are essential for survival. In reinforcement learning (RL), animals choose by assigning values to options and update these values with new experiences. This framework has been instrumental for identifying fundamental learning and decision variables, and their neural substrates. However, canonical RL models do not explain how learning depends on biologically critical intrinsic reward components, such as nutrients, and related homeostatic regulation. Here, we investigated this question in monkeys making choices for nutrient-defined food rewards under varying reward probabilities. We found that the nutrient composition of rewards strongly influenced monkeys' choices and learning. The animals preferred rewards high in nutrient content and showed individual preferences for specific nutrients (sugar, fat). These nutrient preferences affected how the animals adapted to changing reward probabilities: the monkeys learned faster from preferred nutrient rewards and chose them frequently even when they were associated with lower reward probability. Although more recently experienced rewards generally had a stronger influence on monkeys' choices, the impact of reward history depended on the rewards' specific nutrient composition. A nutrient-sensitive RL model captured these processes. It updated the value of individual sugar and fat components of expected rewards from experience and integrated them into scalar values that explained the monkeys' choices. Our findings indicate that nutrients constitute important reward components that influence subjective valuation, learning and choice. Incorporating nutrient-value functions into RL models may enhance their biological validity and help reveal unrecognized nutrient-specific learning and decision computations.
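The mechanism summarized in this abstract (separate sugar and fat value components updated from experience, then integrated into a scalar value that drives choice) can be sketched roughly as follows. This is a minimal illustration, not the paper's model: it assumes a delta-rule update, a weighted-sum integration, and a logistic choice rule, and the function names, learning rate `alpha`, nutrient weights, and inverse temperature `beta` are all hypothetical.

```python
import math

def update_components(values, outcome, alpha=0.2):
    """Delta-rule update of each nutrient value component (assumed form).

    values, outcome: dicts keyed by nutrient name, e.g. "sugar" and "fat".
    """
    return {k: values[k] + alpha * (outcome[k] - values[k]) for k in values}

def scalar_value(values, weights):
    """Integrate nutrient components into one scalar value (weighted sum)."""
    return sum(weights[k] * values[k] for k in values)

def choice_prob(value_a, value_b, beta=3.0):
    """Logistic (softmax) probability of choosing option A over option B."""
    return 1.0 / (1.0 + math.exp(-beta * (value_a - value_b)))

# Hypothetical example: one rewarded trial with a high-sugar, no-fat reward.
start = {"sugar": 0.0, "fat": 0.0}
after = update_components(start, {"sugar": 1.0, "fat": 0.0}, alpha=0.5)
value = scalar_value(after, {"sugar": 2.0, "fat": 1.0})
```

In a sketch like this, an animal-specific preference for one nutrient would appear as a larger weight on that component, so the same reward history would yield different scalar values across animals.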


Functions of primate amygdala neurons in economic decisions and social decision simulation

April 2021 · 24 Reads · 20 Citations

Behavioural Brain Research

Long implicated in aversive processing, the amygdala is now recognized as a key component of the brain systems that process rewards. Beyond reward valuation, recent findings from single-neuron recordings in monkeys indicate that primate amygdala neurons also play an important role in decision-making. The reward value signals encoded by amygdala neurons constitute suitable inputs to economic decision processes by being sensitive to reward contingency, relative reward quantity and temporal reward structure. During reward-based decisions, individual amygdala neurons encode both the value inputs and corresponding choice outputs of economic decision processes. The presence of such value-to-choice transitions in single amygdala neurons, together with other well-defined signatures of decision computation, indicates that a decision mechanism may be implemented locally within the primate amygdala. During social observation, specific amygdala neurons spontaneously encode these decision signatures to predict the choices of social partners, suggesting neural simulation of the partner's decision-making. The activity of these 'simulation neurons' could arise naturally from convergence between value neurons and social, self-other discriminating neurons. These findings identify single-neuron building blocks and computational architectures for decision-making and social behavior in the primate amygdala. An emerging understanding of the decision function of primate amygdala neurons can help identify potential vulnerabilities for amygdala dysfunction in human conditions afflicting social cognition and mental health.


Figure 1. Experimental procedure and behavior.
Figure 2. BOLD responses following the revealed preference scheme of two-dimensional indifference curves (ICs).
Figure 2-1. BOLD responses discriminating bundles between indifference curves (ICs) identified with F contrast (map
Figure 3-1. Higher BOLD responses to more preferred (but physically partially dominated) bundles positioned on
Figure 3-2. Bar charts showing neural beta coefficients of regression at peak voxels in ROIs (with ROIs coordinate extracted from GLM1 using leave-one-out procedure) of three brain structures in the population of 24 participants. Each
Single-Dimensional Human Brain Signals for Two-Dimensional Economic Choice Options

February 2021 · 43 Reads · 10 Citations

The Journal of Neuroscience: The Official Journal of the Society for Neuroscience

Rewarding choice options typically contain multiple components, but neural signals in single brain voxels are scalar and primarily vary up or down. In a previous study, we had designed reward bundles that contained the same two milkshakes with independently set amounts; we had used psychophysics and rigorous economic concepts to estimate two-dimensional choice indifference curves (IC) that represented revealed stochastic preferences for these bundles in a systematic, integrated manner. All bundles on the same ICs were equally revealed preferred (and thus had same utility, as inferred from choice indifference); bundles on higher ICs (higher utility) were preferred to bundles on lower ICs (lower utility). In the current study, we used the established behavior for testing with functional magnetic resonance imaging (fMRI). We now demonstrate neural responses in reward-related brain structures of human female and male participants, including striatum, midbrain and medial orbitofrontal cortex that followed the characteristic pattern of ICs: similar responses along ICs (same utility despite different bundle composition), but monotonic change across ICs (different utility). Thus, these brain structures integrated multiple reward components into a scalar signal, well beyond the known subjective value coding of single-component rewards. Significance Statement: Rewards have several components, like the taste and size of an apple, but it is unclear how each component contributes to the overall value of the reward. While choice indifference curves of economic theory provide behavioural approaches to this question, it is unclear whether brain responses capture the preference and utility integrated from multiple components. We report activations in striatum, midbrain and orbitofrontal cortex that follow choice indifference curves representing behavioral preferences over and above variations of individual reward components. In addition, the concept-driven approach encourages future studies on natural, multi-component rewards that are prone to irrational choice of normal and brain-damaged individuals.


Nonhuman Primates Satisfy Utility Maximization in Compliance with the Continuity Axiom of Expected Utility Theory

February 2021 · 56 Reads · 13 Citations

The Journal of Neuroscience: The Official Journal of the Society for Neuroscience

Expected Utility Theory (EUT), the first axiomatic theory of risky choice, describes choices as a utility maximization process: decision makers assign a subjective value (utility) to each choice option and choose the one with the highest utility. The continuity axiom, central to EUT and its modifications, is a necessary and sufficient condition for the definition of numerical utilities. The axiom requires decision makers to be indifferent between a gamble and a specific probabilistic combination of a more preferred and a less preferred gamble. While previous studies demonstrated that monkeys choose according to combinations of objective reward magnitude and probability, a concept-driven experimental approach for assessing the axiomatically defined conditions for maximizing subjective utility by animals is missing. We experimentally tested the continuity axiom for a broad class of gamble types in four male rhesus macaque monkeys, showing that their choice behavior complied with the existence of a numerical utility measure as defined by the economic theory. We used the numerical quantity specified in the continuity axiom to characterize subjective preferences in a magnitude-probability space. This mapping highlighted a trade-off relation between reward magnitudes and probabilities, compatible with the existence of a utility function underlying subjective value computation. These results support the existence of a numerical utility function able to describe choices, allowing for the investigation of the neuronal substrates responsible for coding such rigorously defined quantity. Significance Statement: A common assumption of several economic choice theories is that decisions result from the comparison of subjectively assigned values (utilities). This study demonstrated the compliance of monkey behavior with the continuity axiom of Expected Utility Theory, implying a subjective magnitude-probability trade-off relation which supports the existence of numerical subjective utility directly linked to the theoretical economic framework. We determined a numerical utility measure able to describe choices, which can serve as a correlate for the neuronal activity in the quest for brain structures and mechanisms guiding decisions.
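As a worked illustration of the indifference condition in the continuity axiom: for outcomes ordered A ≻ B ≻ C, expected utility implies a mixing probability α at which u(B) = α·u(A) + (1 − α)·u(C). The sketch below solves for α under an assumed power utility over reward magnitude; the utility form, the exponent `rho`, and the function names are hypothetical and serve only to make the arithmetic concrete.

```python
def power_utility(magnitude, rho=1.0):
    """Assumed utility function u(m) = m ** rho (not taken from the paper)."""
    return magnitude ** rho

def indifference_alpha(a, b, c, rho=1.0):
    """Probability of A in the A/C gamble that matches the utility of B.

    Solves u(b) = alpha * u(a) + (1 - alpha) * u(c) for alpha,
    requiring a > b > c so that alpha lies strictly between 0 and 1.
    """
    if not a > b > c:
        raise ValueError("magnitudes must satisfy a > b > c")
    u_a, u_b, u_c = (power_utility(m, rho) for m in (a, b, c))
    return (u_b - u_c) / (u_a - u_c)

# Linear utility: the middle magnitude sits halfway between the extremes.
alpha_linear = indifference_alpha(4.0, 2.0, 0.0, rho=1.0)
# Concave utility (square root): u(4) = 2, u(1) = 1, u(0) = 0.
alpha_concave = indifference_alpha(4.0, 1.0, 0.0, rho=0.5)
```

With a concave utility, a smaller middle magnitude is enough to match the same gamble, which is one way a magnitude-probability trade-off of the kind described above can arise.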


Figure 2. Empirical indifference curves (IC) representing revealed preferences. (A) Typical convex ICs from an example participant, as seen in 18 of the 24 participants. Component A was a low-sugar high-fat milkshake; component B was a high-sugar low-fat milkshake. Solid lines show hyperbolically fitted ICs, dotted lines show 95% confidence intervals of fits. Dots show bundles that are equally preferred on the same IC (IPs). Inset: psychophysical assessment of indifference point (IP) marked on highest IC (test points in blue, IP estimated by probit regression in red). (B) Typical linear ICs from another example participant, as seen in six of the 24 participants. (C, D) Distributions of slope and curvature, respectively, of hyperbolically fitted ICs from all 24 participants (coefficients β2/β1 and β3 in Equation 3, respectively). N = number of participants. (E) Scheme of intuitive numeric assessment of IC curvature: maximal vertical distance (milliliter of component B on y-axis) between fitted IC (curve) and a straight line connecting the x-axis and y-axis intercepts. A distance of > 0.0 ml indicates convexity, whereas a 0.0 ml distance indicates perfect linearity. (F) Distribution of convex curvature, as measured using the scheme shown in E. The two peaks indicate six participants each with similarly near-linear ICs and similarly convex ICs, respectively. (G) Specificity of bundle choice, as opposed to unrelated parameters. Bar graph shows standardized beta (β) regression coefficients for choice of Variable Bundle over Reference Bundle (Equation 5), as assessed for each individual participant and then averaged across all 24 participants. RefB = component B in Reference Bundle; VarA and VarB = components A and B in Variable Bundle; RT; VarPos = left-right position of Variable Bundle stimulus; PChoice = choice in previous trial. Error bars show standard error of the mean (SEM). p < .02. See the online article for the color version of this figure.
Figure 3. Satiety control. (A) Choice function obtained by sigmoid fit with the probit model. From seven randomly selected fixed amounts of component B in the Variable Bundle, the two colored dots show two amounts associated with choice probabilities closest to the indifference point (IP; p = .5 each option). (B) Lack of satiety across typical test duration. Choice probabilities varied insignificantly above (blue) and below (orange) 4 IPs on each of 3 ICs (total of 12 choices tested above and 12 choices tested below IP at each of six steps, amounting to 144 choices/participant). Data are averaged from all 24 participants. Total duration of the six steps was 20 ± 1.38 min (M ± SEM). See the online article for the color version of this figure.
Figure 4. Characteristics of Becker-DeGroot-Marschak (BDM) bids for bundles at different revealed preference levels. (A) Schematics of positions of bundles used for eliciting BDM bids at psychophysically estimated points of equal revealed preference (indifference points, IPs, connected by dotted lines). Following the schemes of trade-off and revealed preference, BDM bids should be similar for equally valued bundles (along the dotted lines) but higher for bundles farther away from the origin. We tested five bundles per level, three levels, 12 repetitions, total of 180 bids. Inset: BDM task. Each participant bid for the visually presented two-component (A, B) bundle by moving the black dot cursor using the leftward and rightward horizontal arrows on a computer keyboard. Numbers indicate example bids (in U.K. pence). (B) Mean BDM bids from a typical participant. The bids were rank-ordered between increasing revealed preference levels (blue, green, and red; Spearman ρ = 0.83, p < .001) and differed significantly between levels but not within levels (online Supplemental Materials Table S5). Data are shown as M ± SEM, N = 12 bids per bar. (C) Specificity of monetary BDM bids, as opposed to unrelated parameters. Bar graph showing the standardized beta (β) regression coefficients for BDM bids (Equation 7), as assessed for each individual participant and then averaged across all 24 participants. PrevLev = revealed preference level (low, medium, and high); AmBundle = summed currency-adjusted amount of both bundle components; TrialN = trial number; PrevBid = BDM bid in previous trial; Consum = accumulated drinks consumption. Error bars show SEMs. p < .020. See the online article for the color version of this figure.
Experimentally Revealed Stochastic Preferences for Multicomponent Choice Options

July 2020 · 64 Reads · 4 Citations

Journal of Experimental Psychology: Animal Learning and Cognition

Realistic, everyday rewards contain multiple components. An apple has taste and size. However, we choose in single dimensions, simply preferring some apples to others. How can such single-dimensional preference relationships refer to multicomponent choice options? Here, we measured how stochastic choices revealed preferences for 2-component milkshakes. The preferences were intuitively graphed as indifference curves that represented the orderly integration of the 2 components as trade-off: parts of 1 component were given up for obtaining 1 additional unit of the other component without a change in preference. The well-ordered, nonoverlapping curves satisfied leave-one-out tests, followed predictions by machine learning decoders and correlated with single-dimensional Becker-DeGroot-Marschak (BDM) auction-like bids for the 2-component rewards. This accuracy suggests a decision process that integrates multiple reward components into single-dimensional estimates in a systematic fashion. In interspecies comparisons, human performance matched that of highly experienced laboratory monkeys, as measured by accuracy of the critical trade-off between bundle components. These data describe the nature of choices of multicomponent choice options and attest to the validity of the rigorous economic concepts and their convenient graphic schemes for explaining choices of human and nonhuman primates. The results encourage formal behavioral and neural investigations of normal, irrational, and pathological economic choices. (PsycInfo Database Record (c) 2020 APA, all rights reserved).


The Role of the Primate Amygdala in Reward and Decision-Making

May 2020 · 4 Reads · 1 Citation

The sixth edition of the foundational reference on cognitive neuroscience, with entirely new material that covers the latest research, experimental approaches, and measurement methodologies. Each edition of this classic reference has proved to be a benchmark in the developing field of cognitive neuroscience. The sixth edition of The Cognitive Neurosciences continues to chart new directions in the study of the biological underpinnings of complex cognition—the relationship between the structural and physiological mechanisms of the nervous system and the psychological reality of the mind. It offers entirely new material, reflecting recent advances in the field, covering the latest research, experimental approaches, and measurement methodologies. This sixth edition treats such foundational topics as memory, attention, and language, as well as other areas, including computational models of cognition, reward and decision making, social neuroscience, scientific ethics, and methods advances. Over the last twenty-five years, the cognitive neurosciences have seen the development of sophisticated tools and methods, including computational approaches that generate enormous data sets. This volume deploys these exciting new instruments but also emphasizes the value of theory, behavior, observation, and other time-tested scientific habits. Section editors: Sarah-Jayne Blakemore and Ulman Lindenberger, Kalanit Grill-Spector and Maria Chait, Tomás Ryan and Charan Ranganath, Sabine Kastner and Steven Luck, Stanislas Dehaene and Josh McDermott, Rich Ivry and John Krakauer, Daphna Shohamy and Wolfram Schultz, Danielle Bassett and Nikolaus Kriegeskorte, Marina Bedny and Alfonso Caramazza, Liina Pylkkänen and Karen Emmorey, Mauricio Delgado and Elizabeth Phelps, Anjan Chatterjee and Adina Roskies


Scalar human brain responses to vectorial economic choice options: a concept-driven approach

April 2020 · 48 Reads

Rewarding choice options typically contain multiple components, but neural signals in single brain voxels are scalar and primarily vary up or down. In a previous study, we had designed reward bundles that contained the same two milkshakes with independently set amounts; we had used psychophysics and rigorous economic concepts to estimate two-dimensional choice indifference curves (IC) that represented revealed stochastic preferences for these bundles in a systematic, integrated manner. All bundles on the same ICs were equally revealed preferred (and thus had same utility, as inferred from choice indifference); bundles on higher ICs (higher utility) were preferred to bundles on lower ICs (lower utility). In the current study, we used the established behavior for testing with functional magnetic resonance imaging (fMRI). We now demonstrate neural responses in reward-related brain structures of human female and male participants, including striatum, midbrain and medial orbitofrontal cortex that followed the characteristic pattern of ICs: similar responses along ICs (same utility despite different bundle composition), but monotonic change across ICs (different utility). Thus, these brain structures integrated multiple reward components into a scalar signal, well beyond the known subjective value coding of single-component rewards. Significance Statement: Rewards have several components, like the taste and size of an apple, but it is unclear how each component contributes to the overall value of the reward. While choice indifference curves of economic theory provide behavioural approaches to this question, it is unclear whether brain responses capture the preference and utility integrated from multiple components. We report activations in striatum, midbrain and orbitofrontal cortex that follow choice indifference curves representing behavioral preferences over and above variations of individual reward components. In addition, the concept-driven approach encourages future studies on natural, multi-component rewards that are prone to irrational choice of normal and brain-damaged individuals.


Figure 1. Experimental design and consistency of choice behavior. (a) Trial sequence. Monkeys chose between two options by moving a cursor (gray dot) to one side of the screen. After a delay, the reward corresponding to the selected cue was delivered. (b) Visual cues indicated magnitude and probability of possible outcomes through horizontal bars' vertical position and width, respectively. (c,d,e) Continuity axiom test. The continuity axiom was tested through choices between a fixed gamble B and a probabilistic combination of A and C (AC). A, B and C were ordered reward magnitudes (c); AC was a gamble between A and C, with probabilities pA and 1-pA respectively (d); different shades of blue correspond to different pA values (darker for higher pA). The continuity axiom implies the existence of a unique AC combination (pA=α) corresponding to choice indifference between the two options (B~AC, vertical line in e), with the existence of a pA for which B≻AC and of a different pA for which AC≻B (vertical dashed lines). The value of α was identified by fitting a sigmoid function (red line) to the proportion of AC choices (blue dots). (f,g,h) Consistency of choice behavior. The standardized beta coefficients from logistic regressions of single trials' behavior (f) showed that the main choice-driving variables were reward magnitude (mR, mL) and probability (pR, pL) for all animals, both for left (L) and right (R) choices; previous trial's chosen side (preChR) and reward (preRewR) did not consistently explain animals' choices (error bars: 95% CI across sessions; * p<0.05, one-sample t test, FDR corrected; no. of sessions per animal: 100 (A), 81 (B), 24 (C), 15 (D)). In choices between options with different probability of delivering the same reward magnitude, the better option was preferred on average by all animals, demonstrating compliance with FSD (g) (error bars: binomial 95% CI; no. of tests per animal: 28 (A), 24 (B), 15 (C), 23 (D); average no. of trials per test: 12 (A), 13 (B), 11 (C), 34 (D)). In choices between sure rewards (bars: average across all sessions; gray dots: single sessions; error bars: binomial 95% CI) animals preferred A to B, B to C and A to C (h), complying with both weak and strong stochastic transitivity (WST: proportion of choices of the better option >0.5 (blue dashed line); SST: proportion of A over C choices (red line) ≥ other choice proportions).
Figure 2. Experimental test of the continuity axiom. (a,b,c) Compliance with the continuity axiom. The axiom was tested through choices between a gamble B and a varying AC combination (left: visual stimuli for an example choice pair with pA=0.5 (a,b) or pA=0.375 (c)); increasing pA values resulted in gradually increasing preferences for the AC option. In each plot, gray dots represent the proportion of AC choices in single sessions, black circles the proportions across all tested sessions with vertical bars indicating the binomial 95% confidence intervals (filled circles indicate significant difference from 0.5; binomial test, p<0.05). The tests were repeated using different A and B values (b) as well as non-zero C values in a modified task (c). All four animals complied with the continuity axiom by showing increasing preferences for increasing probability of gamble A (rank correlation, p<0.05), with the AC option switching from non-preferred (P(choose AC)<0.5) to preferred (P(choose AC)>0.5) (binomial test, p<0.05). Each IP (α, vertical line) was computed as the pA for which a data-fitted softmax function had a value of 0.5 (horizontal bars: 95% CI); α values shifted coherently with changes in A and B values in all four animals, indicating a continuous magnitude-probability trade-off relation.
Figure 3. Indifference curves in the MP space. (a) Representation of the continuity axiom test in the MP space. The gambles used for testing the axiom can be mapped into the magnitude-probability diagram. Preference in choices between B (circle) and combinations of A and C (graded blue dots) is represented by an arrow pointing in the direction of the preferred option (bottom), consistent with the proportion of choices for the AC option (top). Each continuity axiom test resulted in an IP (vertical line, top), represented as a black dot in the MP space (bottom). (b) Indifference curve. IPs (gray dots: single sessions; black dots: averages; bars: SE) obtained using different A values (step 0.01 ml) shifted continuously, producing an IC in the MP space. Curve: best fitting power function. Data from monkey A (5 sessions, 1781 trials). (c,d) Indifference map. ICs for different B values (colored curves) described the gradual variation of the average IPs (colored dots, with SE bars) for each B. Small dots represent IPs measured in single sessions. Both sure rewards (c) and probabilistic gambles (d) as B options produced coherent indifference maps, with smooth and non-overlapping ICs.
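The Figure 3 caption mentions that each indifference curve was drawn as a best-fitting power function through the indifference points. As a rough illustration only, the sketch below fits a curve of the assumed form p = c * m^k by ordinary least squares in log-log space. The functional form, the function name `fit_power_curve`, and the example numbers are all hypothetical; the caption does not state the exact parameterization or fitting procedure used in the study.

```python
import math


def fit_power_curve(magnitudes, probabilities):
    """Fit p = c * m**k by least squares on log(p) versus log(m).

    Returns (c, k). All inputs must be strictly positive.
    This parameterization is an assumption made for illustration.
    """
    xs = [math.log(m) for m in magnitudes]
    ys = [math.log(p) for p in probabilities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx                       # slope in log-log space = exponent
    c = math.exp(mean_y - k * mean_x)   # intercept in log-log space = log(c)
    return c, k


# Hypothetical indifference points: magnitude of A (ml) and the pA at indifference.
example_magnitudes = [0.30, 0.40, 0.50, 0.60]
example_indifference_pa = [0.62, 0.41, 0.29, 0.22]
c_hat, k_hat = fit_power_curve(example_magnitudes, example_indifference_pa)
print(f"p = {c_hat:.3f} * m^{k_hat:.3f}")
```

Fitting in log-log space keeps the example dependency-free; a direct nonlinear least-squares fit would weight the points differently and could give slightly different parameters.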
Compliance with the continuity axiom of Expected Utility Theory supports utility maximization in monkeys

February 2020

128 Reads

Expected Utility Theory (EUT), the first axiomatic theory of risky choice, describes choices as a utility maximization process: decision makers assign a subjective value (utility) to each choice option and choose the one with the highest utility. The continuity axiom, central to EUT and its modifications, is a necessary and sufficient condition for the definition of numerical utilities. The axiom requires decision makers to be indifferent between a gamble and a specific probabilistic combination of a more preferred and a less preferred gamble. While previous studies demonstrated that monkeys choose according to combinations of objective reward magnitude and probability, a concept-driven experimental approach for assessing the axiomatically defined conditions for maximizing subjective utility by animals is missing. We experimentally tested the continuity axiom for a broad class of gamble types in four male rhesus macaque monkeys, showing that their choice behavior complied with the existence of a numerical utility measure as defined by the economic theory. We used the numerical quantity specified in the continuity axiom to characterize subjective preferences in a magnitude-probability space. This mapping highlighted a trade-off relation between reward magnitudes and probabilities, compatible with the existence of a utility function underlying subjective value computation. These results support the existence of a numerical utility function able to describe choices, allowing for the investigation of the neuronal substrates responsible for coding such rigorously defined quantity.
Significance Statement: A common assumption of several economic choice theories is that decisions result from the comparison of subjectively assigned values (utilities). This study demonstrated the compliance of monkey behavior with the continuity axiom of Expected Utility Theory, implying a subjective magnitude-probability trade-off relation which supports the existence of numerical subjective utility directly linked to the theoretical economic framework. We determined a numerical utility measure able to describe choices, which can serve as a correlate for the neuronal activity in the quest for brain structures and mechanisms guiding decisions.
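The continuity axiom test described above reduces to finding the probability pA = α at which the animal is indifferent between gamble B and the AC combination. A minimal sketch of one way to estimate α follows, assuming a two-parameter logistic choice function fitted by maximum likelihood with a damped Newton method. The function name `fit_indifference_point`, the logistic parameterization, and the example counts are assumptions for illustration and are not taken from the study.

```python
import math


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _log_sigmoid(z):
    # Numerically stable log of the logistic function.
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def _log_likelihood(b0, b1, xs, ks, ns):
    # Binomial log-likelihood (up to a constant) of AC-choice counts.
    total = 0.0
    for x, k, n in zip(xs, ks, ns):
        z = b0 + b1 * x
        total += k * _log_sigmoid(z) + (n - k) * _log_sigmoid(-z)
    return total


def fit_indifference_point(p_a, ac_choices, trials, max_iter=100, tol=1e-10):
    """Fit P(choose AC) = sigmoid(b0 + b1 * pA); return (alpha, b0, b1).

    alpha = -b0 / b1 is the pA at which the fitted curve equals 0.5.
    Assumes the data are not flat in pA (otherwise b1 is zero).
    """
    b0, b1 = 0.0, 0.0
    ll = _log_likelihood(b0, b1, p_a, ac_choices, trials)
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, k, n in zip(p_a, ac_choices, trials):
            p = _sigmoid(b0 + b1 * x)
            r = k - n * p            # residual count
            w = n * p * (1.0 - p)    # binomial variance weight
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det <= 0.0:
            break
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        # Halve the Newton step until the log-likelihood does not decrease.
        step = 1.0
        while step > 1e-8:
            new_ll = _log_likelihood(b0 + step * d0, b1 + step * d1,
                                     p_a, ac_choices, trials)
            if new_ll >= ll:
                break
            step /= 2.0
        b0 += step * d0
        b1 += step * d1
        ll = new_ll
        if abs(step * d0) + abs(step * d1) < tol:
            break
    return -b0 / b1, b0, b1


# Hypothetical session: pA values tested, AC choices out of 20 trials each.
example_p_a = [0.125, 0.25, 0.375, 0.5, 0.625, 0.75]
example_ac = [1, 3, 8, 13, 18, 19]
example_n = [20] * 6
alpha, _, _ = fit_indifference_point(example_p_a, example_ac, example_n)
print(f"estimated indifference point alpha = {alpha:.3f}")
```

Under the axiom, a meaningful α requires that the fitted curve actually crosses 0.5 within the tested range, which is why the sketch returns the slope as well: a near-zero or negative `b1` would signal that no reliable indifference point exists in the data.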


Citations (49)


... In humans and other primates, OFC activity does reflect an identity-based value signal in orthonasally presented odours [46-48] as well as nutrient-guided valuation of visual food stimuli [49,50]. In addition to these anticipatory cues, the human and primate OFC is also sensitive to consummatory reward features such as taste [16,35], retronasal odour [35,51] and oral texture [52,53]. Given the established role of OFC neurons in encoding the identity and value of offered and chosen oral food stimuli [17], as well as our finding of crossmodal decoding in the OFC, our results are in line with the idea that the OFC evaluates an integrated flavour signal from the insula. ...

Reference:

Tastes and retronasal odours evoke a shared flavour-specific neural code in the human insula
A Neural Mechanism in the Human Orbitofrontal Cortex for Preferring High-Fat Foods Based on Oral Texture

The Journal of Neuroscience : The Official Journal of the Society for Neuroscience

... Success and failure in these processes have been linked to differential life outcomes and psychiatric conditions. Here we review evidence from single-neuron recordings and neuroimaging studies that implicate the amygdala, a brain structure long associated with cue-reactivity and emotion, in decision-making and the planned pursuit of future rewards (Grabenhorst et al., 2012, 2023; Hernadi et al., 2015; Zangemeister et al., 2016). The main findings are that, in behavioral tasks in which future rewards can be pursued through planning and stepwise decision-making, amygdala neurons prospectively encode the value of anticipated rewards and related behavioral plans. ...

A view-based decision mechanism for rewards in the primate amygdala
  • Citing Article
  • September 2023

Neuron

... Cross-species research provides invaluable insights, overcoming the conceptual and methodological limitations inherent to human-only or animal-only studies (Polley & Schiller 2022). Recent comparative analyses have begun to unravel the complex neural mechanisms of reward and punishment processing across different species (Bromberg-Martin et al. 2024, Rudebeck & Izquierdo 2022, Wallis 2012, Woo et al. 2023). ...

Mechanisms of adjustments to different types of uncertainty in the reward environment across mice and monkeys

Cognitive Affective & Behavioral Neuroscience

... Parametric variation of non-food reinforcer options could be used to further fractionate rats' preferences and provide more detailed insights into how rats derive utility from engaging with different types of objects. More generally, the free-choice foraging behavioral framework used here could be extended to investigate a wider range of both nutritive (63,64) and non-nutritive (65-68) rewards. ...

Nutrient-Sensitive Reinforcement Learning in Monkeys

The Journal of Neuroscience : The Official Journal of the Society for Neuroscience

... Some studies have examined the specific influences of energy, nutrients, and sensory properties on food selection and the control of feeding in RM. As in humans, high fat and/or carbohydrate contents in test foods are reported to be highly palatable and divert RM from nutritional reference values in a manner suggesting that they assigned value to specific nutrients rather than energy intake per se (79). Consistent with Bremer et al.'s (74) results above, RM regulate energy intake through compensating for gastric preloads of macronutrients by decreasing intake at subsequent feeding periods. ...

Preferences for nutrients and sensory food qualities identify biological sources of economic values in monkeys

Proceedings of the National Academy of Sciences

... The amygdala, a cell complex located in the anterior-medial temporal lobe (Figure 1A), has long been associated with mediating emotional reactions to sensory cues (Rolls, 2000; Baxter and Murray, 2002; Cardinal et al., 2002; Maren and Quirk, 2004; Balleine and Killcross, 2006; Murray, 2007; Ghods-Sharifi et al., 2009; Morrison and Salzman, 2010; Johansen et al., 2011; Janak and Tye, 2015; Gothard, 2020; Pujara et al., 2022). However, recent findings also implicate primate amygdala neurons in more complex cognitive functions, including the pursuit of future rewards through economic, value-based decision-making and planning (Grabenhorst et al., 2012; Hernadi et al., 2015; Grabenhorst et al., 2016; Grabenhorst et al., 2019; Grabenhorst and Schultz, 2021; Grabenhorst et al., 2023). ...

Functions of primate amygdala neurons in economic decisions and social decision simulation
  • Citing Article
  • April 2021

Behavioural Brain Research

... With these properties, the IA provides for a stringent test framework for investigating brain mechanisms of economic choice. So far, human fMRI studies demonstrate subjective value coding in reward-related brain regions, including the ventral striatum, midbrain, amygdala, and orbitofrontal and ventromedial prefrontal cortex (Gelskov et al., 2015; Hsu et al., 2009; Seak et al., 2021; Wu et al., 2011). Neurophysiological studies in monkeys demonstrate the coding of subjective value in midbrain dopamine neurons and orbitofrontal cortex (Kobayashi & Schultz, 2008; Lak et al., 2014; Padoa-Schioppa & Assad, 2006; Stauffer et al., 2014; Tremblay & Schultz, 1999) and formal utility coding in dopamine neurons. ...

Single-Dimensional Human Brain Signals for Two-Dimensional Economic Choice Options

The Journal of Neuroscience : The Official Journal of the Society for Neuroscience

... Each person has a particular identity that corresponds to their behavior and serves as a paradigm. Identity utility refers to the change in utility that results from the adaptation of individual behavior to identity norms, and utility maximization is a general and fundamental process that determines the subject's survival (Ferrari-Toniolo et al., 2021). In light of this, we proposed to measure identity salience from the perspective of utility, suggesting that the more utility a role brings to an individual, the higher its salience. ...

Nonhuman Primates Satisfy Utility Maximization in Compliance with the Continuity Axiom of Expected Utility Theory

The Journal of Neuroscience : The Official Journal of the Society for Neuroscience

... First, it is assumed that this system learns associations through dopamine mediated reinforcement processes (Ashby & Valentin, 2017). Second, it is assumed that dopamine pathways are relevant for reward processing because they code for unexpected errors (Pastor-Bernier et al., 2020; Schultz, 1999). Because the ALF model updates its coefficients when an error is made, Basal Ganglia circuitry seems like a reasonable biological structure to implement processes like those described in the current work. ...

Experimentally Revealed Stochastic Preferences for Multicomponent Choice Options

Journal of Experimental Psychology: Animal Learning and Cognition

... Our previous work established ICs in rhesus monkeys that represent subjective reward values in an orderly manner and fulfill necessary requirements for rationality, including completeness (preference for one or the other option, or indifference), transitivity, and independence of option set size (Pastor-Bernier et al., 2017). Similar ICs were empirically estimated in humans (Pastor-Bernier et al., 2020). The ICs represent the relative subjective values of the two bundle rewards; thus, important for the present study, IC changes would indicate changes in relative reward value. ...

Experimentally Revealed Stochastic Preferences for Multi-Component Choice Options
  • Citing Article
  • January 2020

SSRN Electronic Journal