
Neural Population Dynamics Underlying Expected Value Computation


Task, behavior, and basic firing properties of neurons. A, Sequence of events during the single-cue task. A single visual pie chart having green and blue pie segments was presented to the monkeys. B, Choice task. Two visually displayed pie charts were presented to the monkeys on the left and right sides of the center. After visual fixation of the reappeared in the central area of the target, the central fixation target disappeared, and monkeys chose either of the targets by fixating on it. A block of the choice trials was sometimes interleaved between the single-cue trial blocks. During the choice trials, neural activity was not recorded. C, Percentages of right target choice during the choice task plotted against the EVs of the left and right target options. Aggregated choice data were used. D, Pseudo-r² estimated in the three behavioral models: M1, number of pie segments; M2, probability and magnitude; M3, expected values. E, Percentage of right target choices estimated in each recording session (gray lines) plotted against the difference in expected values (right minus left). The choice data were segmented by seven conditions of the difference in the expected values, as follows: −1.0 to −0.5, −0.5 to −0.3, −0.3 to −0.1, −0.1 to 0.1, 0.1 to 0.3, 0.3 to 0.5, and 0.5 to 1.0. Black plots indicate the mean. F, Reaction time to choose a target option plotted against the difference in expected values (right minus left) as −1.0 to −0.5, −0.5 to −0.3, −0.3 to −0.1, −0.1 to 0.1, 0.1 to 0.3, 0.3 to 0.5, and 0.5 to 1.0. G, An illustration of neural recording areas based on sagittal MR images. Neurons were recorded from the mOFC (14O, orbital part of area 14) and cOFC (area 13M) at the A31–A34 anterior-posterior (A-P) level. Neurons were also recorded from the DS and VS, respectively, at the A21–A27 level. White scale bar, 5 mm. H, Color map histograms of neuronal activities recorded from the four brain regions. Each horizontal line indicates neural activity aligned to cue onset averaged for all lottery conditions. Neuronal firing rates were normalized to the peak activity. I, Percentages of neurons showing an activity peak during cue presentation. J, Box plots of peak activity latency after cue presentation. K, Firing rates of peak activity observed during cue presentation. L, Box plots of half-peak width, indicating the phasic nature of activity changes. M, Box plots of baseline firing rates during the 1 s time period before the onset of the central fixation target. In J–M, asterisks indicate statistical significance among two neural populations using the Wilcoxon rank-sum test with Bonferroni correction for multiple comparisons [statistical significance: **p < 0.01, *p < 0.05, and §0.05 < p < 0.06 (close to significance), respectively].
… 
Expected value signals detected by conventional analyses. A, Example activity histogram of a DS neuron modulated by expected value during the single-cue task. The activity aligned to the cue onset is represented for three different levels of probability (0.1–0.3, 0.4–0.7, and 0.8–1.0) and magnitude (0.1–0.3, 0.4–0.7, and 0.8–1.0 ml) of rewards. Gray hatched time windows indicate the 1 s time window used to estimate the neural firing rates shown in B. The neural modulation pattern was defined as the expected value type based on all three analyses (linear regression, AIC-based model selection, and BIC-based model selection). Regression coefficients for probability and magnitude were 6.17 (p < 0.001) and 2.54 (p = 0.007), respectively. B, An activity plot of the DS neuron during the 1 s time window shown in A against the probability and magnitude of rewards. C, D, Same as A and B, but for a VS neuron defined as the expected value type based on all three analyses. Regression coefficients for probability and magnitude were 7.14 (p < 0.001) and 6.71 (p < 0.001), respectively. E, F, Same as A and B, but for a cOFC neuron defined as the expected value type based on all three analyses. Regression coefficients for probability and magnitude were 8.55 (p < 0.001) and 11.1 (p < 0.001), respectively. G, H, Same as A and B, but for an mOFC neuron. The neural modulation pattern was defined as the expected value type based on the AIC-based model selection, as the probability type based on the linear regression, and as the nonmodulated type based on the BIC-based model selection. Regression coefficients for probability and magnitude were 1.76 (p = 0.032) and 0.50 (p = 0.54), respectively. I–L, Plots of regression coefficients for the probability and magnitude of rewards estimated for all neurons in the DS (I), VS (J), cOFC (K), and mOFC (L). Filled colors indicate the neural modulation pattern classified by the BIC-based model selection. P, Probability type; M, Magnitude type; EV, Expected value type; R-R, Risk-Return type. The nonmodulated type is indicated by the small open circle. M–P, Percentages of neural modulation types based on BIC-based model selection through cue presentation in the DS (M), VS (N), cOFC (O), and mOFC (P). The analysis window size is 0.1 s (left), 0.05 s (middle), and 0.02 s (right), respectively.
… 
Systems/Circuits
Hiroshi Yamada,¹,²,³ Yuri Imaizumi,⁴ and Masayuki Matsumoto¹,²,³
¹Division of Biomedical Science, Faculty of Medicine, University of Tsukuba, Tsukuba 305-8577, Ibaraki, Japan, ²Graduate School of Comprehensive Human Sciences, University of Tsukuba, Tsukuba 305-8577, Ibaraki, Japan, ³Transborder Medical Research Center, University of Tsukuba, Tsukuba 305-8577, Ibaraki, Japan, and ⁴Medical Sciences, University of Tsukuba, Tsukuba 305-8577, Ibaraki, Japan
Computation of expected values (i.e., probability × magnitude) seems to be a dynamic integrative process performed by the brain for efficient economic behavior. However, the neural dynamics underlying this computation are largely unknown. Using lottery tasks in monkeys (Macaca mulatta, male; Macaca fuscata, female), we examined (1) whether four core reward-related brain regions detect and integrate probability and magnitude cued by numerical symbols and (2) whether these brain regions have distinct dynamics in the integrative process. Extraction of the mechanistic structure of neural population signals demonstrated that expected value signals simultaneously arose in the central orbitofrontal cortex (cOFC; medial part of area 13) and ventral striatum (VS). Moreover, these signals were incredibly stable compared with weak and/or fluctuating signals in the dorsal striatum and medial OFC. Temporal dynamics of these stable expected value signals were unambiguously distinct: sharp and gradual signal evolutions in the cOFC and VS, respectively. These intimate dynamics suggest that the cOFC and VS compute the expected values with unique time constants, as distinct, partially overlapping processes.
Key words: computation; expected values; integration; monkey; neural population dynamics; rewards
Significance Statement
Our results differ from those of earlier studies suggesting that many reward-related regions in the brain signal probability
and/or magnitude and provide a mechanistic structure for expected value computation employed in multiple neural populations.
A central part of the orbitofrontal cortex (cOFC) and ventral striatum (VS) can simultaneously detect and integrate probability
and magnitude into an expected value. Our empirical study on these neural population dynamics raises a possibility that the
cOFC and VS cooperate on this computation with unique time constants as distinct, partially overlapping processes.
Introduction
Economic behavior requires a reliable perception of the world for maximizing benefit (Von Neumann and Morgenstern, 1944; Houthakker, 1950; Samuelson, 1950; Savage, 1954). Such maximization is primarily achieved by computing expected values (EVs; i.e., probability multiplied by magnitude) in the brain (Glimcher et al., 2008), which seems to be a dynamic process for detecting and integrating probability and magnitude to yield expected value signals. Indeed, humans and animals behave as if they compute the expected values in the brain (Kahneman and Tversky, 1979; Stephens and Krebs, 1986; Glimcher et al., 2008). One salient example, discovered over a century ago and repeatedly measured, is human economic behavior, in which a series of models originating from the standard theory of economics (Von Neumann and Morgenstern, 1944) has been developed to describe efficient economic behavior. Despite the ubiquity of this phenomenon, a dynamic integrative process to compute the expected values from probability and magnitude remains largely unknown.
In the past 2 decades, substantial research in animals has suggested that various brain regions process rewards in terms of signaling probability and/or magnitude, mostly during economic choice behavior (Platt and Glimcher, 1999; Barraclough et al., 2004; Tobler et al., 2005; Roesch et al., 2009; Ma and Jazayeri, 2014; Rudebeck and Murray, 2014; Eshel et al., 2016; Lopatina et al., 2016; Xie and Padoa-Schioppa, 2016; Yamada et al., 2018). Among these, expected value computation is assumed to be processed by neurons in many regions without consideration of their neural dynamics, which is in line with the expected value theory shared across multiple disciplines (Von Neumann and Morgenstern, 1944; Stephens and Krebs, 1986; Sutton and Barto, 1998; Glimcher et al., 2008). Neuroimaging studies in humans and nonhuman primates also suggest that multiple brain regions in the reward circuitry (Haber and Knutson, 2010) are involved in this computational process (O'Doherty et al., 2004; Tom et al., 2007; Hsu et al., 2009; Levy and Glimcher, 2011; Howard et al., 2015; Howard and Kahnt, 2017; Papageorgiou et al., 2017; Fouragnan et al., 2019), although the underlying neural mechanism has not been elucidated because of the limited time resolution of current neuroimaging techniques (Goense and Logothetis, 2008; Milham et al., 2018). Many brain regions may employ expected value computation; however, none of these studies could capture and compare temporal aspects of neural activities regarding expected value computation in the multiple candidate brain regions. Thus, we tested the hypothesis that neural population dynamics within sub-second-order time resolutions (Churchland et al., 2012; Mante et al., 2013; Chen and Stuphorn, 2015; Murray et al., 2017; Takei et al., 2017) play a key role in expected value computation, that is, the detection and integration of probability and magnitude on multiple neural population ensembles.

Received July 30, 2020; revised Dec. 12, 2020; accepted Dec. 20, 2020.
Author contributions: H.Y. designed research; H.Y. and Y.I. conducted the experiments; H.Y. constructed the analytic tool; H.Y. analyzed the data; H.Y. and M.M. evaluated the analyzed results; H.Y. wrote the manuscript; all authors edited and approved the final manuscript.
This research was supported by Japan Society for the Promotion of Science (JSPS) KAKENHI Grants JP:15H05374, 18K19492, and 19H05007; the Takeda Science Foundation; the Council for Addiction Behavior Studies; the Narishige Neuroscience Research Foundation; Okinaka Memorial Foundation; The Ichiro Kanehara Foundation (H.Y.); JSPS KAKENHI Grant JP:26710001; and Ministry of Education, Culture, Sports, Science and Technology (MEXT) KAKENHI Grant JP:16H06567 (M.M.). We thank Tomomichi Oya, Tomohiko Takei, Tsuyoshi Setogawa, Jun Kunimatsu, Masafumi Nejime, Narihisa Matsumoto, Hiroshi Abe, and Takashi Kawai for valuable comments. In addition, we thank Takashi Kawai, Ryo Tajiri, Yoshiko Yabana, and Yuki Suwa for technical assistance. Finally, we thank the National BioResource Project of the MEXT, Japan, for providing Monkey FU through the NBRP Japanese Monkeys.
The authors declare no competing financial interests.
Correspondence should be addressed to Hiroshi Yamada at h-yamada@md.tsukuba.ac.jp.
https://doi.org/10.1523/JNEUROSCI.1987-20.2020
Copyright © 2021 Yamada et al.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.
1684 • The Journal of Neuroscience, February 24, 2021 • 41(8):1684–1698
We targeted reward-related cortical and subcortical structures of nonhuman primates (Haber and Knutson, 2010): the central orbitofrontal cortex [cOFC; the medial part of area 13 (13M)], the medial orbitofrontal cortex (mOFC; area 14O), dorsal striatum (DS; the caudate nucleus), and ventral striatum (VS), all of which represent neural correlates of probability and/or magnitude during economic choice behavior. We dissociated the integrative process computing the expected values from a neural process generating a choice command, which is used during economic choices (Chen and Stuphorn, 2015; Rich and Wallis, 2016; Gardner et al., 2019; Yoo and Hayden, 2020), by recording the neural activity in a nonchoice situation; monkeys perceive expected values from a single numerical symbol composed of probability and magnitude. We then applied a recently developed mathematical approach, called state space analysis (Churchland et al., 2012; Mante et al., 2013; Chen and Stuphorn, 2015; Murray et al., 2017), to the multiple neuronal activities to test how expected value computation is processed within each of the four neural population ensembles on the order of 10⁻² s time resolution. Our findings suggest that the cOFC and VS neural populations employ a common integrative computation of expected values from probability and magnitude as distinct and partially overlapping processes.
Materials and Methods
Subjects and experimental procedures
Two rhesus monkeys were used for this study (Macaca mulatta, SUN, 7.1 kg, male; Macaca fuscata, FU, 6.7 kg, female). All experimental procedures were approved by the Animal Care and Use Committee of the University of Tsukuba (protocol #H30.336) and were performed in compliance with the US Public Health Service Guide for the Care and Use of Laboratory Animals. Each animal was implanted with a head-restraint prosthesis. Eye movements were measured using a video camera system at 120 Hz. Visual stimuli were generated by a liquid crystal display at 60 Hz placed 38 cm from the face of the monkey when seated. The subjects performed the cued lottery task 5 d/week. The subjects practiced the cued lottery task for 10 months, after which they became proficient in choosing lottery options.
Experimental design
Behavioral task
Cued lottery tasks. Animals performed one of the following two visually cued lottery tasks: single-cue task or choice task. The activity of neurons was recorded only during the single-cue task.
Single-cue task. At the beginning of each trial, the monkeys had 2 s to align their gaze to within 3° of a 1°-diameter gray central fixation target. After fixating for 1 s, an 8° pie chart providing information about the probability and magnitude of rewards was presented for 2.5 s at the same location as the central fixation target. The pie chart was then removed and 0.2 s later, 1 and 0.1 kHz tones of 0.15 s duration indicated reward and no-reward outcomes, respectively. The high tone preceded a reward by 0.2 s. The low tone indicated that no reward was delivered. The animals received a fluid reward, for which magnitude and probability were indicated by the green and blue pie charts, respectively; otherwise, no reward was delivered. An intertrial interval of 4–6 s followed each trial.
Choice task. At the beginning of each trial, the monkeys had 2 s to align their gaze to within 3° of a 1°-diameter gray central fixation target. After fixating for 1 s, two peripheral 8° pie charts providing information about the probability and magnitude of rewards for each of the two target options were presented for 2.5 s, at 8° to the left and right of the central fixation location. Gray 1° choice targets appeared at these same locations. After a 0.5 s delay, the fixation target disappeared, cueing saccade initiation. The animals were free to choose for 2 s by shifting their gaze to either target within 3° of the choice target. A 1 and 0.1 kHz tone of 0.15 s duration indicated reward and no-reward outcomes, respectively. The animals received a fluid reward indicated by the green pie chart of the chosen target, with the probability indicated by the blue pie chart; otherwise, no reward was delivered. An intertrial interval of 4–6 s followed each trial.
Payoff and block structure. Green and blue pie charts indicated reward magnitudes from 0.1 to 1.0 ml, in 0.1 ml increments, and reward probabilities from 0.1 to 1.0, in 0.1 increments, respectively. A total of 100 pie charts was used. In the single-cue task, each pie chart was presented once in a random order. In the choice task, two pie charts were randomly allocated to the two options. During one session of electrophysiological recording, ~30–60 trial blocks of the choice task were sometimes interleaved with 100–120 trial blocks of the single-cue task.
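The payoff structure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' task code: the function names `build_lotteries` and `single_cue_block` and the `seed` argument are hypothetical, and the shuffle only stands in for "presented once in a random order".

```python
import random
from itertools import product

# Ten levels each: probability 0.1-1.0 and magnitude 0.1-1.0 ml, in steps of 0.1.
# Working in integer tenths avoids floating-point drift when building the grid.
LEVELS_TENTHS = range(1, 11)


def build_lotteries():
    """Return the 100 (probability, magnitude_ml, expected_value) conditions."""
    lotteries = []
    for p10, m10 in product(LEVELS_TENTHS, LEVELS_TENTHS):
        probability = p10 / 10
        magnitude_ml = m10 / 10
        expected_value = (p10 * m10) / 100  # probability x magnitude
        lotteries.append((probability, magnitude_ml, expected_value))
    return lotteries


def single_cue_block(seed=None):
    """One hypothetical single-cue block: every pie chart once, in random order."""
    block = build_lotteries()
    random.Random(seed).shuffle(block)
    return block


lotteries = build_lotteries()
block = single_cue_block(seed=0)
```

Because probability and magnitude are fully crossed, the two variables are uncorrelated across the 100 conditions, while the expected value ranges from 0.01 to 1.0 ml.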
Calibration of the reward supply system. The precise amount of liquid reward was controlled and delivered to the monkeys using a solenoid valve. An 18 gauge tube (inner diameter, 0.9 mm) was attached to the tip of the delivery tube to reduce the variation across trials. The amount of reward in each payoff condition was calibrated by measuring the weight of water with 0.002 g precision (hence, 2 µl) on a single-trial basis. This calibration method was the same as previously used (Yamada et al., 2018).
Electrophysiological recordings
We used conventional techniques for recording the single-neuron activity
from the DS, VS, cOFC, and mOFC. Monkeys were implanted with
recording chambers (28 × 32 mm) targeting the OFC and striatum, centered
28 mm anterior to the stereotaxic coordinates. The locations of the
chambers were verified using anatomic magnetic resonance imaging. At
the beginning of recording sessions in a day, a stainless steel guide tube
was placed within a 1 mm spacing grid, and a tungsten microelectrode
(1–3 MΩ; FHC) was passed through the guide tube. To record neurons
in the mOFC and cOFC, the electrode was lowered until it approximated
the bottom of the brain after passing through the cingulate cortex and
dorsolateral prefrontal cortex, or between them. For neuronal recording
in the DS, the electrode was lowered until low spontaneous activity was
observed after passing through the cortex and white matter. For record-
ing in the VS, the electrode was lowered further until it passed through
the internal capsule. At the end of VS recording sessions in a day, the
electrode was occasionally lowered close to the bottom of the brain to
confirm recording depth relative to the bottom. Electrophysiological sig-
nals were amplified, bandpass filtered, and monitored. Single-neuron ac-
tivity was isolated based on spike waveforms. We recorded from the four
brain regions of a single hemisphere of each of the two monkeys: 194 DS
neurons (monkey SUN, 98; monkey FU, 96), 144 VS neurons (monkey
SUN, 89; and monkey FU, 55), 190 cOFC neurons (monkey SUN, 98;
monkey FU, 92), and 158 mOFC neurons (monkey SUN, 64; monkey
FU, 94). The activity of all single neurons was sampled when the activity
of an isolated neuron demonstrated a good signal-to-noise ratio (>2.5).
Blinding was not performed. The sample sizes required to detect effect
sizes (number of recorded neurons, number of recorded trials in a single
neuron, and number of monkeys) were estimated in reference to previous studies (Yamada et al., 2013b, 2018; Chen and Stuphorn, 2015). Neural activity was recorded during 100–120 trials of the single-cue task. During choice trials, neural activity was not recorded. Presumed projection neurons (phasically active neurons; Yamada et al., 2016) were recorded from the DS and VS, while presumed cholinergic interneurons (tonically active neurons; Yamada et al., 2004; Inokawa et al., 2020) were not recorded.
Statistical analysis
For statistical analysis, we used the statistical software package R (http://
www.r-project.org/). All statistical tests for behavioral and neural analy-
ses were two tailed.
In the present study, we used two variables for analyses: probability
and magnitude. We defined the probability of reward from 0.1 to 1.0,
and the magnitude of reward from 0.1 to 1.0 ml. Under this definition of
units, the effects of probability and magnitude on the data were equiva-
lent. Thus, data were not standardized in the analyses.
Behavioral analysis
We examined whether the monkeys' choice behavior depended on the expected values of the two options located on the left and right sides of the center. We pooled choice data across all recording sessions (monkey SUN: 884 sessions, 242 d; monkey FU: 571 sessions, 127 d), yielding 44,883 and 19,292 choice trials for monkeys SUN and FU, respectively. The percentage of right target choices was estimated in the pooled choice data for all combinations of expected values of the left and right target options. The percentage of right target choices was also estimated in each recording session by segmenting the choice data into the following seven conditions of difference in the expected values (right minus left): −1.0 to −0.5, −0.5 to −0.3, −0.3 to −0.1, −0.1 to 0.1, 0.1 to 0.3, 0.3 to 0.5, and 0.5 to 1.0. Reaction times to choose target options after the appearance of target options were estimated and analyzed across the same seven conditions of expected value difference (right minus left).
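The segmentation into seven expected value difference conditions can be sketched as follows. This is an illustration under stated assumptions: the text does not say which condition a boundary value such as −0.5 belongs to, so the left-closed bins used here are an assumption, and the names `ev_diff_bin` and `percent_right_by_bin` are hypothetical.

```python
from bisect import bisect_right

# Inner bin edges in hundredths of the EV difference (right minus left).
# Integer hundredths avoid floating-point trouble at the bin boundaries.
EDGES_HUNDREDTHS = [-50, -30, -10, 10, 30, 50]
BIN_LABELS = [
    "-1.0 to -0.5", "-0.5 to -0.3", "-0.3 to -0.1", "-0.1 to 0.1",
    "0.1 to 0.3", "0.3 to 0.5", "0.5 to 1.0",
]


def ev_diff_bin(ev_left, ev_right):
    """Index (0-6) of the EV-difference condition for one choice trial.

    Assumption: each bin includes its lower edge (left-closed).
    """
    diff = round((ev_right - ev_left) * 100)
    return bisect_right(EDGES_HUNDREDTHS, diff)


def percent_right_by_bin(trials):
    """trials: iterable of (ev_left, ev_right, chose_right).

    Returns the percentage of right choices per bin (None for empty bins).
    """
    totals = [0] * len(BIN_LABELS)
    rights = [0] * len(BIN_LABELS)
    for ev_left, ev_right, chose_right in trials:
        idx = ev_diff_bin(ev_left, ev_right)
        totals[idx] += 1
        rights[idx] += int(chose_right)
    return [100.0 * r / n if n else None for r, n in zip(rights, totals)]
```

The same bin index can be reused to group reaction times by condition.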
Model fitting
The percentage of choosing the right-side option was analyzed in the pooled data using a generalized linear model with a binomial distribution:
$$P_{\mathrm{chooses\,R}} = \frac{1}{1 + e^{-Z}}, \tag{1}$$
where the relationship between $P_{\mathrm{chooses\,R}}$ and $Z$ was given by the logistic function in each of the following three models: number of pie segments (M1), probability and magnitude (M2), and expected values (M3).
The first model, M1, assumed that the monkeys chose a target by comparing the number of pie segments for two targets, as follows:
$$Z = b_0 + b_1 \mathit{Npie}_L + b_2 \mathit{Npie}_R, \tag{2}$$
where $b_0$ is the intercept and $\mathit{Npie}_L$ and $\mathit{Npie}_R$ are the number of pie segments contained in the left and right pie chart stimuli, respectively. Values of $b_0$ to $b_2$ were free parameters and were estimated by maximizing the log likelihood.
The second model, M2, assumed that the monkeys chose a target by comparing the probability and magnitude of two targets, as follows:
$$Z = b_0 + b_1 P_L + b_2 P_R + b_3 M_L + b_4 M_R, \tag{3}$$
where $b_0$ is the intercept; $P_L$ and $P_R$ are the probability of rewards for left and right pie chart stimuli, respectively; and $M_L$ and $M_R$ are the magnitude of rewards for left and right pie chart stimuli, respectively. Values of $b_0$ to $b_4$ were free parameters and were estimated by maximizing the log likelihood.
The third model, M3, assumed that the monkeys chose a target by comparing the expected values of rewards for two targets, as follows:
$$Z = b_0 + b_1 EV_L + b_2 EV_R, \tag{4}$$
where $b_0$ is the intercept and $EV_L$ and $EV_R$ are the expected values of rewards as probability times magnitude for left and right pie chart stimuli, respectively. Values of $b_0$ to $b_2$ were free parameters and were estimated by maximizing the log likelihood.
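As a rough sketch of Equations 1 to 4, the three linear predictors and the log likelihood that the fit maximizes can be written as plain Python functions. The function names and the coefficient values in the usage note are hypothetical; the optimizer itself is not shown.

```python
import math


def logistic(z):
    """Equation 1: probability of choosing the right-side option."""
    return 1.0 / (1.0 + math.exp(-z))


def z_m1(b, n_pie_left, n_pie_right):
    """Equation 2 (M1): number of pie segments; b = (b0, b1, b2)."""
    return b[0] + b[1] * n_pie_left + b[2] * n_pie_right


def z_m2(b, p_left, p_right, m_left, m_right):
    """Equation 3 (M2): probability and magnitude; b = (b0, ..., b4)."""
    return b[0] + b[1] * p_left + b[2] * p_right + b[3] * m_left + b[4] * m_right


def z_m3(b, p_left, p_right, m_left, m_right):
    """Equation 4 (M3): expected values; b = (b0, b1, b2)."""
    ev_left = p_left * m_left
    ev_right = p_right * m_right
    return b[0] + b[1] * ev_left + b[2] * ev_right


def log_likelihood(p_right, chose_right):
    """Binomial log likelihood of observed choices given predicted P(right)."""
    return sum(
        math.log(p) if chose else math.log(1.0 - p)
        for p, chose in zip(p_right, chose_right)
    )
```

For example, with illustrative coefficients `b = (0.0, -5.0, 5.0)` in M3, two options of equal expected value give `logistic(z_m3(b, 0.5, 0.5, 0.4, 0.4)) == 0.5`, and a larger right-side expected value pushes the probability above 0.5.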
Model comparisons
To identify the best structural model to describe the behavior of the
monkeys, we compared the three models described above. In each
model, we estimated a combination of best-fit parameters to explain the
choice behavior of the monkeys. We compared their goodness-of-fit
based on the Akaike information criterion (AIC) and Bayesian information criterion (BIC; Burnham and Anderson, 2004),
$$\mathrm{AIC}(\mathrm{Model}) = -2L + 2k \tag{5}$$
$$\mathrm{BIC}(\mathrm{Model}) = -2L + k \log n, \tag{6}$$
where $L$ is the maximum log-likelihood of the model, $k$ is the number of free parameters, and $n$ is the sample size. After estimating the best-fit parameters in each model, we selected one model that exhibited the smallest AIC and BIC. To evaluate model fits, we estimated McFadden's pseudo-$r^2$ statistic using the following equation:
$$\text{Pseudo } r^2 = (L_0 - L_{\mathrm{Model}}) / L_0, \tag{7}$$
where $L_{\mathrm{Model}}$ is the maximum log likelihood for the model given the data, and $L_0$ is the log likelihood under the assumption that all free parameters are 0 in the model.
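Equations 5 to 7 translate directly into code. The sketch below is a plain restatement of the formulas with hypothetical function names; the log-likelihood values in the usage note are made up for illustration.

```python
import math


def aic(log_lik, k):
    """Equation 5: AIC = -2L + 2k."""
    return -2.0 * log_lik + 2.0 * k


def bic(log_lik, k, n):
    """Equation 6: BIC = -2L + k log n (natural logarithm)."""
    return -2.0 * log_lik + k * math.log(n)


def mcfadden_pseudo_r2(log_lik_model, log_lik_null):
    """Equation 7: (L0 - L_model) / L0."""
    return (log_lik_null - log_lik_model) / log_lik_null


def null_log_likelihood(n_trials):
    """L0 for the choice models: with all parameters 0, Z = 0 and P = 0.5."""
    return n_trials * math.log(0.5)
```

With an illustrative model log likelihood of −100 and three free parameters, `aic(-100.0, 3)` is 206.0, and a null log likelihood of −200 gives a pseudo-r² of 0.5.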
Neural analysis
Basic firing properties
Peristimulus time histograms were drawn for each single-neuron activity aligned at visual cue onset. To display a color map histogram, a peak activity (maximum firing rate in each histogram) was detected for each neuron. The average activity curves were smoothed using a 50 ms Gaussian kernel (σ = 50 ms) and normalized by the peak firing rates. A percentage of neurons showing the activity peak during cue presentation was compared among the four brain regions using a χ² test at p < 0.05. Peak firing rates, peak latency, and duration of peak activity (half-peak width) were compared among the four brain regions using parametric or nonparametric tests, with a statistical significance level of p < 0.05. Baseline firing rates during 1 s before the appearance of central fixation targets were also compared with a statistical significance level of p < 0.05.
Estimation of neural firing rates through task trials
We analyzed neural activity during a 2.7 s time period from the onset of
pie chart stimuli to the onset of outcome feedback during the single-cue
task. To obtain a time series of neural firing rates through a trial, we esti-
mated the firing rates of each neuron for every 0.1, 0.05, or 0.02 s time
window (without overlap) during the 2.7 s period. No Gaussian kernel
was used.
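The non-overlapping window estimate can be sketched as a simple spike count per bin divided by the bin width. This is an illustration only: spike times are assumed to be given in seconds relative to pie chart onset, and the function name `binned_firing_rates` is hypothetical.

```python
def binned_firing_rates(spike_times_s, window_s, period_s=2.7):
    """Firing rates (spikes/s) in consecutive non-overlapping windows.

    spike_times_s: spike times in seconds, relative to cue onset (assumed).
    window_s: window size, e.g. 0.1, 0.05, or 0.02 s.
    No smoothing kernel is applied.
    """
    n_bins = round(period_s / window_s)
    counts = [0] * n_bins
    for t in spike_times_s:
        if 0.0 <= t < period_s:
            # Clamp guards against rounding at the very end of the period.
            idx = min(int(t / window_s), n_bins - 1)
            counts[idx] += 1
    return [count / window_s for count in counts]
```

With the 2.7 s analysis period, window sizes of 0.1, 0.05, and 0.02 s give 27, 54, and 135 bins per trial, respectively.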
Estimation of neural firing rates in a fixed time window
We analyzed neural activity during a 1 s time window after the onset of
pie chart stimuli during the single-cue task. The 1 s activity was used for
the conventional analyses below. No Gaussian kernel was used.
Conventional analyses to detect neural modulations in each
individual neuron
Linear regression and model selection
For conventional and standard analyses of neural modulations by the
probability and magnitude indicated by pie chart stimuli, we used linear
regression and model selection analyses. As above, we estimated the fir-
ing rate of each neuron during the 1 s period after the onset of pie chart
stimulus during the single-cue task. No Gaussian kernel was used.
Linear regression
Neural discharge rates ($F$) were fitted by a linear combination of the following variables:
$$F = b_0 + b_p\,\mathrm{Probability} + b_m\,\mathrm{Magnitude}, \tag{8}$$
where Probability and Magnitude are the probability and magnitude of rewards indicated by the pie chart, respectively. $b_0$ is the intercept. If $b_p$ or $b_m$ differed from 0 at p < 0.05, discharge rates were regarded as being significantly modulated by that variable.
On the basis of the linear regression, activity modulation patterns were categorized into the following types: Probability type, with a significant $b_p$ and without a significant $b_m$; Magnitude type, without a significant $b_p$ and with a significant $b_m$; Expected value type, with significant $b_p$ and $b_m$ of the same sign (i.e., positive $b_p$ and positive $b_m$, or negative $b_p$ and negative $b_m$); Risk-Return type, with significant $b_p$ and $b_m$ of opposite signs (i.e., negative $b_p$ and positive $b_m$, or positive $b_p$ and negative $b_m$); and Nonmodulated type, without significant $b_p$ and $b_m$. The Risk-Return types reflect high-risk high return (prefer low probability and large magnitude) or low-risk low return (prefer high probability and low magnitude).
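The categorization rule above can be expressed as a small decision function. The sketch assumes the regression coefficients and their p values have already been estimated elsewhere; the function name `classify_modulation` is hypothetical.

```python
def classify_modulation(b_p, p_value_p, b_m, p_value_m, alpha=0.05):
    """Classify a neuron from the Equation 8 coefficients and their p values."""
    sig_p = p_value_p < alpha
    sig_m = p_value_m < alpha
    if sig_p and sig_m:
        # Same sign: expected value type; opposite signs: risk-return type.
        same_sign = (b_p > 0) == (b_m > 0)
        return "Expected value" if same_sign else "Risk-Return"
    if sig_p:
        return "Probability"
    if sig_m:
        return "Magnitude"
    return "Nonmodulated"
```

Applied to the example neurons in the figure legend, coefficients of 6.17 (p < 0.001) and 2.54 (p = 0.007) yield the Expected value type, whereas 1.76 (p = 0.032) and 0.50 (p = 0.54) yield the Probability type.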
Model selection
Neural discharge rates, $F$, were fitted using the following five models:
$$\mathrm{M1}: F = b_0 \tag{9}$$
$$\mathrm{M2}: F = b_0 + b_p\,\mathrm{Probability} \tag{10}$$
$$\mathrm{M3}: F = b_0 + b_m\,\mathrm{Magnitude} \tag{11}$$
$$\mathrm{M4}: F = b_0 + b_p\,\mathrm{Probability} + b_m\,\mathrm{Magnitude} \tag{12}$$
$$\mathrm{M5}: F = b_0 + b_{ev}\,\mathrm{Expected\ value}, \tag{13}$$
where Expected value is the expected value estimated from the visual pie chart as probability multiplied by magnitude. $b_0$ is the intercept. Probability and Magnitude are the probability and magnitude of reward indicated by the pie chart, respectively. Among the five models, we selected one model that exhibited the smallest AIC or BIC.
If the selected model was M1, neurons were defined as the Nonmodulated type. If the selected model was M2, neurons were defined as the Probability type. If the selected model was M3, neurons were defined as the Magnitude type. If the selected model was M4 with the same signs of $b_p$ and $b_m$, neurons were defined as the Expected value type. If the selected model was M4 with opposite signs of $b_p$ and $b_m$, neurons were defined as the Risk-Return type. If the selected model was M5, neurons were defined as the Expected value type.
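A minimal sketch of the BIC-based selection among M1 to M5 is given below. Several details are assumptions rather than statements from the text: the fits use ordinary least squares with a Gaussian likelihood, the parameter count $k$ includes the residual variance, and the names `ols_fit` and `classify_by_bic` are hypothetical. Replacing the penalty `k * math.log(n)` with `2 * k` gives the AIC-based variant.

```python
import math


def ols_fit(columns, y):
    """Least-squares fit of y on an intercept plus the given predictor columns.

    Solves the normal equations by Gauss-Jordan elimination.
    Returns (coefficients, residual sum of squares).
    """
    n = len(y)
    X = [[1.0] + [col[r] for col in columns] for r in range(n)]
    k = len(X[0])
    A = [
        [sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
        + [sum(X[r][i] * y[r] for r in range(n))]
        for i in range(k)
    ]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    coefs = [A[i][k] / A[i][i] for i in range(k)]
    rss = sum((y[r] - sum(b * x for b, x in zip(coefs, X[r]))) ** 2 for r in range(n))
    return coefs, rss


def classify_by_bic(prob, mag, rate):
    """Return (selected model, modulation type) for one neuron."""
    ev = [p * m for p, m in zip(prob, mag)]
    models = {"M1": [], "M2": [prob], "M3": [mag], "M4": [prob, mag], "M5": [ev]}
    n = len(rate)
    best = None
    for name, cols in models.items():
        coefs, rss = ols_fit(cols, rate)
        # Gaussian log likelihood at the ML variance estimate (assumes rss > 0).
        log_lik = -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)
        k = len(coefs) + 1  # assumption: residual variance counted as a parameter
        score = -2 * log_lik + k * math.log(n)
        if best is None or score < best[1]:
            best = (name, score, coefs)
    name, _, coefs = best
    if name == "M4":
        same_sign = (coefs[1] > 0) == (coefs[2] > 0)
        return name, "Expected value" if same_sign else "Risk-Return"
    types = {"M1": "Nonmodulated", "M2": "Probability",
             "M3": "Magnitude", "M5": "Expected value"}
    return name, types[name]
```

The function takes one firing rate per trial together with the cued probability and magnitude on that trial, so it can be applied either to the 1 s window or to each short time window separately.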
Conventional analyses through task trials
We applied the three conventional analyses above (linear regression,
AIC-based model selection, and BIC-based model selection) for the
activity of neurons estimated at every time window in the four brain
regions. As above, we estimated the firing rate of each neuron for every
0.1, 0.05, or 0.02 s time window (without overlap) during the 2.7 s pe-
riod. No Gaussian kernel was used. The activity modulation type was
defined in each time window during the 2.7 s period. The analyses
described percentages of neural modulation types throughout cue
presentation.
Population dynamics using principal component analysis
Estimation of neuron firing rates through task trials
As above, we estimated the firing rate in each neuron for every 0.1, 0.05,
or 0.02 s time window (without overlap) during the 2.7 s period. No
Gaussian kernel was used.
Regression subspace
We used linear regression to determine how the probability and magni-
tude of rewards affect the activities of each neuron in the four neural
populations. Each neural population was composed of all recorded neurons
in each brain region. We first set the probability and magnitude as
0.1–1.0 and 0.1–1.0 ml, respectively. We then described the average firing
rates of neuron i at time t as a linear combination of the probability
and magnitude in each neural population, as follows:
$F_{(i,t,k)} = b_{0(i,t)} + b_{1(i,t)}\,\mathrm{Probability}_{(k)} + b_{2(i,t)}\,\mathrm{Magnitude}_{(k)}$, (14)
where $F_{(i,t,k)}$ is the average firing rate of neuron i at time t on trial k,
$\mathrm{Probability}_{(k)}$ is the probability of reward cued to the monkey on trial k,
and $\mathrm{Magnitude}_{(k)}$ is the magnitude of reward cued to the monkey on
trial k. The regression coefficients $b_{0(i,t)}$ to $b_{2(i,t)}$ describe the degree to
which the firing rates of neuron i depend on the mean firing rates
(hence, firing rates independent of task variables), the probability of
rewards, and the magnitude of rewards, respectively, at a given time t
during the trials.
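As an illustration of Equation 14, the coefficients for one neuron in one time window can be estimated by ordinary least squares. The sketch below solves the normal equations directly; the helper names `solve_linear_system` and `fit_equation_14` are hypothetical, and the source does not state which fitting routine was actually used.

```python
def solve_linear_system(a, y):
    """Solve a @ x = y by Gauss-Jordan elimination with partial pivoting."""
    n = len(y)
    m = [row[:] + [y[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_equation_14(rates, probability, magnitude):
    """Least-squares estimates (b0, b1, b2) for one neuron in one window.

    rates, probability, and magnitude are per-trial lists of equal length.
    """
    design = [[1.0, p, g] for p, g in zip(probability, magnitude)]
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(3)]
           for i in range(3)]
    xty = [sum(r[i] * f for r, f in zip(design, rates)) for i in range(3)]
    return solve_linear_system(xtx, xty)
```

Repeating this fit for every neuron i and window t would give the $b_{1(i,t)}$ and $b_{2(i,t)}$ values that are later assembled into the data matrix X.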
We used the regression coefficients described in Equation 14 to iden-
tify how the dimensions of neural population signals were composed
from the probability and magnitude as aggregated properties of individ-
ual neural activity. This step corresponds to the fundamental conceptual
step of viewing the regression coefficients as a temporal structure of neu-
ral modulation by probability and magnitude at the population level.
Our procedures are analogous to the state space analysis performed by
Mante et al. (2013), in which the regression coefficients were used to
provide an axis (or dimension) of the variables of interest in multidi-
mensional state space obtained by principal component analysis (PCA).
In the present study, our orthogonalized task design allowed us to reli-
ably project neural firing rates into the regression subspace. Note that
our analyses were not aimed at describing the population dynamics of
neural signals as a trajectory in the multidimensional task space, which
is the standard goal of state space analysis.
Principal component analysis
We used PCA to identify dimensions of the neural population signal
in the orthogonal spaces composed of the probability and magnitude
of rewards in each of the four neural populations. In each neural
population, we first prepared a two-dimensional data matrix X of size
$N_{(neuron)} \times N_{(C \times T)}$; the regression coefficient vectors
$b_{1(i,t)}$ and $b_{2(i,t)}$ in Equation 14, whose rows correspond to the
total number of neurons in each neural population and columns correspond
to C, the total number of conditions (i.e., two: probability and magnitude),
and T is the total number of analysis windows (i.e., 2.7 s divided by the
window size). A series of eigenvectors was obtained by applying PCA
once to the data matrix X in each of the four neural populations. The
principal components (PCs) of this data matrix are vectors $v_{(a)}$ of
length $N_{(neuron)}$, the total number of recorded neurons, if
$N_{(C \times T)}$ is > $N_{(neuron)}$; otherwise, the length is
$N_{(C \times T)}$. PCs were indexed from the principal components
explaining the most variance to the least variance. The eigenvectors
were obtained using the prcomp function in
R software. It must be noted that we did not perform denoising in the
PCA (Mante et al., 2013), since we did not aim to project firing rates
into state space. Instead, we intended to use the PCs to identify the
main features of neural modulation signals at the population level
through task trials.
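One possible reading of this step is sketched below. It assumes that rows of X are neurons, that the first T columns hold $b_1$ (probability) and the next T columns hold $b_2$ (magnitude), and that columns are mean-centered without scaling, as prcomp does by default. Power iteration is swapped in for prcomp here only to keep the example dependency-free, and the names `first_pc_loadings` and `loadings_to_time_series` are hypothetical.

```python
import math


def first_pc_loadings(X, n_iter=500):
    """Leading principal axis of X (rows: neurons, columns: condition x time).

    Columns are mean-centered and the leading eigenvector of the column
    covariance matrix is approximated by power iteration. The iteration
    assumes the start vector is not orthogonal to that eigenvector.
    """
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    Z = [[row[j] - means[j] for j in range(d)] for row in X]
    cov = [[sum(z[i] * z[j] for z in Z) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def loadings_to_time_series(v, n_windows):
    """Split a loading vector into one (probability, magnitude) pair per window.

    Assumes columns 0..T-1 hold b1 and columns T..2T-1 hold b2.
    """
    return [(v[t], v[n_windows + t]) for t in range(n_windows)]
```

Under these assumptions, each pair returned by `loadings_to_time_series` plays the role of the eigenvector at one analysis window in the space of probability and magnitude.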
Eigenvectors
When we applied PCA to the data matrix X, we could deconstruct the
matrix into eigenvectors and eigenvalues. The eigenvectors and eigenval-
ues exist as pairs with every eigenvector having a corresponding eigen-
value. In our analysis, the eigenvectors at time t represent a vector in
the space of probability and magnitude. The eigenvalues at time t for the
probability and magnitude were scalars, indicating the extent of variance
in the data in that vector. Thus, the first PC is the eigenvector with the
highest eigenvalue. We mainly analyzed eigenvectors for the first PC
(PC1) and PC2 in the following analyses. Note that we applied PCA
once to each neural population, and, thus, the total variances contained
in the data were different among the four populations.
Analysis of eigenvectors
We evaluated the characteristics of eigenvectors for PC1 and PC2 in
each of the four neural populations in terms of the vector angle, size, and
deviation in the space of probability and magnitude. The angle is the
vector angle from the horizontal axis from 0° to 360°. The size is the
length of the eigenvector. The deviation is the difference between vec-
tors. We estimated the deviation from the mean vector in each neural
population. These three characteristics of the eigenvectors were compared
among the four neural populations at p < 0.05, using the Kruskal–Wallis
test and the Wilcoxon rank-sum test with Bonferroni correction
for multiple comparisons. The vector during the first 0.1 s was excluded
from these analyses.
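The three eigenvector characteristics can be illustrated with a short sketch. The function names are hypothetical, and measuring the deviation as the Euclidean distance from the mean vector is an assumption, since the text defines it only as the difference between vectors.

```python
import math


def vector_angle(p, m):
    """Angle in degrees from the probability (horizontal) axis, 0 to 360."""
    return math.degrees(math.atan2(m, p)) % 360.0


def vector_size(p, m):
    """Length of the (probability, magnitude) vector."""
    return math.hypot(p, m)


def deviations_from_mean(vectors):
    """Euclidean distance of each (p, m) vector from the mean vector."""
    n = len(vectors)
    mean_p = sum(p for p, _ in vectors) / n
    mean_m = sum(m for _, m in vectors) / n
    return [math.hypot(p - mean_p, m - mean_m) for p, m in vectors]
```

With this convention, equal positive weights on probability and magnitude give 45° (expected value), and a positive probability weight with an equal negative magnitude weight gives 315°.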
Shuffle control for PCA
To examine the significance of population structures described by PCA,
we performed two shuffle controls. When we projected the neural activ-
ity into the regression subspace, data were randomized by shuffling in
two ways. In shuffled condition 1, $b_{1(i,t)}$ and $b_{2(i,t)}$ in Equation 14
were estimated with the randomly shuffled allocation of trial number k to the
$\mathrm{Probability}_{(k)}$ and $\mathrm{Magnitude}_{(k)}$ only once for all
time t in each neuron. This shuffle provided a data matrix X of size
$N_{(neuron)} \times N_{(C \times T)}$, eliminating the modulation of
probability and magnitude observed in condition C, but retaining the
temporal structure of these modulations across time. In shuffled
condition 2, $b_{1(i,t)}$ and $b_{2(i,t)}$ in Equation 14 were estimated
with the randomly shuffled allocation of trial number k to the
$\mathrm{Probability}_{(k)}$ and $\mathrm{Magnitude}_{(k)}$ at each time t in
each neuron. This shuffle provided a data matrix X of size
$N_{(neuron)} \times N_{(C \times T)}$, eliminating the structure across
conditions and times. In these two shuffle controls, matrix X was
estimated 1000 times. PCA performance was evaluated by
constructing distributions of the explained variances for PC1 to PC4.
The statistical significance of the variances explained by PC1 and PC2
was estimated based on bootstrap SEs (i.e., SD of the reconstructed
distribution).
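The difference between the two shuffle controls can be sketched as follows. The function `shuffled_allocations` is a hypothetical helper that only generates the trial permutations for one neuron; refitting Equation 14 with the permuted Probability and Magnitude labels is left out.

```python
import random


def shuffled_allocations(n_trials, n_windows, condition, seed=0):
    """Trial-index permutations for one neuron, one per analysis window.

    condition 1: a single permutation reused for every window, so the
        temporal structure across windows is retained.
    condition 2: an independent permutation for every window.
    """
    rng = random.Random(seed)
    if condition == 1:
        perm = list(range(n_trials))
        rng.shuffle(perm)
        return [perm[:] for _ in range(n_windows)]
    if condition == 2:
        perms = []
        for _ in range(n_windows):
            perm = list(range(n_trials))
            rng.shuffle(perm)
            perms.append(perm)
        return perms
    raise ValueError("condition must be 1 or 2")
```

In a full control, this would be repeated for each neuron and for each of the 1000 shuffled matrices, with a different seed each time.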
Bootstrap resampling for onset and peak latencies
To detect the onset and peak latencies of population signals, we analyzed
dynamic changes in the population structure with the size of eigenvector
in each neural population. We used a time series of eigenvectors in 0.02
s analysis windows and estimated the sizes of the time series of vectors
for PC1. To obtain smooth changes in the vector size, a cubic spline
function was applied with a resolution of 0.005 s. Vector sizes during a
0.3 s baseline period were obtained by applying PCA to the matrix data
X with time t from 0.3 s before cue onset to the onset of feedback (i.e.,
3.0 s time period). An SD of vector sizes during the 0.3 s baseline period
before cue onset was obtained for each neural population. The onset latency
of the population signal was defined as the time when the spline
curve was >3 SDs of the vector sizes during the baseline period. The peak latency of the
population signal was defined as the time from cue onset to the time
when the maximum vector size was obtained.
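A minimal sketch of the latency definitions is shown below. It omits the cubic spline smoothing, assumes the threshold is 3 times the baseline SD rather than the baseline mean plus 3 SDs, and uses the hypothetical name `onset_and_peak_latency`.

```python
import statistics


def onset_and_peak_latency(times, sizes, baseline_sizes, n_sd=3.0):
    """Onset and peak latency of a PC1 vector-size time series.

    times: seconds from cue onset; sizes: vector sizes at those times;
    baseline_sizes: vector sizes from the pre-cue baseline period.
    Onset is the first time the size exceeds n_sd baseline SDs (None if
    it never does); peak is the time of the maximum size.
    """
    threshold = n_sd * statistics.stdev(baseline_sizes)
    onset = next((t for t, s in zip(times, sizes) if s > threshold), None)
    peak = max(zip(times, sizes), key=lambda ts: ts[1])[0]
    return onset, peak
```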
We estimated mean latencies of the onset and peak using a paramet-
ric bootstrap resampling method (Efron and Tibshirani, 1993). In each
neural population, the neurons were randomly resampled with a duplicate,
and a data matrix X of size $N_{(neuron)} \times N_{(C \times T)}$ was
obtained. The
PCA was applied to the data matrix X. The time series of eigenvectors
was obtained, and their sizes were estimated. The onset and peak laten-
cies were estimated as above. This resampling was conducted 1000 times,
and distributions of the onset and peak latencies were obtained. The sta-
tistical significance of the onset and peak latencies was estimated based
on the bootstrap SEs (i.e., SD of the reconstructed distribution).
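The resampling loop can be sketched generically. Here `statistic` is a placeholder for the whole pipeline described above (building X from the resampled neurons, applying PCA, and extracting a latency); the function name `bootstrap_se` and its arguments are hypothetical.

```python
import random
import statistics


def bootstrap_se(neurons, statistic, n_boot=1000, seed=0):
    """Bootstrap SE of a statistic over neurons resampled with replacement.

    Returns the SD of the bootstrap distribution and the distribution
    itself (one estimate per resample).
    """
    rng = random.Random(seed)
    n = len(neurons)
    estimates = []
    for _ in range(n_boot):
        sample = [neurons[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(sample))
    return statistics.stdev(estimates), estimates
```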
Neural population structure with expected value subspace
To include the expected value (i.e., multiplicative integration) directly
into the state space analysis, we used the following regression model,
which described the average firing rates $F_{(i,t,k)}$ of neuron i at time t as the
expected value on trial k in each neural population, as follows:
$F_{(i,t,k)} = b_{0(i,t)} + b_{3(i,t)}\,\mathrm{Expected\ value}_{(k)}$. (15)
We prepared a two-dimensional data matrix X of size
$N_{(neuron)} \times N_{(C \times T)}$ under three conditions (probability,
magnitude, and expected value); the regression coefficient vectors
$b_{1(i,t)}$ and $b_{2(i,t)}$ in Equation 14, and $b_{3(i,t)}$ in
Equation 15. We applied PCA to the data matrix X in each
neural population. Note that Equation 15 explains some of the same
variance as the neural modulation defined in Equation 14 for each neuron,
but it was used separately from Equation 14 to project neural activity
into the expected value subspace.
Results
Task and behavior in monkeys
Based on the vast literature on human behavioral economics and
by harnessing the well developed visual and cognitive abilities in
nonhuman primates, we designed a behavioral task in which
monkeys estimated the expected values of rewards from numeri-
cal symbols, mimicking events performed by humans. The task
involved a visual pie chart that included two numerical symbols
associated with the probability and magnitude of fluid rewards
with great precision. After monkeys fixated a central gray target,
a visual pie chart comprising green and blue pie segments was
presented (Fig. 1A). The number of green pie segments indicated
the magnitude of fluid rewards in 0.1 ml increments (0.1–1.0
ml). Simultaneously, the number of blue pie segments indicated
the probability of reward in 0.1 increments (0.1–1.0, where 1.0
indicates a 100% chance). After a 2.5 s delay, the visual pie chart
disappeared, and a reward outcome was provided to the monkeys
with the indicated amount and probability of reward; otherwise,
no reward was given. Under this experimental condition, the
expected values of rewards are defined as the probability multi-
plied by the magnitude cued by the numerical symbols.
To examine whether the monkeys accurately perceived the
expected values from the numerical symbols for probability and
magnitude, we applied a choice task to the monkeys (Fig. 1B).
Analysis of the aggregated choice data indicated that the two
monkeys exhibited nearly efficient performance in selecting a
larger expected value option among two alternatives during
choice trials (Fig. 1C). We examined which of the following three
behavioral models best described the behavior of the monkey, as
follows: M1, monkeys make choices based on the number of pie
segments; M2, monkeys make choices based on the probability
and magnitude; and M3, monkeys make choices based on the
expected value. Comparisons of the model performances based
on AIC and BIC (Burnham and Anderson, 2004) revealed that
model three best explained the behavior of the monkey, as indi-
cated by the smallest AIC and BIC values (monkey SUN: AIC:
M1, 27,105; M2, 26,895; M3, 21,539; BIC: M1, 27,131; M2, 26,939;
M3, 21,565; monkey FU: AIC: M1, 10,980; M2, 10,889; M3, 9166;
BIC: M1, 11,003; M2, 10,929; M3, 9190). Model three consistently
showed the highest pseudo-$r^2$ values in each monkey (Fig. 1D).
These results indicate that monkeys used the expected values esti-
mated from the numerical symbols for probability and magnitude.
We also evaluated the choice behaviors of the monkey by ana-
lyzing the percentage of choices among two lottery options ses-
sion by session. Each monkey showed a certain variance in the
percent choices over sessions (Fig. 1E, gray), although choices in
each monkey were clearly dependent on the expected value dif-
ference between the two options, without a clear choice-side bias
on average (Fig. 1E, black). In contrast, reaction times to choose
the target option showed a choice-side bias without a consistent
dependence on the expected value differences between the two
monkeys (Fig. 1F). Monkey SUN showed longer reaction times
when the expected values of the left-side options were larger
than those of right-side options, while monkey FU showed
longer reaction times when the expected values of the right-side
options were larger (Kruskal–Wallis test; monkey SUN: n = 44,883,
p < 0.001, H = 4000, df = 6; monkey FU: n = 19,292,
p < 0.001, H = 1710, df = 6). These results indicate that the
behavior of the monkeys depended to a certain extent on the
expected value difference.
Neural population data
We constructed four pseudo-populations of neurons by record-
ing single-neuron activity during the single-cue task (Fig. 1A)
from the DS (194 neurons), VS (144 neurons), cOFC (190 neu-
rons), and mOFC (158 neurons; Fig. 1G). The four constructed
neural populations exhibited changes in their activities at different
times in the task trials (Fig. 1H). Approximately 40–50% of
neurons in the four neural populations demonstrated peak activity
during cue presentation (Fig. 1I; $\chi^2$ test: n = 686, p = 0.32,
$\chi^2$ = 3.55, df = 3), with several basic firing properties (Fig. 1J–M).
Strong peak activities with short latency were observed in the
cOFC (Kruskal–Wallis test; latency: Fig. 1J, n = 314, p = 0.013,
H = 10.9, df = 3; peak firing rate: Fig. 1K, n = 314, p < 0.001,
H = 32.1, df = 3). Activity changes were slow in the mOFC (Fig.
1L; Kruskal–Wallis test; n = 314, p = 0.003, H = 13.4, df = 3).
Baseline firing rates were the highest in the cOFC (Fig. 1M;
Kruskal–Wallis test; n = 686, p < 0.001, H = 60.3, df = 3). In short,
strong activity with short latency frequently occurred in the
cOFC in contrast to the phasic activity at various latencies in the
DS and VS and the relatively tonic and gradual activity changes
in the mOFC.
Conventional analyses for detecting expected value signals
We first applied common conventional analyses (linear reg-
ression, AIC-based model selection, and BIC-based model
selection) to the four neural populations to examine neural
modulations by probability, magnitude, and expected value at
Figure 1. Task, behavior, and basic firing properties of neurons. A, Sequence of events during the single-cue task. A single visual pie chart having green and blue pie segments was presented to the monkeys. B, Choice task. Two visually displayed pie charts were presented to the monkeys on the left and right sides of the center. After visual fixation of the reappeared in the central area of the target, the central fixation target disappeared, and monkeys chose either of the targets by fixating on it. A block of the choice trials was sometimes interleaved between the single-cue trial blocks. During the choice trials, neural activity was not recorded. C, Percentages of right target choice during the choice task plotted against the EVs of the left and right target options. Aggregated choice data were used. D, Pseudo-$r^2$ estimated in the three behavioral models: M1, number of pie segments; M2, probability and magnitude; M3, expected values. E, Percentage of right target choices estimated in each recording session (gray lines) plotted against the difference in expected values (right minus left). The choice data were segmented by seven conditions of the difference in the expected values, as follows: −1.0 to −0.5, −0.5 to −0.3, −0.3 to −0.1, −0.1 to 0.1, 0.1 to 0.3, 0.3 to 0.5, and 0.5 to 1.0. Black plots indicate the mean. F, Reaction time to choose a target option plotted against the difference in expected values (right minus left) as −1.0 to −0.5, −0.5 to −0.3, −0.3 to −0.1, −0.1 to 0.1, 0.1 to 0.3, 0.3 to 0.5, and 0.5 to 1.0. G, An illustration of neural recording areas based on sagittal MR images. Neurons were recorded from the mOFC (14O, orbital part of area 14) and cOFC (area 13M) at the A31–A34 anterior–posterior (A–P) level. Neurons were also recorded from the DS and VS, respectively, at the A21–A27 level. White scale bar, 5 mm. H, Color map histograms of neuronal activities recorded from the four brain regions. Each horizontal line indicates neural activity aligned to cue onset averaged for all lottery conditions. Neuronal firing rates were normalized to the peak activity. I, Percentages of neurons showing an activity peak during cue presentation. J, Box plots of peak activity latency after cue presentation. K, Firing rates of peak activity observed during cue presentation. L, Box plots of half-peak width, indicating the phasic nature of activity changes. M, Box plots of baseline firing rates during the 1 s time period before the onset of the central fixation target. In J–M, asterisks indicate statistical significance among two neural populations using the Wilcoxon rank-sum test with Bonferroni correction for multiple comparisons [statistical significance: **p < 0.01, *p < 0.05, and §0.05 < p < 0.06 (close to significance), respectively].
a single-neuron level (see Materials and Methods). During a
fixed 1 s time window after cue onset, these analyses showed
that neurons in all four brain regions signal probability, magnitude,
and expected value to some extent (Fig. 2). For example,
neurons signaling expected value were found in each
brain region (Fig. 2A–H). In addition, neurons signaling probability
or magnitude were also observed in each brain region
(Fig. 2I–L, blue and green). Moreover, a subset of neurons in
the cOFC and VS signaled high risk, high return or low risk,
low return (Fig. 3). These neurons were characterized by a
strong activity, which was elicited when the cue indicated low
probability and large magnitude (hence, high risk, high return;
Fig. 2J,K, brown). Indeed, each neural population was composed
of a mixture of these signals (Fig. 2I–L), indicating that
signals for the expected value and its components (i.e., probability
and magnitude) appeared in each neural population during
1 s after cue onset. Note that the classification of neural
modulation types was dependent on the analysis methods;
however, the overall tendency for differences in neural
modulations among neural populations was consistent among
all three analyses.
We analyzed these neural modulation patterns through a task
trial using these conventional analyses (Fig. 2M–P). We found
no significant difference in the proportions of neural modulation
types in the 0.1 s analysis window, except for the VS ($\chi^2$ test: DS:
n = 104, df = 75, $\chi^2$ = 91.4, p = 0.096; VS: n = 104, df = 75,
$\chi^2$ = 98.2, p = 0.037; cOFC: n = 104, df = 75, $\chi^2$ = 83.2,
p = 0.242; mOFC: n = 104, df = 75, $\chi^2$ = 79.0, p = 0.353). Using a
finer time resolution, a $10^{-2}$ s time resolution (0.02 s), the detected
neural modulations were proportionally very small because signal-to-noise
ratios generally decrease with the window size. These
observations suggested that conventional analyses provided neu-
ral modulation patterns similar to those of previous studies, but
they did not clearly provide evidence of temporal dynamics in
the modulation patterns of neural populations. Thus, we devel-
oped an analytic tool to examine how the detection and integra-
tion of probability and magnitude are processed within these
neural population ensembles.
Figure 2. Expected value signals detected by conventional analyses. A, Example activity histogram of a DS neuron modulated by expected value during the single-cue task. The activity aligned to the cue onset is represented for three different levels of probability (0.1–0.3, 0.4–0.7, and 0.8–1.0) and magnitude (0.1–0.3, 0.4–0.7, and 0.8–1.0 ml) of rewards. Gray hatched time windows indicate the 1 s time window used to estimate the neural firing rates shown in B. The neural modulation pattern was defined as the expected value type based on all three analyses (linear regression, AIC-based model selection, and BIC-based model selection). Regression coefficients for probability and magnitude were 6.17 (p < 0.001) and 2.54 (p = 0.007), respectively. B, An activity plot of the DS neuron during the 1 s time window shown in A against the probability and magnitude of rewards. C, D, Same as A and B, but for a VS neuron defined as the expected value type based on all three analyses. Regression coefficients for probability and magnitude were 7.14 (p < 0.001) and 6.71 (p < 0.001), respectively. E, F, Same as A and B, but for a cOFC neuron defined as the expected value type based on all three analyses. Regression coefficients for probability and magnitude were 8.55 (p < 0.001) and 11.1 (p < 0.001), respectively. G, H, Same as A and B, but for an mOFC neuron. The neural modulation pattern was defined as the expected value type based on the AIC-based model selection, as the probability type based on the linear regression, and as the nonmodulated type based on the BIC-based model selection. Regression coefficients for probability and magnitude were 1.76 (p = 0.032) and 0.50 (p = 0.54), respectively. I–L, Plots of regression coefficients for the probability and magnitude of rewards estimated for all neurons in the DS (I), VS (J), cOFC (K), and mOFC (L). Filled colors indicate the neural modulation pattern classified by the BIC-based model selection. P, Probability type; M, Magnitude type; EV, Expected value type; R-R, Risk-Return type. The nonmodulated type is indicated by the small open circle. M–P, Percentages of neural modulation types based on BIC-based model selection through cue presentation in the DS (M), VS (N), cOFC (O), and mOFC (P). The analysis window size is 0.1 s (left), 0.05 s (middle), and 0.02 s (right), respectively.
State space analysis for detecting neural population
dynamics
State space analysis can provide temporal dynamics of neural
population signal related to cognitive and motor performances
(Churchland et al., 2012; Mante et al., 2013). In our lottery task,
such population dynamics can describe how expected values
evolved within neural population ensembles. To describe how
each neural population detects and integrates probability and
magnitude into the expected value, we represented each neural
population signal as a vector time series in the space of probabil-
ity and magnitude in two steps. First, we used linear regression
to project a time series of each neural activity into a regression
subspace composed of the probability and magnitude in each
neural population. This step captures the across-trial variance
caused by the probability and magnitude moment by moment at
the population level. Second, we applied PCA to the time series
of neural activities in the regression subspace in each neural pop-
ulation. This step determines the main feature of the neural pop-
ulation signal moment by moment in the space of probability
and magnitude. Because activations are dynamic and change
over time, the analysis identified whether and how signal trans-
formations occurred to convert probability and magnitude into
the expected value as a time series of eigenvectors (Fig. 4A). The
directions of these eigenvectors capture the expected values as an
angle moment by moment at the population level (Fig. 4B).
We evaluated eigenvector properties for PC1 and PC2 in each
neural population in terms of vector angle, size, and deviation
(Fig. 4C). A stable population signal is described as a small varia-
tion in eigenvector properties throughout a trial, whereas an
unstable population signal is described as a large variation in
eigenvector properties. It must be noted that our procedure is a
variant of the state space analysis in line with the use of linear
regression to identify dimensions of a neural population signal
(Mante et al., 2013; Chen and Stuphorn, 2015); however, it was
not aimed at projecting the population activity as trajectories in
multidimensional space.
Stable and unstable neural population signals
The eigenvector analyses yielded clear differences in neural population
signals among the four populations (Fig. 5A–D). We first
confirmed adequate performance of the state space analysis indi-
cated by the percentages of variance
explained in each population (Fig. 5A).
The VS population exhibited the highest
performance among the four neural popu-
lations, followed by the cOFC and DS pop-
ulations, with the lowest performance
exhibited by the mOFC population. Thus,
the performance to process probability
and magnitude information was distinct
among the four neural populations.
To characterize the whole structure of
each neural population signal, we ana-
lyzed the aggregated properties of the
eigenvectors without their temporal order
through a task trial. We first examined
eigenvector properties for PC1. The
aggregated eigenvectors revealed both sta-
ble and unstable neural population signals
during cue presentation (Fig. 5B, green).
The VS population exhibited the highest
performance (37%) with eigenvectors for
PC1 being stable throughout cue presentation, and directions close to 45°, that is,
the expected value (Fig. 5B: VS, vector angle, PC1; mean ± SEM,
37.5° ± 0.98, 7.5° difference from 45°). The cOFC population
also exhibited a stable expected value signal with the second-best
performance (31%), but they deviated more from the
ideal expected value signal (Fig. 5B: cOFC, vector angle, PC1;
mean ± SEM, 59.4° ± 1.16, 14.4° difference from 45°;
Wilcoxon rank-sum test, n = 52, W = 122, p < 0.001). Vector
stability was the best in the VS and cOFC, as indicated by the
smallest deviation from its mean vector among the four neural
populations (Fig. 5C, left, PC1). Thus, VS and cOFC popula-
tions signaled expected values in a stable manner.
Figure 3. Risk-return signals detected by conventional analyses. A, Example activity histogram of a VS neuron modulated by both probability and magnitude of rewards with opposite signs (i.e., negative $b_p$ and positive $b_m$). The activity aligned to cue onset is represented for three different levels of probability (0.1–0.3, 0.4–0.7, and 0.8–1.0) and magnitude (0.1–0.3, 0.4–0.7, and 0.8–1.0 ml) of rewards. Gray hatched areas indicate a 1 s time window to estimate the neural firing rates shown in B. The neural modulation pattern was defined as the risk–return type based on the linear regression and AIC-based model selection, and as the magnitude type based on the BIC-based model selection. Regression coefficients were −2.44 (p = 0.039) and 4.86 (p < 0.001) for probability and magnitude, respectively. B, Activity plots of the VS neuron during the 1 s time window shown in A against the probability and magnitude of rewards. C, D, Same as A and B, but for a cOFC neuron. The neural modulation type was defined as the risk–return type based on all three analyses. Regression coefficients for probability and magnitude were 6.65 (p < 0.001) and 3.82 (p < 0.001), respectively.
Figure 4. Schematic depictions for the analysis of neural population dynamics using PCA. A, Time series of a neural popu-
lation activity projected into a regression subspace composed of probability and magnitude. A series of eigenvectors was
obtained by applying PCA once to each of the four neural populations. PC1 and PC2 indicate the first and second principal
components, respectively. The number of eigenvectors obtained by PCA was 2.7 s divided by the analysis window size for
the probability and magnitude: 27, 54, and 135 eigenvectors in a 0.1, 0.05, or 0.02 s time window, respectively. B, Examples
of eigenvectors at time of ith analysis window for probability and magnitude, whose direction indicates a signal characteristic
at the time represented on the population ensemble activity. EV, 45°, 225°; M, magnitude (90°, 270°); P, probability
(0°,180°); R-R, 135°, 315°. C, Characteristics of the eigenvectors evaluated quantitatively. Angle, Vector angle from the hori-
zontal axis taken from 0° to 360°. Size, Eigenvector length; deviation, difference between vectors.
In contrast, unstable population signals were observed in the
DS and mOFC (Fig. 5B, green). The DS population showed con-
siderable variability in its eigenvectors (Fig. 5C, left, PC1) com-
pared with those in the VS and cOFC neural populations. The
signal carried by the DS neural population was close to 0°, that
is, the probability (Fig. 5B: DS, vector angle, PC1; mean ±
SEM, DS, 11.4° ± 1.72) with a performance closer to that of
the cOFC (29%). The mOFC population exhibited a large variability
in the eigenvectors (Fig. 5B: mOFC, PC1, vector angle;
mean ± SEM, 38.1° ± 5.80; Fig. 5C, left, PC1) because of the
poorest performance of PCA (14%), indicating a weak and fluc-
tuating population signal. Thus, neural populations in the DS
and mOFC did not signal expected value through cue presenta-
tion because of the dynamic changes and weakness of the sig-
nals, respectively.
Second, we examined eigenvector properties for PC2. The
eigenvectors for PC2 revealed another feature of neural population
signal, which reflected risk–return in the VS and cOFC (Fig.
5B, blue; vector angle, PC2; mean ± SEM, VS, 306.7° ± 1.07,
8.3° difference from 315°; cOFC, 322.4° ± 1.94, 7.4° difference
from 315°). The deviations from the ideal risk–return signal were
not significantly different between the VS and cOFC populations
(Wilcoxon rank-sum test, n = 52, W = 319, p = 0.737). These signals
were equally stable in the VS and cOFC (Fig. 5C, right,
PC2). In clear contrast, DS and mOFC signals were unstable and
fluctuated more (Fig. 5C, right, vector angle, PC2; mean ± SEM,
DS, 64.8 ± 19.0; mOFC, 320.2 ± 8.77), similar to those observed
for PC1 (Fig. 5C, left, PC1). Thus, the VS and cOFC were key
brain regions to signal risk–return as well as expected value
within their neural population ensembles, suggesting that
integrated information of the probability and magnitude could
be signaled in these neural populations.
To further examine the significance of these findings, we used
a shuffle control procedure in two ways (see Materials and
Methods). First, we randomly shuffled the allocation of probabil-
ity and magnitude conditions to neural activity in each trial for
each neuron (shuffled condition 1). When we shuffled the linear
projection of neural activity into the regression subspace in this
way, the neural population structure disappeared in all four brain
regions (Fig. 5F). PCA performances for PC1 and PC2 were all
< 20% (Fig. 5E) and were significantly reduced from the
observed data in all four brain regions, even in the mOFC (Fig.
6A; explained variance, p < 0.001 for all populations in PC1 and
PC2). In addition, because of the shuffle, vector angles for PC1
and PC2 were changed compared with those from the original
data (Fig. 5B,F). Eigenvector deviations under the shuffle control
increased in most cases for PC1 (Fig. 5G; Wilcoxon rank-sum
test, n = 52; PC1, DS, W = 237, p = 0.027; VS, W = 191, p = 0.002;
cOFC, W = 132, p < 0.001; mOFC, W = 262, p = 0.078; PC2, DS,
W = 352, p = 0.837; VS, W = 104, p < 0.001; cOFC, W = 331,
p = 0.571; mOFC, W = 189, p = 0.002), with significant differences
among the four neural populations (Fig. 5G; Kruskal–Wallis test;
PC1, n = 104, df = 3, H = 16.4, p < 0.001; PC2, n = 104, df = 3,
H = 21.4, p < 0.001). This might have occurred because the temporal
structure of neural modulation was maintained through a
trial in this shuffled condition 1.
We also tested another shuffle control, in which the trial con-
ditions were shuffled in each analysis window throughout a trial
(shuffled condition 2). Under this full-shuffle control, PCA performances
decreased further, albeit slightly (Figs. 5I, 6B), without
Figure 5. Neural populations provide stable expected value signals in the VS and cOFC. A, Cumulative variance explained by PCA in the four neural populations. Dashed line indicates percentages of variances explained by PC1 and PC2 in each neural population. B, Overlay plots of series of eigenvectors for PC1 and PC2 in the four neural populations. a.u., Arbitrary unit. C, Box plots of vector deviation from the mean vector estimated in each neural population for PC1 (left) and PC2 (right). D, Box plots of vector size estimated in each neural population for PC1 (left) and PC2 (right). E–H, Same as A–D, but for the PCA under the shuffled condition 1. See Materials and Methods for details. I–L, Same as A–D, but for the PCA under the shuffled condition 2. In C, D, G, H, K, and L, asterisks indicate statistical significance between two populations using the Wilcoxon rank-sum test with Bonferroni correction for multiple comparisons [statistical significance at **p < 0.01, *p < 0.05, and §0.05 < p < 0.06 (close to significance), respectively]. The results are shown by using a 0.1 s analysis window.
1692 J. Neurosci., February 24, 2021 41(8):16841698 Yamada et al. ·Neural Dynamics for Expected Value Computation
significant differences among the four populations (Fig. 5J,K;
Deviation, Kruskal–Wallis test; PC1, n = 104, df = 3, H = 1.38,
p = 0.71; PC2, n = 104, df = 3, H = 0.53, p = 0.91). Vector deviations
in this full-shuffle control were clearly larger than those in
the original data without shuffle (Wilcoxon rank-sum test,
n = 52; PC1, DS, W = 205, p = 0.005; VS, W = 112, p < 0.001;
cOFC, W = 65, p < 0.001; mOFC, W = 177, p < 0.001; PC2, DS,
W = 310, p = 0.353; VS, W = 117, p < 0.001; cOFC, W = 135,
p < 0.001; mOFC, W = 238, p = 0.028). In this full-shuffle control,
eigenvectors were directed in various directions compared
with those in the shuffled condition 1 (Fig. 5F,J). Thus, these
shuffle procedures appropriately evaluated the significance of
our population findings.
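The two shuffle controls differ only in whether a single permutation of trial conditions is shared across analysis windows (shuffled condition 1) or drawn anew in each window (shuffled condition 2). The following Python sketch illustrates that difference for one synthetic neuron. The function names, the synthetic firing rates, and the per-window least-squares regression are assumptions made for illustration; they are not the procedure specified in Materials and Methods.

```python
import numpy as np

def window_slopes(rates, conditions):
    """Least-squares slopes for probability and magnitude in each window.

    rates: (n_trials, n_windows) firing rates of one neuron.
    conditions: (n_trials, 2) labels shared by all windows, or
                (n_windows, n_trials, 2) labels given per window.
    """
    n_trials, n_windows = rates.shape
    slopes = np.empty((n_windows, 2))
    for w in range(n_windows):
        cond = conditions if conditions.ndim == 2 else conditions[w]
        design = np.column_stack([np.ones(n_trials), cond])
        beta, *_ = np.linalg.lstsq(design, rates[:, w], rcond=None)
        slopes[w] = beta[1:]  # drop the intercept
    return slopes

def shuffle_condition_1(conditions, rng):
    """One permutation of trial labels, reused in every window."""
    return conditions[rng.permutation(len(conditions))]

def shuffle_condition_2(conditions, n_windows, rng):
    """An independent permutation of trial labels in each window."""
    return np.stack([conditions[rng.permutation(len(conditions))]
                     for _ in range(n_windows)])

# Synthetic neuron (assumed): rate = 2 * probability + 1 * magnitude + noise.
rng = np.random.default_rng(0)
levels = np.arange(1, 11) / 10.0
prob, mag = np.meshgrid(levels, levels, indexing="ij")
conditions = np.tile(np.column_stack([prob.ravel(), mag.ravel()]), (3, 1))
n_windows = 20
signal = 2.0 * conditions[:, 0] + 1.0 * conditions[:, 1]
rates = signal[:, None] + rng.normal(0.0, 0.1, (len(conditions), n_windows))

observed = window_slopes(rates, conditions)
shuffled_1 = window_slopes(rates, shuffle_condition_1(conditions, rng))
shuffled_2 = window_slopes(rates, shuffle_condition_2(conditions, n_windows, rng))

# Spread of the slopes across windows under each shuffle.
spread_1 = shuffled_1.std(axis=0)
spread_2 = shuffled_2.std(axis=0)
```

In this toy setting, the slopes under shuffled condition 1 are spurious but nearly constant across windows, because the same mislabeling is applied throughout the trial, whereas under shuffled condition 2 they vary from window to window.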
Next, we examined whether eigenvector size differed among
the four neural populations, which represents the extent of neu-
ral modulation by probability and magnitude in each neural pop-
ulation as an arbitrary unit. The eigenvector size was not
significantly different (Fig. 5D; left, PC1, Kruskal–Wallis test,
n = 104, df = 3, H = 2.62, p = 0.45; right, PC2, n = 104, df = 3,
H = 4.76, p = 0.19), but it strongly depended on the temporal
resolution (Fig. 7). The eigenvector size decreased with the analysis
window size (Fig. 7B,E,F), although all the results and conclusions
described above were maintained across the window sizes
(Fig. 7A–D). The decrease in the eigenvector size could be
because signal-to-noise ratios generally decrease when the window
size decreases. These effects were observed as a decrease in
PCA performance (Fig. 7A) and percentages of neural modulations
in the conventional analyses (Fig. 2M–P). Note that we did
not find any significant difference in the vector size compared
with shuffle controls in each neural population (Fig. 5D,H,L;
p > 0.05 for all cases).
Collectively, these observations suggest that the
probability and magnitude of rewards could be detected and
integrated within the activity of the cOFC and VS neural populations
as stable expected value and risk–return signals,
at least among the four brain regions that have been
thought to be key components of the reward system of the brain.
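The quantities compared in this section (eigenvector size, direction, and deviation) can be made concrete with a short Python sketch. It assumes that each eigenvector is a two-dimensional vector in the probability-magnitude plane and that the angle is measured from the probability axis, so that 0° corresponds to probability, 90° to magnitude, and 45° to expected value. Treating deviation as the Euclidean distance from the mean vector is an assumption made only for illustration; the exact definition is given in Materials and Methods. All function names are hypothetical.

```python
import numpy as np

def vector_size(vectors):
    """Euclidean length of each (probability, magnitude) eigenvector."""
    return np.linalg.norm(vectors, axis=1)

def vector_angle_deg(vectors):
    """Angle from the probability axis, in degrees."""
    return np.degrees(np.arctan2(vectors[:, 1], vectors[:, 0]))

def deviation_from_mean(vectors):
    """Assumed definition: distance of each vector from the mean vector."""
    return np.linalg.norm(vectors - vectors.mean(axis=0), axis=1)

# Toy eigenvector series: columns are (probability, magnitude) components.
eigenvectors = np.array([[1.0, 1.0], [2.0, 2.0], [0.0, 1.0], [1.0, 0.0]])
sizes = vector_size(eigenvectors)
angles = vector_angle_deg(eigenvectors)
deviations = deviation_from_mean(eigenvectors)
```

Under these assumptions, a stable population signal would show small deviations and angles clustered around one value, whereas a fluctuating signal would show large deviations regardless of vector size.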
Temporal structure of neural population signals
Although stable signals were observed in the cOFC and VS neu-
ral populations above, the extent of neural modulations changed
throughout a trial (Fig. 8). To characterize temporal aspects of
the VS and cOFC neural populations that yield expected value
signals, we first compared temporal dynamics of all four neural
population signals at the finest time resolution. Specifically, we
compared the temporal patterns of vector changes exhibited by
each neural population (Fig. 9). At the time point after cue onset
when monkeys initiated the expected value computation, all four
neural populations developed eigenvectors (Fig. 9A). The eigen-
vector size increased and then decreased within a second; how-
ever, the temporal patterns of this size change were different
among the four neural populations. The onset latencies,
detected by comparing to the vector size during the baseline pe-
riod, seemed to be coincident for the cOFC, VS, and DS popu-
lations, followed by a late noisy signal in the mOFC (Fig. 9B).
In contrast, the detected peak of vector size for each neural pop-
ulation seemed to appear at different times. To statistically
examine these temporal dynamics at the population level, we
used a bootstrap resampling technique (see Materials and
Methods).
The analysis revealed no significant difference in onset
latencies among the cOFC, VS, and DS populations (Fig. 9C;
bootstrap resampling, onset latency, mean ± SD; cOFC,
107.1 ± 26.0 ms; VS, 138.7 ± 61.3 ms; DS, 155.0 ± 52.4 ms),
while these signals were followed by a late noisy signal in the
mOFC (287 ± 98.8 ms). In contrast, when we compared peak
latencies (Fig. 9D), the cOFC exhibited the earliest peak
(292 ± 37.5 ms), followed by the DS (371 ± 43.0 ms), the
mOFC (444 ± 113.5 ms), and the VS (508 ± 76.7 ms), which
exhibited the latest peak. Thus, the expected value signal
sharply developed in the cOFC in contrast to the gradual devel-
opment in the VS. mOFC signals were very noisy, as indicated
by the large variation in the vector size during the baseline pe-
riod (Fig. 9B, bottom; see horizontal line).
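The latency analysis can be sketched in Python as follows. Following the Figure 9B legend, onset is taken as the first post-cue time at which the vector size exceeds a threshold derived from the 0.3 s baseline, evaluated at a 0.005 s resolution. Several details are assumptions for illustration only: the threshold is read as baseline mean + 3 SD, linear interpolation (`np.interp`) is swapped in for the cubic spline, the population vector size is replaced by a simple mean across synthetic neurons rather than a PCA-derived eigenvector size, and the bootstrap resamples neurons with replacement. Function names are hypothetical.

```python
import numpy as np

def detect_latencies(times, size, n_sd=3.0, step=0.005):
    """Return (onset, peak) latency in seconds after cue onset (t = 0).

    Onset: first post-cue time at which the interpolated size exceeds
    baseline mean + n_sd * baseline SD (baseline = samples with t < 0).
    Peak: post-cue time of the maximum interpolated size.
    """
    baseline = size[times < 0]
    threshold = baseline.mean() + n_sd * baseline.std(ddof=1)
    fine_t = np.arange(0.0, times[-1] + step / 2, step)
    fine_size = np.interp(fine_t, times, size)  # linear, not cubic spline
    above = np.flatnonzero(fine_size > threshold)
    onset = fine_t[above[0]] if above.size else np.nan
    peak = fine_t[np.argmax(fine_size)]
    return onset, peak

def bootstrap_latencies(per_neuron, times, n_boot, rng):
    """Resample neurons with replacement and re-detect latencies each time."""
    n_neurons = per_neuron.shape[0]
    out = np.empty((n_boot, 2))
    for b in range(n_boot):
        idx = rng.integers(0, n_neurons, size=n_neurons)
        # Stand-in for the population vector size: mean across neurons.
        out[b] = detect_latencies(times, per_neuron[idx].mean(axis=0))
    return out

# Synthetic data (assumed): 0.02 s windows, 0.3 s baseline, peak at 0.3 s.
rng = np.random.default_rng(1)
times = np.arange(-15, 50) * 0.02
template = 1.0 + 2.0 * np.exp(-((times - 0.3) / 0.1) ** 2)
per_neuron = template + rng.normal(0.0, 0.1, (40, times.size))

onset, peak = detect_latencies(times, per_neuron.mean(axis=0))
boot = bootstrap_latencies(per_neuron, times, n_boot=200, rng=rng)
onset_mean, onset_sd = boot[:, 0].mean(), boot[:, 0].std(ddof=1)
peak_mean, peak_sd = boot[:, 1].mean(), boot[:, 1].std(ddof=1)
```

The bootstrap mean and SD computed at the end correspond in form to the mean ± SD latencies reported above, although the numbers from this synthetic example have no bearing on the recorded data.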
We also examined temporal changes in vector angles,
which indicate how fast the stable expected value signals were
evoked in the cOFC and VS (Fig. 9E). As observed in the time
series of vector angles after detected onsets, signals carried by
the VS and cOFC neural populations during the early time pe-
riod were almost 45° (i.e., expected value), indicating that
these two neural populations integrate probability and magni-
tude information into expected value just after the appearance
of the numerical symbol (see intercepts of regression lines).
Moreover, these two expected value signals were not the same,
but rather were idiosyncratic in each neural population: a
gradual and slight shift of the vector angle directed to 90° (i.e.,
magnitude; cOFC, Fig. 9E; regression coefficient, r = 5.31,
n = 129, t = 6.04, df = 126, p < 0.001) or 0° (i.e., probability;
VS, Fig. 9E; r = 3.91, n = 127, t = 4.16, df = 124, p < 0.001)
was observed toward the end of cue presentation. Similar to
the VS population, the DS population showed the same
tendency in the angle shift (Fig. 9E; DS, r = 5.38, n = 127,
t = 3.31, df = 124, p = 0.001). In contrast, a significant shift
in vector angle was not observed in the mOFC population
(r = 4.30, n = 120, t = 0.94, df = 117, p = 0.351). The signals
observed in the DS and mOFC populations immediately after cue
presentation were relatively close to the expected value; however,
they quickly disappeared (Fig. 9E). These results suggest that the
neural populations in both the VS and cOFC integrate probability
and magnitude information into expected value immediately after
cue presentation, despite their temporal dynamics being
idiosyncratic for each of the two stable population signals.
Figure 6. Probability density of explained variances by PCA in shuffled controls. A, Probability density of variances explained by PCA for PC1 to PC4 under the shuffled condition 1 (for details, see Materials and Methods). The probability density was estimated with 1000 repeats of the shuffle in each neural population. B, Probability density of variance explained by PCA for PC1 to PC4 under the shuffled condition 2 (for details, see Materials and Methods). The probability density was estimated with 1000 repeats of the shuffle in each neural population. In A and B, dashed lines indicate the variances explained by PCA in each of the four neural populations without the shuffle. The results are shown by using a 0.1 s analysis window.
Figure 7. Effects of the analysis window size on the PCA. A, Cumulative variances explained by PCA in the four neural populations. Dashed lines indicate the percentages of variance explained by PC1 and PC2 in each neural population. The sizes of the analysis window are 0.1, 0.05, and 0.02 s, respectively. B, Overlay plots of series of eigenvectors in the four neural populations. Eigenvectors for PC1 and PC2 are shown. The analysis window size is 0.1, 0.05, and 0.02 s, respectively. a.u., Arbitrary units. C, Box plots of vector deviation from the mean vector estimated in each neural population are shown for the PC1. D, Same as C, but for the PC2. E, Box plots of vector size estimated in each neural population are shown for the PC1. F, Same as E, but for the PC2. In C–F, asterisks indicate statistical significance between two neural populations using Wilcoxon rank-sum test with Bonferroni correction for multiple comparisons [statistical significance at **p < 0.01, *p < 0.05, and §0.05 < p < 0.06 (close to significance), respectively].
Figure 8. Neural modulation patterns as regression coefficients in four neural populations. Plots of regression coefficients for the probability and magnitude of rewards estimated for all neurons in the DS, VS, cOFC, and mOFC. Regression coefficients when using a 0.1 s analysis window are shown every 0.5 s (0–0.1, 0.5–0.6, 1.0–1.1, 1.5–1.6, 2.0–2.1, and 2.5–2.6 s).
Neural population structure with multiplicative integration of probability and magnitude
We detected the expected value signals in the VS and cOFC as a
particular vector angle defined as a linear combination of probability
and magnitude in their regression subspace above. This orig-
inal state space analysis could not differentiate whether neu-
ral populations use linear or multiplicative integration,
although the expected values assume a multiplicative combi-
nation of probability and magnitude, mathematically. Last,
we examined whether these neural populations use multipli-
cative integration by performing an additional state space
analysis, which determines whether the original neural popu-
lation structure, represented as a linear combination of prob-
ability and magnitude, is unaffected by the existence of
multiplicative integration (see Materials and Methods).
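The distinction between linear and multiplicative integration can be illustrated with a small Python sketch. It fits a noise-free synthetic neuron whose firing rate equals probability × magnitude with two regression designs: one containing only probability and magnitude, and one that additionally contains their product as an expected value regressor. The design matrices, the 10 × 10 grid of conditions, and the function name are assumptions for illustration and are not the analysis described in Materials and Methods.

```python
import numpy as np

def r_squared(design, y):
    """Fraction of variance in y explained by a least-squares fit."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ beta
    return 1.0 - residual.var() / y.var()

# Fully crossed conditions (assumed): 10 probabilities x 10 magnitudes.
levels = np.arange(1, 11) / 10.0
prob, mag = (a.ravel() for a in np.meshgrid(levels, levels, indexing="ij"))
rate = prob * mag  # noise-free synthetic neuron coding expected value

ones = np.ones_like(prob)
linear_design = np.column_stack([ones, prob, mag])
extended_design = np.column_stack([ones, prob, mag, prob * mag])

r2_linear = r_squared(linear_design, rate)
r2_extended = r_squared(extended_design, rate)
```

In this toy case the linear design already explains about 88% of the variance of a purely multiplicative signal, and the product term accounts for the remainder. This shows why a linear regression subspace alone cannot easily distinguish the two forms of integration.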
Performance of the additional state space analysis in each
population was similar to that in the original analysis (Figs.
5A, 10A). Slight increases in explained variance were
observed for PC1 and PC2 (<10% in the cOFC and DS), suggesting
that the neural populations in the VS and cOFC may
be similarly explained by linear and multiplicative integration.
The neural population structure represented as eigenvectors
was consistently observed in the VS (Fig. 10B, left). PC1 and PC2
signaled expected value (left, green) and risk–return (left, blue),
as observed in the original analysis (Fig. 5B). Eigenvector directions
for PC2 were flipped compared with the original ones, possibly
because changes in coordinate transformation by including
the expected value subspace can affect polarity determination in
the component plane. Note that eigenvectors evolved after cue
presentation (Fig. 10B, labeled with s) and developed toward
the end of cue presentation (Fig. 10B, labeled with e), consistent
with those in the original analysis (Fig. 9A). In contrast, the
predominant eigenvectors were changed in the cOFC (Fig. 10B,
right). Eigenvectors for both PC1 and PC2 were directed to the
expected value by complementing each other (i.e., 45° and
225°), while the risk–return signal decreased from PC2 to PC3.
This may be because a considerable degree of variance
unexplained in the original analysis was added by including the
expected value into the regression subspace in the cOFC. These
results suggest that using linear or multiplicative integration
resulted in somewhat different stable neural population struc-
tures in the cOFC.
Figure 9. Gradual and sharp evolutions of neural population signals in the VS and cOFC. A, Plots of eigenvector time series for PC1 in 0.02 s analysis windows shown in a sequential order during 1 s after cue onset. Horizontal and vertical scale bars indicate the eigenvectors for probability and magnitude in arbitrary units, respectively. B, Plots of the time series of vector size during 1 s after cue onset. Horizontal dashed lines indicate 3 SDs of the mean vector size during the baseline period, a 0.3 s time period before cue onset. Solid colored lines indicate interpolated lines using a cubic spline function to provide a resolution of 0.005 s. Vertical dashed lines indicate the onset (left) and peak (right) latencies for changes in vector sizes. C, Probability densities of onset latencies for the four neural population signals. Probability densities were estimated using bootstrap resamplings. Vertical dashed lines indicate means. Horizontal solid lines indicate bootstrap SEs. D, Same as C, but for peak latencies of the four neural population signals. E, Plots of time series of vector angle from the detected onset to the onset of outcome feedback. Solid black lines indicate regression slopes. In C and D, asterisks indicate statistical significance estimated using bootstrap resamplings (statistical significance at ***p < 0.001 and *p < 0.05, respectively). In E, triple asterisks indicate a statistical significance of the regression slope at p < 0.001. Data for PC2 are not shown.
Figure 10. Neural population structures of the VS and cOFC with multiplicative integration of probability and magnitude. A, Cumulative variance explained by PCA in the four neural populations when the state space analysis was performed with the expected value included in the regression matrix. Dashed line indicates the percentage of variances explained by PC1 and PC2 in each neural population. B, Plots of time series of eigenvectors connected with lines for PC1 to PC3 in the VS and cOFC. Eigenvectors during cue presentation were presented from the beginning to the end using a 0.1 s analysis window. Plots at the beginning and end are filled in black and labeled as start (s) and end (e), respectively. a.u., Arbitrary unit.
Discussion
Extraction of neural population dynamics is a recently developing
approach for understanding computational processes implemented
in the domain of cognitive and motor processing
(Churchland et al., 2012; Mante et al., 2013; Chen and Stuphorn,
2015; Murray et al., 2017; Takei et al., 2017). This approach
provides a mechanistic structure of neural population signals
regarding temporal aspects, such as oscillatory activities during
reaching (Churchland et al., 2012), coactivation patterns of spi-
nal neurons and muscles (Takei et al., 2017), and dynamic
unfolding of task-related activity during perceptual decisions
(Mante et al., 2013). Here, we found that the VS and cOFC neu-
ral populations maintain the stable expected value signals at the
population level (Fig. 5). This is the first mechanistic demonstra-
tion of expected value signals embedded in multiple neural
populations when monkeys computed expected values from nu-
merical symbols cueing the probability and magnitude of
rewards. The temporal dynamics of these two stable neural populations
are unique in the aspect of time constants (Fig. 9B–D)
and gradual shifts of their structures (Fig. 9E). These results suggest
that cOFC and VS compute expected values as distinct, partially
overlapping processes. If monkeys are required to make an
economic choice, these expected value computations must be followed
by comparison and choice processes employed by the
same or downstream brain regions (Raghuraman and Padoa-Schioppa,
2014; Chen and Stuphorn, 2015; Zhou et al., 2019;
Yoo and Hayden, 2020).
Two idiosyncratic expected value signals in the cOFC and VS
State space analysis can detect both stable (Murray et al., 2017)
and flexible (Mante et al., 2013) neural signals at the population
level. In the present study, the expected value signals observed in
the VS and cOFC were similarly stable in terms of vector angle
fluctuation but significantly different in temporal aspects (Fig. 9).
These signal properties indicate that information processing in
these two brain regions was not the same. For example, the fast
cOFC signal may reflect the calculation of expected values from
the probability and magnitude symbols, such as mental arithme-
tic, while the slow VS signal may reflect a secondary process to
maintain the calculated expected value information. It is also pos-
sible that the fast cOFC signal may have reflected expected value
signals integrated elsewhere (e.g., the amygdala). It is known that
the frontostriatal projection plays a large role in a variety of cognitive
functions anatomically (Alexander et al., 1986; Haber and
Knutson, 2010). Since the cOFC projects to the VS, these two
processes must act cooperatively through the cortico-basal ganglia
loop. Indeed, these population signals were similar in terms of the
heterogeneous signals carried by each individual neuron (Fig.
2J,K) throughout the task trial (Fig. 2N,O). However, these two
expected value signals were unambiguously distinctive in terms of
their time course (Fig. 9B–D) and gradual shift (Fig. 9E).
Therefore, the cOFC and VS may compute expected values within
their respective cortical and striatal local circuits in a cooperative manner.
Our results are consistent with those of human imaging studies,
in which the activity in the VS and cOFC represented value-related
signals (O'Doherty et al., 2004; Yan et al., 2016; Noonan
et al., 2017), but not with the evidence that value signals exist in
the human ventromedial prefrontal cortex (vmPFC; Tom et al.,
2007; Levy and Glimcher, 2011), which includes the mOFC. The
reason why the mOFC showed very weak signals related to
all aspects of expected value (Figs. 2L, 5B) is unclear. One possibility
for this inconsistency may be interspecific differences
between human and nonhuman primates in the orbitofrontal
network (Wallis, 2011). The mOFC is a part of the vmPFC, but
the correspondence of this region between humans and macaque
monkeys remains elusive. Another possibility is that the vmPFC is
not involved in simple information processing, such as the
association between cues and outcomes, but is involved in more
complicated behavioral contexts for making economic decisions
(Yamada et al., 2018) and setting of mood (Ongür and Price, 2000).
Fluctuating signals in the DS and mOFC
Fluctuating signals were observed in the DS and mOFC because
of the instability or weakness of the signals (Fig. 5). The mOFC
signal would not be completely meaningless, since the PCA per-
formance in the mOFC population was better than in shuffle
controls (Fig. 6). However, the signal carried by the mOFC pop-
ulation was weak (Fig. 2L), indicating that the eigenvector fluctu-
ation in the mOFC population reflects weak signal modulations
by probability and magnitude. In contrast, PCA performance in
the fluctuating DS population was equivalent to that in the cOFC
population (Fig. 5A), where a stable expected value signal
appeared. Moreover, considerable modulation of DS neural ac-
tivity was observed in conventional analyses (Fig. 2I,M). Thus,
the fluctuating DS signal must reflect a functional role played by
the DS neural population in detecting and integrating probability
and magnitude, which is related to some controls of actions
(Balleine et al., 2007). The DS signal fluctuated with a significant
shift toward probability, but the initial signal was relatively
close to expected values (Fig. 9E, top), which is similar to the in-
stantaneous expected value signals observed in the mOFC (Fig.
9E, bottom). These observations imply that the expected value
computations might be distributed in the reward circuitry. The
consistent direction of the shift between VS and DS populations
implies that striatal neural populations may prefer probabilistic
phenomena (Pouget et al., 2013; Ma and Jazayeri, 2014), whereas
the cOFC neural population may prefer magnitude, which is a
continuous variable.
Expected value signals and economic choices
Economic choices seem to be composed of a series of processes,
such as expected value computation, followed by value compari-
son, and then choice among options. Recent findings suggest
that these computations may or may not be discrete/continuous
and could overlap (Chen and Stuphorn, 2015; Yoo and Hayden,
2020). Because we used a single-cue task, the observed signals
solely reflect the integration of probability and magnitude. In the
last 2 decades, neural correlates of probability and/or magnitude
have been extensively reported in a diverse set of brain regions
(O'Doherty, 2014), mostly during economic choice tasks and without
examination of their underlying dynamics. These distributed signals
may support the possibility that expected value computation
occurs in wider brain regions as a network, although they are
likely to reflect an array of alternative non-value-related processes
(O'Doherty, 2014), such as motor responses and choice
processes. Although signals in the DS and mOFC fluctuated (Fig.
5B), they were relatively close to expected values at the beginning
of cue presentation (Fig. 9A,E), suggesting that widespread evo-
lution of expected value signals might occur through a reward
circuitry at the beginning when monkeys process the integration.
Significance of population signals revealed by our analysis
State space analysis reveals temporal structures of neural popu-
lations in multidimensional space for both cognitive tasks
(Murray et al., 2017) and motor tasks (Churchland et al., 2012;
Takei et al., 2017). However, interpretation of the extracted
population structure depends on the method used (Elsayed and
Cunningham, 2017). In the present study, we did not seek to
determine the population structure as a trajectory in neural
state space, as performed in previous studies. Instead, we aimed
to detect the main features underscoring the population struc-
ture in the space of probability and magnitude that compose
expected value. For this purpose, stability of the regression sub-
space is critical. We elaborately projected neural firing rates
into the regression subspace by preparing a completely orthog-
onal data matrix in our task design. Moreover, two shuffled
controls revealed the significance of our state space analysis. In
the full-shuffled control, eigenvectors were directed in all directions,
because neural modulation structures were entirely destroyed
(Fig. 5J). In the partially shuffled control (condition 1), the
maintained temporal structure occasionally yields some subtle
modulation structures through a trial because of the random
allocation of neural activity to probability and magnitude (Fig.
5F). Thus, our state space analysis is informative on whether
and how expected value signals are composed of the probability
and magnitude moment by moment as a series of eigenvectors.
Conclusions
A dynamic integrative process of probability and magnitude is
the basis for the computation of expected values in particular
brain regions (i.e., the cOFC and VS). The existence of neural
population signals for expected values is consistent with the
expected value theory, whereas the coexistence of risk signals,
which has been shown (O'Neill and Schultz, 2010) with returns
(Figs. 3,5B), may reflect a behavioral bias for risk preferences, a
phenomenon observed across species (Stephens and Krebs, 1986;
Yamada et al., 2013a). The sharp and slow evolution of expected
value signals in the cOFC and VS, respectively, suggests that each
brain region has a unique time constant in the expected value
computation. When monkeys perceive probability and magni-
tude from numerical symbols, learned expected values may be
computed and recalled through the OFC–striatum circuit
(Hirokawa et al., 2019), along with other networks that may also
instantaneously process this computation. Our results indicate
that the expected value signals observed in population ensemble
activities are compatible with the framework of dynamic systems
(Churchland et al., 2012; Mante et al., 2013).
References
Alexander GE, DeLong MR, Strick PL (1986) Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annu Rev Neurosci 9:357–381.
Balleine BW, Delgado MR, Hikosaka O (2007) The role of the dorsal striatum in reward and decision-making. J Neurosci 27:8161–8165.
Barraclough DJ, Conroy ML, Lee D (2004) Prefrontal cortex and decision making in a mixed-strategy game. Nat Neurosci 7:404–410.
Burnham K, Anderson D (2004) Multimodel inference: understanding AIC and BIC in model selection. Sociol Method Res 33:261–304.
Chen X, Stuphorn V (2015) Sequential selection of economic good and action in medial frontal cortex of macaques during value-based decisions. Elife 4:e09418.
Churchland MM, Cunningham JP, Kaufman MT, Foster JD, Nuyujukian P, Ryu SI, Shenoy KV (2012) Neural population dynamics during reaching. Nature 487:51–56.
Efron B, Tibshirani RJ (1993) An introduction to the bootstrap. New York: Chapman and Hall.
Elsayed GF, Cunningham JP (2017) Structure in neural population recordings: an expected byproduct of simpler phenomena? Nat Neurosci 20:1310–1318.
Eshel N, Tian J, Bukwich M, Uchida N (2016) Dopamine neurons share common response function for reward prediction error. Nat Neurosci 19:479–486.
Fouragnan EF, Chau BKH, Folloni D, Kolling N, Verhagen L, Klein-Flügge M, Tankelevitch L, Papageorgiou GK, Aubry JF, Sallet J, Rushworth MFS (2019) The macaque anterior cingulate cortex translates counterfactual choice value into actual behavioral change. Nat Neurosci 22:797–808.
Gardner MPH, Conroy JC, Sanchez DC, Zhou J, Schoenbaum G (2019) Real-time value integration during economic choice is regulated by orbitofrontal cortex. Curr Biol 29:4315–4322.e4.
Glimcher PW, Fehr E, Camerer C, Poldrack RA (2008) Neuroeconomics: decision making and the brain. New York: Elsevier.
Goense JB, Logothetis NK (2008) Neurophysiology of the BOLD fMRI signal in awake monkeys. Curr Biol 18:631–640.
Haber SN, Knutson B (2010) The reward circuit: linking primate anatomy and human imaging. Neuropsychopharmacology 35:4–26.
Hirokawa J, Vaughan A, Masset P, Ott T, Kepecs A (2019) Frontal cortex neuron types categorically encode single decision variables. Nature 576:446–451.
Houthakker HS (1950) Revealed preference and the utility function. Economica 17:159–174.
Howard JD, Kahnt T (2017) Identity-specific reward representations in orbitofrontal cortex are modulated by selective devaluation. J Neurosci 37:2627–2638.
Howard JD, Gottfried JA, Tobler PN, Kahnt T (2015) Identity-specific coding of future rewards in the human orbitofrontal cortex. Proc Natl Acad Sci U S A 112:5195–5200.
Hsu M, Krajbich I, Zhao C, Camerer CF (2009) Neural response to reward anticipation under risk is nonlinear in probabilities. J Neurosci 29:2231–2237.
Inokawa H, Matsumoto N, Kimura M, Yamada H (2020) Tonically active neurons in the monkey dorsal striatum signal outcome feedback during trial-and-error search behavior. Neuroscience 446:271–284.
Kahneman D, Tversky A (1979) Prospect theory: an analysis of decisions under risk. Econometrica 47:263–292.
Levy DJ, Glimcher PW (2011) Comparing apples and oranges: using reward-specific and reward-general subjective value representation in the brain. J Neurosci 31:14693–14707.
Lopatina N, McDannald MA, Styer CV, Peterson JF, Sadacca BF, Cheer JF, Schoenbaum G (2016) Medial orbitofrontal neurons preferentially signal cues predicting changes in reward during unblocking. J Neurosci 36:8416–8424.
Ma WJ, Jazayeri M (2014) Neural coding of uncertainty and probability. Annu Rev Neurosci 37:205–220.
Mante V, Sussillo D, Shenoy KV, Newsome WT (2013) Context-dependent computation by recurrent dynamics in prefrontal cortex. Nature 503:78–84.
Milham MP, Ai L, Koo B, Xu T, Amiez C, Balezeau F, Baxter MG, Blezer ELA, Brochier T, Chen A, Croxson PL, Damatac CG, Dehaene S, Everling S, Fair DA, Fleysher L, Freiwald W, Froudist-Walsh S, Griffiths TD, Guedj C, et al. (2018) An open resource for non-human primate imaging. Neuron 100:61–74.e2.
Murray JD, Bernacchia A, Roy NA, Constantinidis C, Romo R, Wang XJ (2017) Stable population coding for working memory coexists with heterogeneous neural dynamics in prefrontal cortex. Proc Natl Acad Sci U S A 114:394–399.
Noonan MP, Chau BKH, Rushworth MFS, Fellows LK (2017) Contrasting effects of medial and lateral orbitofrontal cortex lesions on credit assignment and decision-making in humans. J Neurosci 37:7023–7035.
O'Doherty JP (2014) The problem with value. Neurosci Biobehav Rev 43:259–268.
O'Doherty J, Dayan P, Schultz J, Deichmann R, Friston K, Dolan RJ (2004) Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science 304:452–454.
O'Neill M, Schultz W (2010) Coding of reward risk by orbitofrontal neurons is mostly distinct from coding of reward value. Neuron 68:789–800.
Ongür D, Price JL (2000) The organization of networks within the orbital and medial prefrontal cortex of rats, monkeys and humans. Cereb Cortex 10:206–219.
Papageorgiou GK, Sallet J, Wittmann MK, Chau BKH, Schüffelgen U, Buckley MJ, Rushworth MFS (2017) Inverted activity patterns in ventromedial prefrontal cortex during value-guided decision-making in a less-is-more task. Nat Commun 8:1886.
Platt ML, Glimcher PW (1999) Neural correlates of decision variables in parietal cortex. Nature 400:233–238.
Pouget A, Beck JM, Ma WJ, Latham PE (2013) Probabilistic brains: knowns and unknowns. Nat Neurosci 16:1170–1178.
Raghuraman AP, Padoa-Schioppa C (2014) Integration of multiple determinants in the neuronal computation of economic values. J Neurosci 34:11583–11603.
Rich EL, Wallis JD (2016) Decoding subjective decisions from orbitofrontal cortex. Nat Neurosci 19:973–980.
Roesch MR, Singh T, Brown PL, Mullins SE, Schoenbaum G (2009) Ventral striatal neurons encode the value of the chosen action in rats deciding between differently delayed or sized rewards. J Neurosci 29:13365–13376.
Rudebeck PH, Murray EA (2014) The orbitofrontal oracle: cortical mechanisms for the prediction and evaluation of specific behavioral outcomes. Neuron 84:1143–1156.
Samuelson PA (1950) The problem of integrability in utility theory. Economica 17:355–385.
Savage LJ (1954) The foundations of statistics. New York: Wiley.
Stephens D, Krebs J (1986) Foraging theory. Princeton, NJ: Princeton UP.
Sutton RS, Barto AG (1998) Reinforcement learning. Cambridge, MA: MIT.
Takei T, Confais J, Tomatsu S, Oya T, Seki K (2017) Neural basis for hand muscle synergies in the primate spinal cord. Proc Natl Acad Sci U S A 114:8643–8648.
Tobler PN, Fiorillo CD, Schultz W (2005) Adaptive coding of reward value by dopamine neurons. Science 307:1642–1645.
Tom SM, Fox CR, Trepel C, Poldrack RA (2007) The neural basis of loss aversion in decision-making under risk. Science 315:515–518.
Von Neumann J, Morgenstern O (1944) Theory of games and economic behavior. Princeton, NJ: Princeton UP.
Wallis JD (2011) Cross-species studies of orbitofrontal cortex and value-based decision-making. Nat Neurosci 15:13–19.
Xie J, Padoa-Schioppa C (2016) Neuronal remapping and circuit persistence in economic decisions. Nat Neurosci 19:855–861.
Yamada H, Matsumoto N, Kimura M (2004) Tonically active neurons in the primate caudate nucleus and putamen differentially encode instructed motivational outcomes of action. J Neurosci 24:3500–3510.
Yamada H, Tymula A, Louie K, Glimcher PW (2013a) Thirst-dependent risk preferences in monkeys identify a primitive form of wealth. Proc Natl Acad Sci U S A 110:15788–15793.
Yamada H, Inokawa H, Matsumoto N, Ueda Y, Enomoto K, Kimura M
(2013b) Coding of the long-term value of multiple future rewards in the
primate striatum. J Neurophysiol 109:11401151.
Yamada H, Inokawa H, Hori Y, Pan X, Matsuzaki R, Nakamura K, Samejima
K, Shidara M, Kimura M, Sakagami M, Minamimoto T (2016)
Characteristics of fast-spiking neurons in the striatum of behaving
monkeys. Neurosci Res 105:2–18.
Yamada H, Louie K, Tymula A, Glimcher PW (2018) Free choice shapes
normalized value signals in medial orbitofrontal cortex. Nat Commun 9:162.
Yan C, Su L, Wang Y, Xu T, Yin DZ, Fan MX, Deng CP, Hu Y, Wang ZX,
Cheung EF, Lim KO, Chan RC (2016) Multivariate neural representations
of value during reward anticipation and consummation in the
human orbitofrontal cortex. Sci Rep 6:29079.
Yoo SBM, Hayden BY (2020) The transition from evaluation to selection
involves neural subspace reorganization in core reward regions. Neuron
105:712–724.e4.
Zhou J, Gardner MPH, Stalnaker TA, Ramus SJ, Wikenheiser AM, Niv Y,
Schoenbaum G (2019) Rat orbitofrontal ensemble activity contains
multiplexed but dissociable representations of value and task structure in an
odor sequence task. Curr Biol 29:897–907.e3.
1698 J. Neurosci., February 24, 2021 41(8):1684–1698 Yamada et al. · Neural Dynamics for Expected Value Computation
... In general, we examine the outcome of our choice and adjust subsequent choice behavior using the outcome information to choose an appropriate action. Five significant studies on neurons (Kawai et al. (2015); Yamada et al. (2021); Imaizumi et al. (2022); Yang et al. (2022); Ferrari-Toniolo and Schultz (2023)) have examined neuronal responses to loss and gain. These studies suggest that two different neural systems may respond to loss and gain, resulting in a value function with a cusp as a reference point. ...
... In their studies on brain reward circuitry, Yamada et al. (2021) and Imaizumi et al. (2022) proposed a neuronal prospect theory model. Using theoretical accuracy equivalent to that of human neuroimaging studies from a gain perspective, they showed that single-neuron activity in four core reward-related cortical and subcortical regions represents a subjective assessment of risky gambles in monkeys. ...
... In the gain context, both monkeys were risk-seekers when the starting token number was low; however, both demonstrated risk-neutral or risk-averse behavior when the starting token number increased. This result is consistent with Yamada et al. (2021) and Imaizumi et al. (2022). Moreover, Yang et al. (2022) showed that the utility functions (value functions) in monkey G were consistently steeper for losses than for gains. ...
Article
In prospect theory, the value function is typically concave for gains and convex for losses, with losses usually having a steeper slope than gains. The neural system responds differently to losses and gains. Five new studies on neurons related to this issue have examined neuronal responses to losses, gains, and reference points. This study investigated a new concept of the value function. A value function with a neuronal cusp may exhibit variations and behavioral cusps associated with catastrophic events, potentially influencing a trader's decision to close a position. Additionally, we have conducted empirical studies on algorithmic trading strategies that employ different value function specifications.
... The striatum and the OFC collaboratively encode expected value in economic decision-making in primates. Yamada et al. (2021) [18] showed that the central OFC (cOFC) and the ventral striatum (VS) had distinct roles in representing the expected value of a choice. In a visually cued lottery task, monkeys chose between two pie charts, each indicating a different probability and magnitude of a fluid reward. ...
Article
Decision-making is a behavior involving many neuronal processes that is crucial for animals to maximize benefits for survival. The orbitofrontal cortex (OFC) is a brain region known to play critical roles in decision-making in both rodents and primates, and it functions in conjunction with several other brain regions in both species to fulfill decision-making needs. Here we review studies on the specific roles of the OFC and related brain regions in decision-making in rodents and primates, to gain insights on how distinct neural activities in several brain regions functionally contribute to the complex processes required to make a choice. The prefrontal cortex (PFC), the anterior cingulate cortex (ACC), and the striatum work in combination with the OFC to perform the basic processes involved in decision-making, while the basolateral amygdala (BLA) and the hippocampus are implicated together with the OFC in specialized types of decision-making. The specific functions of these brain regions in rodents and primates reveal both preserved and evolved aspects of decision-making mechanisms along the evolutionary lineage.
... The same analyses were then carried out as replications in the second animal. The number of trials and sessions is within the range of previous literature (Yamada, Imaizumi, & Matsumoto, 2021; Padoa-Schioppa & Assad, 2006). Some trials were excluded because the animal failed to make a choice within 5 sec (Monkey D: 599, 1.04%, Monkey C: 1203, 1.8%). ...
... Among these, the OFC is particularly important, as damage or disruption consistently alters value-based choice behavior, suggesting that OFC neurons perform choice-relevant computations (Ballesta, Shi, Conen, & Padoa-Schioppa, 2020; Rudebeck, Saunders, Prescott, Chau, & Murray, 2013). Integrated value signals are commonly found within OFC, including in single-unit firing rates (Padoa-Schioppa & Assad, 2006; Wallis & Miller, 2003; Tremblay & Schultz, 1999), population codes (Yamada et al., 2021; Rich & Wallis, 2016), field potentials (Saez et al., 2018; Rich & Wallis, 2016, 2017), and fMRI BOLD signals (Chikazoe, Lee, Kriegeskorte, & Anderson, 2014; Plassmann, O'Doherty, & Rangel, 2007), and this has been taken as evidence that integrated value is the key decision variable in OFC. However, multiple laboratories consistently report neurons in monkey OFC (primarily Area 13) that encode the value of unique attributes (Pastor-Bernier, Stasiak, & Schultz, 2019; Setogawa et al., 2019; Blanchard, Hayden, & Bromberg-Martin, 2015; Raghuraman & Padoa-Schioppa, 2014; Hosokawa, Kennerley, Sloan, & Wallis, 2013; Padoa-Schioppa & Assad, 2006), and similar signals can be found in human fMRI BOLD (Howard, Gottfried, Tobler, & Kahnt, 2015). ...
Article
In value-based decisions, there are frequently multiple attributes, such as cost, quality, or quantity, that contribute to the overall goodness of an option. Because one option may not be better in all attributes at once, the decision process should include a means of weighing relevant attributes. Most decision-making models solve this problem by computing an integrated value, or utility, for each option from a weighted combination of attributes. However, behavioral anomalies in decision-making, such as context effects, indicate that other attribute-specific computations might be taking place. Here, we tested whether rhesus macaques show evidence of attribute-specific processing in a value-based decision-making task. Monkeys made a series of decisions involving choice options comprising a sweetness and probability attribute. Each attribute was represented by a separate bar with one of two mappings between bar size and the magnitude of the attribute (i.e., bigger = better or bigger = worse). We found that translating across different mappings produced selective impairments in decision-making. Choices were less accurate and preferences were more variable when like attributes differed in mapping, suggesting that preventing monkeys from easily making direct attribute comparisons resulted in less accurate choice behavior. This was not the case when mappings of unalike attributes within the same option were different. Likewise, gaze patterns favored transitions between like attributes over transitions between unalike attributes of the same option, so that like attributes were sampled sequentially to support within-attribute comparisons. Together, these data demonstrate that value-based decisions rely, at least in part, on directly comparing like attributes of multiattribute options.
... In a previous study using a choice task, we showed that amygdala neurons do encode subjective values that reflected integrated reward probability and magnitude when these reward attributes were cued simultaneously [12]. Perhaps neurons in the prefrontal cortex, including the orbitofrontal cortex, and parietal cortex might be relatively more important in signaling to accumulate decision variables derived from sequential or otherwise complex cues [28, 63–66]. Some previous studies found largely similar coding of values and choices in the amygdala and orbitofrontal cortex [13, 67], while others emphasized differences in the time courses with which neurons in these structures track changing values [53, 68], and in the specificity with which single neurons encode complex, multisensory food rewards [69]. ...
Article
The value of visual stimuli guides learning, decision-making, and motivation. Although stimulus values often depend on multiple attributes, how neurons extract and integrate distinct value components from separate cues remains unclear. Here we recorded the activity of amygdala neurons while two male monkeys viewed sequential cues indicating the probability and magnitude of expected rewards. Amygdala neurons frequently signaled reward probability in an abstract, stimulus-independent code that generalized across cue formats. While some probability-coding neurons were insensitive to magnitude information, signaling ‘pure’ probability rather than value, many neurons showed biphasic responses that signaled probability and magnitude in a dynamic (temporally-patterned) and flexible (reversible) value code. Specific amygdala neurons integrated these reward attributes into risk signals that quantified the variance of expected rewards, distinct from value. Population codes were accurate, mutually transferable between value components, and expressed differently across amygdala nuclei. Our findings identify amygdala neurons as a substrate for the sequential integration of multiple reward attributes into value and risk.
... This aforementioned limitation introduces several potential problems in neuroeconomic studies (Camerer et al., 2005;Glimcher et al., 2008;Yamada et al., 2021;Imaizumi et al., 2022;Tymula et al., 2023) that employ experimental testing of reward valuation systems for economic choices. When measuring neural activity in the reward circuitry, the subjective values of any reward depend on the physical state of the subject (Nakano et al., 1984;Critchley and Rolls, 1996;de Araujo et al., 2006;Pritchard et al., 2008), even for money (Symmonds et al., 2010). ...
Article
Hunger and thirst drive animals’ consumption behavior and regulate their decision-making concerning rewards. We previously assessed the thirst states of monkeys by measuring blood osmolality under controlled water access and examined how these thirst states influenced their risk-taking behavior in decisions involving fluid rewards. However, hunger assessment in monkeys remains poorly performed. Moreover, the lack of precise measures for hunger states leads to another issue regarding how hunger and thirst states interact with each other in each individual. Thus, when controlling food access to motivate performance, it remains unclear how these two physiological needs are satisfied in captive monkeys. Here, we measured blood ghrelin and osmolality levels to respectively assess hunger and thirst in four captive macaques. Using an enzyme-linked immunosorbent assay, we identified that the levels of blood ghrelin, a widely measured hunger-related peptide hormone in humans, were high after 20 h of no food access (with ad libitum water). This reflects a typical controlled food access condition. One hour after consuming a regular dry meal, the blood ghrelin levels in three out of four monkeys decreased to within their baseline range. Additionally, blood osmolality measured from the same blood sample, the standard hematological index of hydration status, increased after consuming the regular dry meal with no water access. Thus, ghrelin and osmolality may reflect the physiological states of individual monkeys regarding hunger and thirst, suggesting that these indices can be used as tools for monitoring hunger and thirst levels that mediate an animal's decision to consume rewards.
... For example, multiple types of voltage-gated ion channels in Purkinje cells make complex spikes in neurons [4]. Recent research suggests that neurons perform complex computation such as detecting synchronized inputs [5], selecting inputs [6] and calculating expected values [7] by applying nonlinear spatiotemporal transformations to synaptic inputs in dendrites. Therefore, it is important to estimate spatial electrical properties of neurons for understanding information processing in the brain. ...
Article
One of the neuron models that simulate the electrical activity of neurons, the multi-compartment model, has spatial electrical properties that control nonlinear spatiotemporal dynamics and can reproduce nonlinear electrical responses with high accuracy. However, it is difficult to determine the model parameters in multi-compartment models from membrane potentials, since unknown high dimensional parameters for spatial electrical property should be estimated using incomplete observation data. In this paper, we propose a data-driven method to estimate the spatial electrical properties in the multi-compartment model from membrane potentials observed incompletely. The proposed method employs the replica exchange method using prior information considering morphological smoothness to solve problems of the local optima in the solution space and incompleteness of observation data. We further verify the effectiveness of the proposed method by using simulation data obtained from realistic neuron models.
... Then, according to the above equation (14), the P values of image skeleton line fitting for all subjects were calculated to be the maximum 99.72% and the minimum 99.35%, respectively. The expected probability [40,41] E_P = 99.61% is obtained by calculating the average value. The 3D skeleton extraction accuracy pairs under similar conditions are shown in Table 3.
Table 3. Comparison of accuracy of 3D image skeleton extraction
ID    References                     Methods                 Accuracy
1     Lebre et al., 2018 [42]        3D erosion algorithm    0.97
2     Merveille et al., 2017 [43]    3D erosion algorithm    0.90
...
Article
INTRODUCTION: Analysis of magnetic resonance angiography image data is crucial for early detection and prevention of stroke patients. Extracting the 3D Skeleton of cerebral vessels is the focus and difficulty of analysis. OBJECTIVES: The objective is to remove other tissue components from the vascular tissue portion of the image with minimal loss by reading MRA image data and performing processing processes such as grayscale normalization, interpolation, breakpoint detection and repair, and image segmentation to facilitate 3D reconstruction of cerebral blood vessels and the reconstructed vascular tissues make extraction of the Skeleton easier. METHODS: Considering that most of the existing techniques for extracting the 3D vascular Skeleton are corrosion algorithms, machine learning algorithms require high hardware resources, a large number of learning and test cases, and the accuracy needs to be confirmed, an average plane center of mass computation method is proposed, which improves the average plane algorithm by combining the standard plane algorithm and the center of mass algorithm. RESULTS: Intersection points and skeleton breakpoints on the Skeleton are selected as critical points and manually labeled for experimental verification, and the algorithm has higher efficiency and accuracy than other algorithms in directly extracting the 3D Skeleton of blood vessels. CONCLUSION: The method has low hardware requirements, accurate and reliable image data, can be automatically modeled and calculated by Python program, and meets the needs of clinical applications under information technology conditions.
... Among these, the orbitofrontal cortex (OFC) is particularly important, as damage or disruption consistently alters value-based choice behavior, suggesting that OFC neurons perform choice-relevant computations (47,48). Integrated value signals are commonly found within OFC, including in single unit firing rates (7-9), population codes (49,50), field potentials (50-52), and fMRI BOLD signals (53,54), and this has been taken as evidence that integrated value is the key decision variable in OFC. However, multiple labs consistently report neurons in monkey OFC (primarily area 13) that encode the value of unique attributes (7,55-59), and similar signals can be found in human fMRI BOLD (60). ...
Preprint
In value-based decisions, there are frequently multiple attributes, such as cost, quality, or quantity, that contribute to the overall goodness of an option. Since one option may not be better in all attributes at once, the decision process should include a means of weighing relevant attributes. Most decision-making models solve this problem by computing an integrated value, or utility, for each option from a weighted combination of attributes. However, behavioral anomalies in decision-making, such as context effects, indicate that other attribute-specific computations might be taking place. Here, we tested whether rhesus macaques show evidence of attribute-specific processing in a value-based decision-making task. Monkeys made a series of decisions involving choice options comprising a sweetness and probability attribute. Each attribute was represented by a separate bar with one of two mappings between bar size and the magnitude of the attribute (i.e., bigger=better or bigger=worse). We found that translating across different mappings produced selective impairments in decision-making. When like attributes differed, monkeys were prevented from easily making direct attribute comparisons, and choices were less accurate and preferences were more variable. This was not the case when mappings of unalike attributes within the same option were different. Likewise, gaze patterns favored transitions between like attributes over transitions between unalike attributes of the same option, so that like attributes were sampled sequentially to support within-attribute comparisons. Together, these data demonstrate that value-based decisions rely, at least in part, on directly comparing like attributes of multi-attribute options.
Significance Statement: Value-based decision-making is a cognitive function impacted by a number of clinical conditions, including substance use disorder and mood disorders. Understanding the neural mechanisms, including online processing steps involved in decision formation, will provide critical insights into decision-making deficits characteristic of human psychiatric disorders. Using rhesus monkeys as a model species capable of complex decision-making, this study shows that decisions involve a process of comparing like features, or attributes, of multi-attribute options. This is contrary to popular models of decision-making in which attributes are first combined into an overall value, or utility, to make a choice. Therefore, these results serve as an important foundation for establishing a more complete understanding of the neural mechanisms involved in forming complex decisions.
Article
Neural dynamics are thought to reflect computations that relay and transform information in the brain. Previous studies have identified the neural population dynamics in many individual brain regions as a trajectory geometry, preserving a common computational motif. However, whether these populations share particular geometric patterns across brain-wide neural populations remains unclear. Here, by mapping neural dynamics widely across temporal/frontal/limbic regions in the cortical and subcortical structures of monkeys, we show that 10 neural populations, including 2,500 neurons, propagate visual item information in a stochastic manner. We found that visual inputs predominantly evoked rotational dynamics in the higher-order visual area, TE, and its downstream striatum tail, while curvy/straight dynamics appeared frequently downstream in the orbitofrontal/hippocampal network. These geometric changes were not deterministic but rather stochastic according to their respective emergence rates. Our meta-analysis results indicate that visual information propagates as a heterogeneous mixture of stochastic neural population signals in the brain.
Article
An animal’s choice behavior is shaped by the outcome feedback from selected actions in a trial-and-error approach. Tonically active neurons (TANs), presumed cholinergic interneurons in the striatum, are thought to be involved in the learning and performance of reward-directed behaviors, but it remains unclear how TANs are involved in shaping reward-directed choice behaviors based on the outcome feedback. To this end, we recorded activity of TANs from the dorsal striatum of two macaque monkeys (Macaca fuscata; 1 male, 1 female) while they performed a multi-step choice task to obtain multiple rewards. In this task, the monkeys first searched for a rewarding target from among three alternatives in a trial-and-error manner and then earned additional rewards by repeatedly choosing the rewarded target. We found that a considerable proportion of TANs selectively responded to either the reward or the no-reward outcome feedback during the trial-and-error search, but these feedback responses were not observed during repeat trials. Moreover, the feedback responses of TANs were similarly observed in any search trials, without distinctions regarding the predicted probability of rewards and the location of chosen targets. Unambiguously, TANs detected reward and no-reward feedback specifically when the monkeys performed trial-and-error searches, in which the monkeys were learning the value of the targets and adjusting their subsequent choice behavior based on the reward and no-reward feedback. These results suggest that striatal cholinergic interneurons signal outcome feedback specifically during search behavior, in circumstances where the choice outcomes cannot be predicted with certainty by the animals.
Article
Individual neurons in many cortical regions have been found to encode specific, identifiable features of the environment or body that pertain to the function of the region [1–3]. However, in frontal cortex, which is involved in cognition, neural responses display baffling complexity, carrying seemingly disordered mixtures of sensory, motor and other task-related variables [4–13]. This complexity has led to the suggestion that representations in individual frontal neurons are randomly mixed and can only be understood at the neural population level [14,15]. Here we show that neural activity in rat orbitofrontal cortex (OFC) is instead highly structured: single neuron activity co-varies with individual variables in computational models that explain choice behaviour. To characterize neural responses across a large behavioural space, we trained rats on a behavioural task that combines perceptual and value-guided decisions. An unbiased, model-free clustering analysis identified distinct groups of OFC neurons, each with a particular response profile in task-variable space. Applying a simple model of choice behaviour to these categorical response profiles revealed that each profile quantitatively corresponds to a specific decision variable, such as decision confidence. Additionally, we demonstrate that a connectivity-defined cell type, orbitofrontal neurons projecting to the striatum, carries a selective and temporally sustained representation of a single decision variable: integrated value. We propose that neurons in frontal cortex, as in other cortical regions, form a sparse and overcomplete representation of features relevant to the region’s function, and that they distribute this information selectively to downstream regions to support behaviour.
Article
The neural mechanisms mediating sensory-guided decision-making have received considerable attention, but animals often pursue behaviors for which there is currently no sensory evidence. Such behaviors are guided by internal representations of choice values that have to be maintained even when these choices are unavailable. We investigated how four macaque monkeys maintained representations of the value of counterfactual choices—choices that could not be taken at the current moment but which could be taken in the future. Using functional magnetic resonance imaging, we found two different patterns of activity co-varying with values of counterfactual choices in a circuit spanning the hippocampus, the anterior lateral prefrontal cortex and the anterior cingulate cortex. Anterior cingulate cortex activity also reflected whether the internal value representations would be translated into actual behavioral change. To establish the causal importance of the anterior cingulate cortex for this translation process, we used a novel technique, transcranial focused ultrasound stimulation, to reversibly disrupt anterior cingulate cortex activity.
Article
Non-human primate neuroimaging is a rapidly growing area of research that promises to transform and scale translational and cross-species comparative neuroscience. Unfortunately, the technological and methodological advances of the past two decades have outpaced the accrual of data, which is particularly challenging given the relatively few centers that have the necessary facilities and capabilities. The PRIMatE Data Exchange (PRIME-DE) addresses this challenge by aggregating independently acquired non-human primate magnetic resonance imaging (MRI) datasets and openly sharing them via the International Neuroimaging Data-sharing Initiative (INDI). Here, we present the rationale, design, and procedures for the PRIME-DE consortium, as well as the initial release, consisting of 25 independent data collections aggregated across 22 sites (total = 217 non-human primates). We also outline the unique pitfalls and challenges that should be considered in the analysis of non-human primate MRI datasets, including providing automated quality assessment of the contributed datasets.
Article
Normalization is a common cortical computation widely observed in sensory perception, but its importance in perception of reward value and decision making remains largely unknown. We examined (1) whether normalized value signals occur in the orbitofrontal cortex (OFC) and (2) whether changes in behavioral task context influence the normalized representation of value. We recorded medial OFC (mOFC) single neuron activity in awake-behaving monkeys during a reward-guided lottery task. mOFC neurons signaled the relative values of options via a divisive normalization function when animals freely chose between alternatives. The normalization model, however, performed poorly in a variant of the task where only one of the two possible choice options yielded a reward and the other was certain not to yield a reward (so-called "forced choice"). The existence of such context-specific value normalization may suggest that the mOFC contributes valuation signals critical for economic decision making when meaningful alternative options are available.
Article
Ventromedial prefrontal cortex has been linked to choice evaluation and decision-making in humans but understanding the role it plays is complicated by the fact that little is known about the corresponding area of the macaque brain. We recorded activity in macaques using functional magnetic resonance imaging during two very different value-guided decision-making tasks. In both cases ventromedial prefrontal cortex activity reflected subjective choice values during decision-making just as in humans but the relationship between the blood oxygen level-dependent signal and both decision-making and choice value was inverted and opposite to the relationship seen in humans. In order to test whether the ventromedial prefrontal cortex activity related to choice values is important for decision-making we conducted an additional lesion experiment; lesions that included the same ventromedial prefrontal cortex region disrupted normal subjective evaluation of choices during decision-making.
Article
Economic choice proceeds from evaluation, in which we contemplate options, to selection, in which we weigh options and choose one. These stages must be differentiated so that decision makers do not proceed to selection before evaluation is complete. We examined responses of neurons in two core reward regions, orbitofrontal (OFC) and ventromedial prefrontal cortex (vmPFC), during two-option choice with asynchronous offer presentation. Our data suggest that neurons selective during the first (presumed evaluation) and second (presumed comparison and selection) offer epochs come from a single pool. Stage transition is accompanied by a shift toward orthogonality in the low-dimensional population response manifold. Nonetheless, the relative position of each option in driving responses in the population subspace is preserved. The orthogonalization we observe supports the hypothesis that the transition from evaluation to selection leads to reorganization of response subspace and suggests a mechanism by which value-related signals are prevented from prematurely driving choice.
Article
Neural correlates implicate the orbitofrontal cortex (OFC) in value-based or economic decision making [1-3]. Yet inactivation of OFC in rats performing a rodent version of the standard economic choice task is without effect [4, 5], a finding more in accord with ideas that the OFC is primarily necessary for behavior when new information must be taken into account [6-9]. Neural activity in the OFC spontaneously updates to reflect new information, particularly about outcomes [10-16], and the OFC is necessary for adjustments to learned behavior only under these conditions [4, 16-26]. Here, we merge these two independent lines of research by inactivating lateral OFC during an economic choice that requires new information about the value of the predicted outcomes to be incorporated into an already established choice. Outcome value was changed by pre-feeding the rats one of two food options before testing. In control rats, this pre-feeding resulted in divergent changes in choice behavior that depended on the rats' prior preference for the pre-fed food. Optogenetic inactivation of the OFC disrupted this bi-directional effect of pre-feeding without affecting other measures that describe the underlying choice behavior. This finding unifies the role of the OFC in economic choice with its role in a host of other behaviors, causally demonstrating that the OFC is not necessary for economic choice per se, unless that choice incorporates new information about the outcomes.
Article
The orbitofrontal cortex (OFC) has long been implicated in signaling information about expected outcomes to facilitate adaptive or flexible behavior. Current proposals focus on signaling of expected value versus the representation of a value-agnostic cognitive map of the task. While often suggested as mutually exclusive, these alternatives may represent extreme ends of a continuum determined by task complexity and experience. As learning proceeds, an initial, detailed cognitive map might be acquired, based largely on external information. With more experience, this hypothesized map can then be tailored to include relevant abstract hidden cognitive constructs. The map would default to an expected value in situations where other attributes are largely irrelevant, but, in richer tasks, a more detailed structure might continue to be represented, at least where relevant to behavior. Here, we examined this by recording single-unit activity from the OFC in rats navigating an odor sequence task analogous to a spatial maze. The odor sequences provided a mappable state space, with 24 unique “positions” defined by sensory information, likelihood of reward, or both. Consistent with the hypothesis that the OFC represents a cognitive map tailored to the subjects’ intentions or plans, we found a close correspondence between how subjects were using the sequences and the neural representations of the sequences in OFC ensembles. Multiplexed with this value-invariant representation of the task, we also found a representation of the expected value at each location. Thus, the value and task structure co-existed as dissociable components of the neural code in OFC.