RESEARCH ARTICLE
A model of naturalistic decision making in preference tests
John Ksander 1,2, Donald B. Katz 1,2, Paul Miller 1,3*
1 Volen National Center for Complex Systems, Brandeis University, Waltham, Massachusetts, United States of America
2 Department of Psychology, Brandeis University, Waltham, Massachusetts, United States of America
3 Department of Biology, Brandeis University, Waltham, Massachusetts, United States of America
* pmiller@brandeis.edu
Abstract
Decisions as to whether to continue with an ongoing activity or to switch to an alternative are a constant in an animal’s natural world, and in particular underlie foraging behavior and performance in food preference tests. Stimuli experienced by the animal both impact the choice and are themselves impacted by the choice, in a dynamic back and forth. Here, we present model neural circuits, based on spiking neurons, in which the choice to switch away from ongoing behavior instantiates this back and forth, arising as a state transition in neural activity. We analyze two classes of circuit, which differ in whether state transitions result from a loss of hedonic input from the stimulus (an “entice to stay” model) or from aversive stimulus-input (a “repel to leave” model). In both classes of model, we find that the mean time spent sampling a stimulus decreases with increasing value of the alternative stimulus, an effect that we link to the inclusion of depressing synapses in our model. This competitive interaction, which is much stronger in “entice to stay” networks, has qualitative features of the marginal value theorem and thereby provides a framework for optimal foraging behavior. We offer suggestions as to how our models could be discriminatively tested through the analysis of electrophysiological and behavioral data.
Author summary
Many decisions concern whether to continue sampling a stimulus or to switch to an alternative, a key feature of foraging behavior. We produce two classes of model for such stay-switch decisions, which differ in how decisions to switch stimuli can arise. In an “entice-to-stay” model, a reduction in the necessary positive stimulus input causes switching decisions. In a “repel-to-leave” model, a rise in aversive stimulus input produces a switch decision. We find that in tasks where the sampling of one stimulus follows another, adaptive biological processes arising from a highly hedonic stimulus can reduce the time spent at the following stimulus, by up to ten-fold in the “entice-to-stay” models. Along with potentially observable behavioral differences that could distinguish the classes of networks, we also found signatures in neural activity, such as oscillation of neural firing rates and a rapid change in rates preceding the time of choice to leave a stimulus. In summary, our model findings lead to testable predictions and suggest a neural circuit-based framework for explaining foraging choices.
PLOS COMPUTATIONAL BIOLOGY
PLOS Computational Biology | https://doi.org/10.1371/journal.pcbi.1009012 September 23, 2021 1 / 25
OPEN ACCESS
Citation: Ksander J, Katz DB, Miller P (2021) A model of naturalistic decision making in preference tests. PLoS Comput Biol 17(9): e1009012. https://doi.org/10.1371/journal.pcbi.1009012
Editor: Alireza Soltani, Dartmouth College, UNITED STATES
Received: April 19, 2021
Accepted: September 10, 2021
Published: September 23, 2021
Peer Review History: PLOS recognizes the benefits of transparency in the peer review process; therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. The editorial history of this article is available here: https://doi.org/10.1371/journal.pcbi.1009012
Copyright: ©2021 Ksander et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability Statement: Data are available at https://github.com/johnksander/naturalistic-decision-making.
Funding: PM and DBK received funding and salary coverage from NIH and the BRAIN Initiative via R01 NS104818. JK received salary coverage from the Swartz Foundation #2020-13. Computational resources were provided by the Brandeis HPCC which is partially supported by the Brandeis Center for Bioinspired Soft Materials, and NSF MRSEC, DMR-2011846. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Decisions as to whether to stay in a current situation or switch to a different one face all animals continuously. Such “should I stay, or should I go now” decision-making has been studied most in the context of foraging [1–4], in which an animal decides whether to continue to stay at a current source of food or to seek an alternative source. In foraging studies, the current source of food necessarily yields diminishing returns, such that at some point in time it is optimal for the animal to seek a higher-quality option. Furthermore, this tendency appears “baked in” to animal behavior: while food sources in laboratory food preference tests remain in constant supply, animals nonetheless typically switch back and forth between two or more alternatives.
Here, we simulate a model of such switching back and forth in terms of transitions between quasi-stable attractor states of neural activity. Quasi-stable attractor states [5] are patterns of neural activity that are essentially self-sustaining but limited by adaptation processes or fluctuations that eventually lead to a loss of stability and a transition to a new pattern of activity [6]. Evidence for such attractor states is most abundant in neural activity arising from perception [7,8], with quasi-stability most apparent in the switching between bistable percepts. More recently, experimental evidence for transitions between attractor states has been found [9,10] and modeled [11–13] in perceptual decision-making tasks. For those tasks, the attractor states have been proposed to represent one of two possible percepts for a given stimulus [14], or the absence of a decision (i.e., an undecided state) [11,13]. In the more naturalistic decision-making model we present here, two activity states represent the ongoing choice to either stay with a current stimulus or switch to a new stimulus [4]. It is revealing that, according to behavioral data [15], the distribution of bout durations, i.e., the times for which an animal stays at a stimulus (corresponding to the durations of the “stay” state in our model), is approximately exponential, which is a hallmark of noise-induced transitions between discrete states [16].
Our simulations differ from those aimed to model the circuitry underlying perceptual decision making in a second way. In preference tests the stimuli are separated in time, rather than via distinct neural pathways. For example, two tastes being compared could simply be different concentrations of sodium chloride, which yields an inverted-U palatability response [17]. The two stimuli excite identical neurons during successive sampling bouts, with neurons indicating high palatability during one bout being the same as those indicating high palatability during the next bout. In this manner, a preference task is more akin to decision making tasks requiring a sequential comparison of a single parametric quantity [18–20], although in this paper the quantity being compared is palatability and the decision is an ongoing one of how long to stay at the stimulus.
Our work compares performance between two model classes, distinguished by their parameter settings. In one class, the system is inherently “fickle” in that the “stay” state is unstable; an animal operating by this model would stay for only a very short duration of time unless neural activity is stabilized by input indicative of a highly positive hedonic stimulus. One could say in these networks that a delicious stimulus entices the animal to stay, but otherwise the animal leaves. In the other class of model, the system is “committed” in the sense that the “stay” state is highly stable, such that an animal operating by this model would stay for relatively long durations, until input suggesting stimulus aversiveness causes a transition away from that state. One could say in these networks that the stimulus causes the animal to leave, but that otherwise the animal stays.
We produce multiple versions of each class of model, in order to assess how they could be reliably distinguished in behavioral or electrophysiological data [4,21,22]. Our simulations suggest that any observation of strongly reduced sampling-bout durations at one stimulus following the sampling of a highly hedonic stimulus would indicate animals operating with the “entice to stay” class of network. Also, in the “entice to stay” class of network, neural activity appears to change gradually over more than 100 ms preceding a choice to leave a stimulus, whereas in the “repel to leave” network, the activity change is more sudden, but preceded by oscillations in the beta range. Therefore, we suggest that choice-aligned averaging of neural spike trains, combined with appropriate analysis of animal behavior in preference tasks [23], can distinguish the two classes of models.
Results
Network characteristics
Our initial goal was to set up a network with two distinct activity states, one representing an animal continuing to sample a stimulus, or “staying”, the other representing the choice to “switch” elsewhere. Fig 1 shows the behavior of such a network in the absence of stimulus (i.e., when the network receives only noisy background input). The raster plot in Fig 1B illustrates how the network’s activity abruptly shifts back and forth between two states of activity, which are quasi-stable attractor states. When the excitatory neurons in the “stay” population become highly active, those in the “switch” population become silent, and vice versa. Fig 1D shows these same data in terms of each population’s average spike rates. Noisy fluctuations in the spiking activity drive these transitions between stay and switch states. While Fig 1 indicates the quasi-stable nature of the circuit in the absence of any stimulus, when we include a stimulus in the circuit simulation, activation of the “switch” population represents the choice to end a bout of sampling.
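As a minimal illustration of why noise-driven switching out of a quasi-stable state gives approximately exponential state durations, the sketch below replaces the spiking network with a state that is left with a constant probability per time step. This is an assumption made purely for illustration and is not the circuit model itself; the function name, hazard rate, and time step are hypothetical.

```python
import random
import statistics

def sample_state_durations(n_durations, hazard_per_s=0.5, dt=0.01, seed=0):
    """Durations (s) of a state that is left with constant probability per step.

    Stand-in for noise-induced transitions: every step of length dt carries the
    same chance, hazard_per_s * dt, of a fluctuation large enough to end the state.
    """
    rng = random.Random(seed)
    p_switch = hazard_per_s * dt
    durations = []
    for _ in range(n_durations):
        steps = 1
        while rng.random() >= p_switch:  # no transition on this step
            steps += 1
        durations.append(steps * dt)
    return durations

durations = sample_state_durations(4000)
mean_duration = statistics.mean(durations)
# For an exponential distribution the standard deviation equals the mean,
# so the coefficient of variation should be close to 1.
cv = statistics.stdev(durations) / mean_duration
print(f"mean = {mean_duration:.2f} s, coefficient of variation = {cv:.2f}")
```

A coefficient of variation near 1 is the simple signature one would look for in measured bout durations; adaptation processes such as synaptic depression would be expected to cause deviations from this idealized case.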
Our model incorporates synaptic depression, whereby a rapid series of neural spikes depletes the supply of synaptic vesicles and therefore reduces synaptic efficacy. A docking process, which renders available vesicles release-ready, is relatively fast (a few hundred ms), so we denote the fraction of docked vesicles as D_fast. Recycling and filling of vesicles to regenerate the activity-depleted pool is slower (many seconds), however, so we denote the fraction available for docking as D_slow. Fig 1C–1E show how the population-averages of these depression variables change as the model switches between activity states shown in Fig 1B–1D. Specifically, Fig 1C illustrates how docking sites release their vesicles during sustained spiking, and then quickly become replenished from the reserve pool after activity ceases. Fig 1E illustrates how activity empties reserve vesicle pools, as the vesicles stored there replenish empty docking sites. The reserve vesicle pools then slowly recoup their losses after the cells stop firing and vesicle regeneration outpaces the losses to docking sites. Note that reserve vesicle pools sometimes fail to refill fully before the next state transition. A long active state will deplete the reserve vesicle pool to a degree that those reserves cannot fully recover from without a lengthy inactive period.
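The two-timescale behavior described above can be sketched with a pair of coupled rate equations. The equations, the parameter values, and the names `simulate_vesicle_pools`, `p_release` and `reserve_cost` below are illustrative assumptions, not the equations of the published model: docking sites (d_fast) empty at a rate proportional to firing rate and refill from the reserve within a few hundred ms, while the reserve (d_slow) is drawn down by docking and regenerates over several seconds.

```python
def simulate_vesicle_pools(rate_hz, t_active, t_rest, dt=0.001,
                           tau_fast=0.3, tau_slow=7.0,
                           p_release=0.1, reserve_cost=0.1):
    """Euler integration of a two-pool depression sketch (assumed equations).

    d_fast: fraction of docking sites holding a release-ready vesicle.
    d_slow: fraction of the reserve pool available for docking.
    reserve_cost assumes the reserve pool is ten times larger than the docked pool.
    Returns (d_fast, d_slow) at the end of activity and again after the rest.
    """
    def advance(d_fast, d_slow, rate, duration):
        for _ in range(round(duration / dt)):
            release = p_release * rate * d_fast            # spikes empty docking sites
            docking = d_slow * (1.0 - d_fast) / tau_fast   # reserve refills empty sites
            refill = (1.0 - d_slow) / tau_slow             # slow vesicle regeneration
            d_fast += dt * (docking - release)
            d_slow += dt * (refill - reserve_cost * docking)
        return d_fast, d_slow

    end_of_activity = advance(1.0, 1.0, rate_hz, t_active)  # both pools start full
    after_rest = advance(*end_of_activity, 0.0, t_rest)
    return end_of_activity, after_rest

end_of_activity, after_rest = simulate_vesicle_pools(rate_hz=40.0, t_active=20.0, t_rest=2.0)
print("end of activity (d_fast, d_slow):", end_of_activity)
print("after 2 s rest  (d_fast, d_slow):", after_rest)
```

Under these assumed parameters, two seconds of silence restore most of the docked pool while the reserve pool remains well below full, which is the asymmetry that lets a long active state influence the bouts that follow it.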
Parameter sweep and example networks
In order to ensure that any results of our study are robust to parameter variations, we generated a number of distinct “entice to stay” and “repel to leave” networks before assessing qualitative differences between the two classes’ behavior. To this end, we first measured the mean durations of “stay” states in multiple networks generated across a five-fold range of excitatory-to-inhibitory (E-to-I) and inhibitory-to-excitatory (I-to-E) connection strengths.
Fig 2 indicates the range of parameters for which we found state-switching behavior, with mean duration of the “stay” state indicated in color in Fig 2A and 2B. We chose five networks of each of the two classes, with “entice to stay” networks having the shortest intrinsic “stay” state durations of under two seconds (solid symbols in Fig 2B) and “repel to leave” networks having the longest intrinsic “stay” state durations of over 100 seconds (open symbols in Fig 2B). We typically paired networks (same symbol shape in Fig 2B) with either the same E-to-I connection strength or the same I-to-E connection strength. Across the range of parameters, firing rates of excitatory and inhibitory neurons were at levels compatible with those of cortical neurons in an active state (near 10 Hz for excitatory cells and up to 60 Hz for inhibitory cells, S1 Fig). These results demonstrate that our circuit produces quasi-bistable state-switching behavior with realistic spike-rates and that this behavior is robust to variation in synaptic connection strengths (see also S2 Fig).
Fig 1. Network in a default mode without stimulus possesses two quasi-stable attractor states. Example data from a
fast (“entice to stay”) network in the absence of stimulus. (A) Circuit diagram, indicating cross-inhibition between
excitatory populations of cells labeled E, inhibited by their corresponding inhibitory population labeled I. (B) Spike
rasters from model neurons arranged in rows grouped by their corresponding populations and color-coded as in (A).
Each dot represents one spike from the model neuron. (C) Mean firing rate of each population as a function of time.
(D) Dynamics of the fast-depression variable. (E) Dynamics of the slow-depression variable. (C-E) Line color indicates
which population’s average variable is plotted with the color code as in (A).
https://doi.org/10.1371/journal.pcbi.1009012.g001
We next performed simulations with all ten example networks (symbols in Fig 2, panel B). Our results generalized well across these parameterizations, so for simplicity we first describe data arising from one pair of networks from each class, specifically from the networks with square markers in Fig 2B. In the supplementary materials (S3 and S4 Figs) we detail the simulation results from all ten networks in cases where they are not all included in the main text, for completeness.
Since, by definition, sampling bouts are in the presence of a stimulus, and their durations are a key behavioral measurement, our first goal was to find “baseline” stimulus values that reproduced similar mean durations of the “stay state” for each network. We therefore defined a neutral stimulus for each network, as one which produces mean “stay state” durations between eight and nine seconds, in the middle of the observed ranges of typical bout durations for a rat licking taste stimuli from a spout [15,24].
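One way such a neutral stimulus could be found is a one-dimensional search over stimulus strength until the mean “stay” duration lands in the target window. The sketch below uses bisection; `mean_stay_duration` is a hypothetical closed-form stand-in for running the full network simulation (here mimicking an “entice to stay” network whose intrinsic duration of 1.5 s grows with input), and the search procedure itself is an assumption rather than the authors' documented method.

```python
import math

def mean_stay_duration(stimulus, intrinsic_duration=1.5, sensitivity=1.0):
    """Hypothetical stand-in for a network simulation: mean "stay" duration (s)
    grows exponentially with stimulus strength from its stimulus-free value."""
    return intrinsic_duration * math.exp(sensitivity * stimulus)

def calibrate_neutral_stimulus(duration_fn, target=8.5, lo=0.0, hi=10.0, tol=1e-6):
    """Bisection search; assumes duration_fn increases monotonically on [lo, hi]."""
    if not duration_fn(lo) <= target <= duration_fn(hi):
        raise ValueError("target duration is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if duration_fn(mid) < target:
            lo = mid   # stimulus too weak: stays are still too short
        else:
            hi = mid   # stimulus strong enough: tighten from above
    return 0.5 * (lo + hi)

neutral_stimulus = calibrate_neutral_stimulus(mean_stay_duration)
print(f"neutral stimulus = {neutral_stimulus:.4f}, "
      f"mean stay = {mean_stay_duration(neutral_stimulus):.2f} s")
```

For a “repel to leave” network the duration would instead decrease with input to the “switch” cells, so the comparison inside the loop would be reversed. With a stochastic simulation in place of the closed-form stand-in, each evaluation would be a noisy estimate, which is one reason to accept a target window (eight to nine seconds) rather than an exact value.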
Fig 2. Intrinsic stability of the “stay” state varies with connection strengths. Average duration of the “stay” state in distinct networks with different synaptic connection strengths, without stimulus. Symbols indicate parameters of the five example networks of each class, with solid symbols representing fast-switching “entice-to-stay” networks and open symbols representing slow-switching “repel to leave” networks. Identical shapes indicate the pairs used for comparison.
https://doi.org/10.1371/journal.pcbi.1009012.g002
Fig 3 illustrates how application of the neutral stimulus to intrinsically fast-switching networks via excitation of neurons driving the “stay” state (upper left) realizes the “entice to stay” operation of a stimulus. The stimulus slows the network’s transitions from stay to leave from its stimulus-free duration of under two seconds such that the network achieves the desired mean bout duration (lower left). Conversely, application of the neutral stimulus to intrinsically slow-switching networks via excitation of neurons driving the “switch” state (upper right) realizes the “repel to leave” operation of a stimulus. In these networks, the stimulus quickens the network transition such that the network’s stimulus-free mean durations in the “stay” state of a few minutes are shortened to the desired mean bout duration (lower right). In these distinct manners, the same mean behavior to a given, neutral, external stimulus can arise from either class of network.
We do not simulate the behavior of the animal between bouts of tasting, but once the activity state indicates a switch of stimuli, we remove all stimulus input in the model to indicate movement away from the food source. We subsequently reduce input to the excitatory “switch”-producing cells to indicate completion of the movement and ensure commencement of the next sampling bout with the “stay” state. In the following sections the stimulus during the subsequent bout corresponds to the alternative tastant, which in many cases has a different palatability and so produces a spike train of a different rate, as indicated in Fig 4.
Fig 3. In response to a taste stimulus, neural activity in the two classes of network can lead to equal mean sampling durations. Left: Representing a neutral taste stimulus via excitatory input to “stay”-promoting neurons in a fast (entice-to-stay) network slows state transitions. Right: Representing a neutral taste stimulus via excitatory input to “switch”-promoting neurons in a slow (repel-to-leave) network quickens state transitions to produce the same switching dynamics.
https://doi.org/10.1371/journal.pcbi.1009012.g003
Taste preference behavior
Fig 4. Example of a taste preference task simulation. Animals alternate between a less palatable stimulus A and a more palatable stimulus B. In the intrinsically fast-switching “entice-to-stay” network, the increased palatability arises from greater excitatory input to the excitatory “stay” pool of neurons, whereas in the intrinsically slow-switching “repel-to-leave” network, the increase in palatability is produced by a reduction of excitatory input to the excitatory “switch” pool of neurons.
https://doi.org/10.1371/journal.pcbi.1009012.g004
In taste preference tests, consuming one stimulus comes at the expense of another, primarily because the animal has limited desire and time for consuming food or liquid. Therefore, as in any time-limited task, there is an inherent competition in that more time spent doing A means less time spent doing B. However, a key unanswered question is whether the nature of one taste stimulus impacts the neural and behavioral responses to a subsequent, second taste stimulus. Association with a more palatable stimulus via temporal proximity could, in principle, enhance the perceived palatability of an alternative stimulus. However, the qualitative results of foraging studies, as encoded in the Marginal Value Theorem, suggest the reverse is true. That is, during foraging, animals spend less time at a food source if there are better sources easily reached elsewhere. So, in a food preference task it is possible that higher stimulus palatability of one food source leads to a shortening of bout durations at the alternative food source. Therefore, we assessed whether such bout-to-bout competition between stimuli would arise in our networks.
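The qualitative prediction of the Marginal Value Theorem invoked above can be made concrete with a worked example. The sketch assumes a saturating gain function, g(t) = G(1 - exp(-t/tau)), and treats the average intake rate available elsewhere as a fixed parameter; both are illustrative assumptions, as are the function and parameter names. The theorem's marginal condition is that the forager should leave when the instantaneous intake rate g'(t) falls to that average rate.

```python
import math

def gain_rate(t, max_gain, tau):
    """Instantaneous intake rate g'(t) for g(t) = max_gain * (1 - exp(-t / tau))."""
    return (max_gain / tau) * math.exp(-t / tau)

def optimal_stay_time(max_gain, tau, environment_rate):
    """Time at which g'(t) has fallen to the average rate available elsewhere."""
    initial_rate = max_gain / tau
    if environment_rate >= initial_rate:
        return 0.0  # the source is never better than the alternatives
    return tau * math.log(initial_rate / environment_rate)

# Same food source, evaluated against a poorer and a richer set of alternatives.
stay_with_poor_alternatives = optimal_stay_time(10.0, 5.0, environment_rate=1.0)
stay_with_rich_alternatives = optimal_stay_time(10.0, 5.0, environment_rate=1.5)
print(f"poor alternatives: stay {stay_with_poor_alternatives:.2f} s")
print(f"rich alternatives: stay {stay_with_rich_alternatives:.2f} s")
```

Raising the value of the alternatives shortens the optimal stay at an unchanged source, which is the direction of bout-to-bout competition tested in the simulations that follow.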
Fig 4 illustrates how we simulate the taste preference task, such that the model switches between stimuli of different palatability (see Taste preference task simulations in the Methods section for full details). A key distinction between the two classes of models is apparent even before examining their performance, in the different representations of stimulus palatability. When testing the intrinsically fast-switching “entice-to-stay” networks, we simulate a more palatable tastant by increasing the excitatory input to excitatory neurons representing the “stay” activity state; this increases “stay” state durations. When testing the intrinsically slow-switching “repel-to-leave” networks, meanwhile, we simulate that same tastant by reducing the excitatory input to excitatory neurons representing the “switch” activity state; this also increases “stay” state durations. We identify the time spent in the “stay” state as the duration of active bouts of food sampling, and therefore proportional to the amount of food consumed (the basic behavioral measure of palatability). In these distinct manners, both classes of network represent taste stimuli of varying palatability via varying afferent firing rates.
To assess the impact of one stimulus on the other, we measured the mean bout durations
(our measure of stimulus palatability) of stimulus A as a function of the mean bout duration of
stimulus B. For these simulations, we fixed one of a pair of stimuli (stimulus A, baseline) and
varied the palatability of the second stimulus (stimulus B, varied).
The results of these analyses, which are shown in Fig 5, differentiate network type. For both classes of network, competition between bout durations arises: with increased palatability and bout duration of the varied stimulus B, the bout durations of the fixed stimulus A decrease. For the intrinsically slow-switching “repel-to-leave” networks (red points in Fig 5), the competition can produce a factor of two reduction in the sampling bout durations at the unchanged stimulus A. However, the competition has a much larger impact on the intrinsically fast-switching “entice-to-stay” networks (blue points in Fig 5), such that sampling bout durations at the fixed stimulus A can decrease by up to ten-fold over the same range of B variation.
To explain this more strongly competitive interaction between successive stimuli in the “entice-to-stay” network (Fig 5, lower panels, solid curves), it is worth considering differences in how the relative inputs arising from hedonic stimuli impact these networks. In the “entice-to-stay” network, a positively hedonic stimulus B is simulated in terms of large amounts of input to the “stay” neurons in the network. These neurons receive strong input and fire at high rates as they represent an animal sampling the positively hedonic stimulus for long durations. A long duration of relatively high firing rates, meanwhile, is exactly the network activity that is expected to deplete the reserve vesicle-pool and maximize adaptation at the circuit-level. Note that the converse does not apply: with a highly aversive stimulus B, and correspondingly lower firing rates of the “stay” neurons, the sampling durations in response to stimulus A do not increase significantly.
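The depletion argument above can be illustrated with a deliberately simplified, deterministic toy model. Everything in it is an assumption made for illustration (the exponential dependence of stay duration on depressed input, the single slow variable, all parameter values, and the function name); it is not the spiking network, and it is intended to reproduce only the direction of the competition, not its magnitude or asymmetry.

```python
import math

def simulate_alternating_bouts(input_a, input_b, n_cycles=50, burn_in=25,
                               t_min=1.5, gain=2.5, tau_slow=10.0,
                               use=1.0, gap=2.0):
    """Mean stay durations (A, B) for alternating bouts in a toy entice-to-stay model.

    d is the slow depression variable (reserve fraction) shared by both stimuli,
    because both drive the same "stay" neurons. Stay duration grows with the
    depressed input, input * d, evaluated at the start of each bout.
    """
    d = 1.0
    totals = {"A": 0.0, "B": 0.0}
    for cycle in range(n_cycles):
        for label, s in (("A", input_a), ("B", input_b)):
            duration = t_min * math.exp(gain * s * d)
            # During the bout d relaxes toward its activity-dependent steady state.
            d_steady = 1.0 / (1.0 + use * s)
            relax_rate = (1.0 + use * s) / tau_slow
            d = d_steady + (d - d_steady) * math.exp(-relax_rate * duration)
            # Between bouts there is no stimulus and d recovers toward 1.
            d = 1.0 - (1.0 - d) * math.exp(-gap / tau_slow)
            if cycle >= burn_in:
                totals[label] += duration
    n = n_cycles - burn_in
    return totals["A"] / n, totals["B"] / n

for input_b in (0.5, 1.0, 2.0):
    mean_a, mean_b = simulate_alternating_bouts(1.0, input_b)
    print(f"input B = {input_b}: mean stay at A = {mean_a:.2f} s, at B = {mean_b:.2f} s")
```

In this toy, a stronger input for B lengthens bouts at B, leaves the shared slow variable more depleted when the next bout at A begins, and so shortens bouts at A even though the input for A never changes.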
In the “repel-to-leave” network, meanwhile, the coincidence of extended duration and higher-than-baseline firing rate never arises. The increased hedonic value of B is instead achieved via reduced input to the “leave” neurons in the network, such that long sampling of a positively hedonic stimulus causes a reduction of synaptic depression in the circuit. One might expect that synaptic depression therefore induces competition in the opposite case, when stimulus B is highly unpalatable and the input to the “leave” neurons, and therefore their firing rates, are high, but in such a scenario the bout durations are too short for the reserve vesicle-pool to be depleted significantly.
Fig 5. Mean duration of sampling bouts in simulated taste preference tests depends on the alternative stimulus in a competitive manner. (Top) Behavior of intrinsically fast-switching (“entice-to-stay”) networks is shown as blue points, and intrinsically slow-switching “repel-to-leave” networks as red points. In both networks the greater the palatability of stimulus B, the lower the perceived palatability of the unchanging stimulus A, but the bout-to-bout competition is much stronger in the “entice-to-stay” network when stimulus B is highly palatable. (Bottom) Mean durations of the “stay” state are plotted as a function of the ratio of the inputs corresponding to the two stimuli, A and B. Each pair of values for a given ratio provides one data point in the top panel. The input for stimulus A is held fixed while the input for stimulus B is varied as a parameter to produce distinct values along the x-axis. The two classes of network are distinguished by whether the input excites the “stay” pool (entice-to-stay network, lower left) or the “leave” pool (repel-to-leave network, lower right) and whether, respectively, high input or low input for stimulus B reflects the most hedonic stimulus B. Competition is revealed as the duration of responses to stimulus A (solid line) changes in the absence of any change in inputs for stimulus A (most evident in the left panel).
https://doi.org/10.1371/journal.pcbi.1009012.g005
In conclusion, Fig 5 demonstrates differences between the behavior of the intrinsically fast-
switching “entice-to-stay” network and that of the intrinsically slow-switching “repel-to-leave”
network, with strong competition between bout durations only possible in the “entice-to-stay”
class of networks. Moreover, our modeling reveals an asymmetry in the nature of competition,
whereby a highly hedonic alternative stimulus reduces the duration of sampling bouts, but a
highly aversive alternative stimulus produces bout durations that are little different from those
with a neutral alternative stimulus, or no alternative at all.
Network transition dynamics
One of our modeling goals was to assess whether these two classes of network show systematic differences in neural activity, in addition to the behaviorally observable quantitative differences in the degrees of competition shown in Fig 5. One structural difference has to do with what activity in the circuit causes a transition away from stimulus sampling. In the case of the intrinsically fast-switching “entice-to-stay” networks, this transition is caused by a dip of input to the stay population, whereas in intrinsically slow-switching “repel-to-leave” networks the cause is a rise of input to the leave population. Given that neurons in any circuit are strongly interconnected in dynamic systems such as those described here, it is not clear that any change in one set of neurons could be measured independently of changes in other sets of neurons. However, we hypothesized that by aligning neural activity to the state-transition points we might make progress in uncovering distinctive signatures in the neural dynamics of our models that would allow us to distinguish the two classes of network using electrophysiological data.
Fig 6 presents such transition-aligned activity in examples of intrinsically fast-switching (entice-to-stay) networks (left) and intrinsically slow-switching (repel-to-leave) networks (right). While many differences between these examples (e.g., net firing rates) proved inconsistent across broader swaths of the two classes of models, two distinguishing features of the pre-transition activity are reliable. First, whereas in “entice-to-stay” networks (left), activity in the excitatory cells that cause the transition (E-switch, red) increases gradually during the 200 ms before leave decisions, in “repel-to-leave” networks (right) E-switch activity remains low until only a few tens of milliseconds before the decision (at which point there is a sudden and rapid increase). Second, the neurons inhibiting the transition (I-switch cells, purple) begin oscillating in the pre-transition period for “repel-to-leave” networks, but the same does not occur in “entice-to-stay” networks. These observations suggest measured neural activity (such as that acquired from gustatory cortex) could enable us to determine which class of network is active in a rodent’s brain, if aligned to the time-point ending a bout of sampling.
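Transition-aligned averaging of this kind can be sketched in a few lines. The sketch assumes that firing rates have already been binned at a fixed time step and that the indices of the bins in which bouts end are known; the function name and the synthetic data are hypothetical and serve only to show the alignment step.

```python
def transition_aligned_average(rate, transition_indices, n_before):
    """Average the n_before bins preceding each transition.

    rate: sequence of binned firing rates.
    transition_indices: bin indices at which a "stay" bout ends.
    Transitions with fewer than n_before preceding bins are skipped.
    """
    windows = [rate[t - n_before:t] for t in transition_indices if t >= n_before]
    if not windows:
        raise ValueError("no transition has a complete pre-transition window")
    n = len(windows)
    return [sum(window[i] for window in windows) / n for i in range(n_before)]

# Synthetic trace with a ramp before each of two transitions (bins 5 and 11).
rate = [0, 0, 1, 2, 3, 0, 0, 0, 2, 4, 6, 0]
average = transition_aligned_average(rate, [5, 11], n_before=3)
print(average)
```

Applied to recorded or simulated population rates, the shape of this average (a gradual rise over roughly 200 ms versus a late, abrupt rise) is the quantity that would be compared between the two classes of network.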
Discussion
It is worth comparing and contrasting the model we present here of stay-versus-switch decision making, based on a naturalistic task, with other models of decision making in systems neuroscience [11,13,14,25–31]. In particular, over the past few decades, much effort has been invested in studying winner-takes-all models of decision making that underlie the behavior of two-alternative forced choice tasks [25,29,32], usually in the context of perceptual decisions [33,34]. The framework of such behavioral tasks is trial-based, where the experimenter chooses the stimuli, and the subject makes a single, final choice in each trial. The relative preference for one alternative can be obtained after accumulating many discrete trials. Conversely, in naturalistic tasks such as food preference tests or foraging [1,35], the subject selects the stimulus, but the selection is not final. The choice is inextricably linked to the stimulus and is ongoing and dynamic.
Moreover, models of the two types of decision-making tasks must be distinct in two ways.
First, any competition between stimuli must occur across time in a naturalistic task, since only
PLOS COMPUTATIONAL BIOLOGY
A model of naturalistic decision making in preference tests
PLOS Computational Biology | https://doi.org/10.1371/journal.pcbi.1009012 September 23, 2021 10 / 25
one stimulus is present at any given time. Second, and importantly, the property of each of the
two stimuli being evaluated is likely to provide input to the decision-making circuitry via the
same neural pathway. In the most common perceptual decision-making paradigms, and therefore in
models thereof, the two competing inputs instead correspond to opposing stimuli (such as
directions of motion) and so excite different groups of neurons. In such perceptual tasks, the
competition between stimuli can arise via a direct inhibitory interaction between the
simultaneously active distinct groups of neurons. By contrast, in food preference tests the
hedonic, appetitive neural response to one stimulus must compete with a similar hedonic,
appetitive neural response to a temporally separated stimulus. Therefore, these naturalistic
decisions have some similarity to parametric working memory tasks [19,20,36,37], where subjects
compare two different levels of input via the same afferent pathway.
Our model is state-based, behaving similarly to models of bistable percepts [6,38] in that
the neural activity can transition at a stimulus-related but non-deterministic time from one
state to another, even if the stimulus does not change. Evidence for such noise-induced transi-
tions between otherwise stable states has been obtained across a number of neural systems
[5,39–44], even where trial-averaged activity may suggest a slower ramping akin to evidence
accumulation [10]. Indeed, a similar state transition may underlie the more commonly studied
perceptual decision-making tasks [9].
While our model does not designate the location of the simulated circuit, we assume all
neurons in our model are located in the same region and are likely to be found in anterior cin-
gulate cortex, where neurons represent value in self-paced decisions [45], or a subregion of the
ventromedial prefrontal cortex, such as the infralimbic cortex, whose activity is needed for an
animal to engage in feeding behavior [46].
The nature of competition
Decision making inevitably involves competition in that choices have opportunity cost. The
more often one alternative is chosen, the less often another alternative can be chosen. In win-
ner-take-all networks the competition is inherent in the network’s structure and a binary deci-
sion arises on a trial-by-trial basis [14,25]. That is, at the level of neural activity, increased
spiking of one group of cells inhibits the spiking of other groups [11,14]. At the level of behav-
ior one choice terminates the trial and prevents any other choice.
By contrast, in a food preference test implicit competition can arise in the absence of any
correlation between the neural activity in response to one stimulus versus the other. This is
simply due to the finite amount of time available: if one stimulus is extremely hedonic, the
animal will rarely leave it, and so will inevitably consume less of any alternative stimulus.
That is, even if the bout duration of an alternative stimulus is unchanged, an animal would
leave a more hedonic stimulus less frequently to visit that alternative, and so would consume
less of the alternative than if that alternative were paired with a less hedonic stimulus.
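The arithmetic behind this implicit competition can be illustrated with a minimal sketch. It assumes strict alternation between two stimuli (as in our simulated preference task) and treats the mean bout durations and the gap between bouts as given numbers; the function name and the 30 s value are illustrative assumptions, not results of the model.

```python
def session_shares(bout_a, bout_b, gap=0.1):
    """Fractions of session time spent sampling A and B under strict alternation.

    bout_a, bout_b: mean bout durations in seconds.
    gap: time between bouts in seconds (illustrative value).
    """
    cycle = bout_a + bout_b + 2.0 * gap  # one A bout, one B bout, two gaps
    return bout_a / cycle, bout_b / cycle


# Stimulus B has the same 7.5 s mean bout in both sessions; only A differs.
_, share_b_with_equal_a = session_shares(7.5, 7.5)
_, share_b_with_better_a = session_shares(30.0, 7.5)
print(share_b_with_equal_a > share_b_with_better_a)  # True
```

Even though the bout duration for B is unchanged, its share of the session falls when A is sampled for longer, which is the implicit competition described above.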
Theoretically, it is possible for a highly hedonic stimulus to boost the hedonic value of an
alternative stimulus while still maintaining such implicit competition between the total
amounts of the two foods being consumed. Such behavior could arise if, for example, the neural
activity were dominated by a slow synaptic facilitation such that the impact of a highly
hedonic response lingered so as to boost responses to later stimuli. However, to align with the
qualitative nature of the marginal value theorem [2,47] (one location is more easily left if the
alternatives are better), we incorporated synaptic depression as the dominant effect of the
network on a timescale of many seconds.
Fig 6. Neural activity preceding transitions to switch away to a new stimulus. A transition is
detected when synaptic output of the E-switch neurons significantly exceeds that of the E-stay
neurons, so the transition time (0 ms) is inevitably marked by a spike in activity of the
E-switch neurons. A decline in activity of neurons that promote the “stay” (E-stay, blue) or
inhibit the switch (I-switch, purple) combined with a rise in activity of neurons that promote
the switch (E-switch, red) or inhibit the stay (I-stay, yellow) precedes the detected
transition. Left: the inherently fast-switching, “entice-to-stay” networks. Right: the
inherently slow-switching “repel-to-leave” networks.
https://doi.org/10.1371/journal.pcbi.1009012.g006
Synaptic depression—an inevitable consequence of synaptic transmission as vesicles are
depleted—causes a reduction in the hedonic response to any stimulus that follows a highly
hedonic stimulus. In this manner, synaptic depression accentuates competition between two
taste stimuli in a food preference test, because the mean bout duration when sampling the less
hedonic stimulus is shortened in the presence of a highly hedonic alternative stimulus. Such
competition based on a trace-memory of prior stimuli is in addition to the implicit competi-
tion already present in a food preference task. We find that the reduction in bout durations in
the presence of a highly hedonic alternative is most pronounced, with a ten-fold reduction pos-
sible, in the “entice to stay” networks we simulated here.
Our goal here is to assess what behavioral phenomena arise from a very simple decision-
making circuit when common neuronal features are incorporated. The short-term synaptic
depression we include is one of many biological processes that sculpt network activity over
multiple timescales. Short-term depression allows our network to be reactive to the recent
past, so could account for anhedonia in a short period following a strongly rewarding stimulus
[48]. However, it is insufficient to produce the longer-term learning needed for an animal to
anticipate the future based on past experience, as in the anticipatory contrast effect [49]. It is
intriguing, however, that the basic phenomenon of the devaluing of a rewarding stimulus (such
as saccharin) in circumstances when a more highly rewarding or addictive stimulus is available
or anticipated (such as cocaine or sucrose) [50–52] arises in our relatively simple circuit.
Our networks reproduce many effects underlying the marginal value theorem [2,47], which
provides a framework for behavior observed in the foraging literature [35]. According to the
marginal value theorem, an animal stays at a patch of food until its rate of return diminishes to
the average rate of return obtained by moving from site to site across the environment. While we
do not simulate an experiment with reduced stimulus over time during a bout, our model does
reproduce some of the qualitative features essential in any neural circuit model of such foraging
dynamics. First, the more hedonic/palatable the current stimulus, the greater the bout duration
at that stimulus. Second, the more hedonic/palatable the alternative stimulus, the shorter the
bout duration at the first stimulus. Third, the greater the time in the switching period between
stimuli, the less the impact of a more hedonic alternative. Importantly, the third effect mitigates
the second effect; longer times spent between stimuli will mitigate shorter sampling bouts in the
presence of superior alternatives. Such reluctance to switch to more favorable alternatives arises
from the cost of switching in value-based models, because animals do not accumulate reward
while traveling between samples. Here we show that such behaviors, desirable from the point of
view of optimality, arise qualitatively in a simple circuit with depressing synapses.
Methods
Properties of model neurons
Individual neurons were simulated with an exponential leaky integrate-and-fire model [53]
following the equation:
$$
C_m \frac{dV_m}{dt} = \frac{E_l - V_m + \Delta_{th} \exp\left(\frac{V_m - V_{th}}{\Delta_{th}}\right)}{R_m}
+ G_{syn} S_I \left(E_{revI} - V_m\right) + G_{syn} S_E \left(E_{revE} - V_m\right)
+ G_{ref} \left(E_K - V_m\right) + G_{extI} \left(E_{revI} - V_m\right) + G_{extE} \left(E_{revE} - V_m\right)
$$
where V_m is the membrane potential, C_m is the total membrane capacitance, E_l is the leak
potential, R_m is the total membrane resistance, Δ_th is the spiking range, V_th is the spiking
threshold, S is the synaptic input variable, G_syn and E_rev are the maximal conductance and
reversal potential for synaptic connections, G_ref is the dynamic refractory conductance, E_K is
the potassium reversal potential, and G_ext is the input conductance. The “E” and “I” subscripts
denote the variables specific to excitatory and inhibitory channels, respectively (e.g. S_E and
E_revE are the synaptic input and reversal variables for excitatory channels; S_I and E_revI are
the corresponding inhibitory variables). This equation simulates the neuron’s membrane potential
until V_m > V_spike, at which point the neuron spikes.
When a neuron spikes, V_m is set to the V_reset value. Additionally, the neuron’s refractory
conductance, synaptic output, s, and synaptic depression (noted as D) are updated according
to the equations:
$$
G_{ref} \mapsto G_{ref} + \Delta G_{ref}
$$
$$
s \mapsto s + p_R D_{fast} (1 - s)
$$
$$
D_{fast} \mapsto D_{fast} (1 - p_R)
$$
where ΔG_ref is the increase in refractory conductance, and p_R is the vesicle release probability
following a spike.
In the timestep immediately following a spike, the neuron’s membrane potential continues
to follow the exponential leaky integrate-and-fire model equation. In this equation the separate
excitatory (S_E,i) and inhibitory (S_I,i) synaptic inputs for cell i are obtained from the sum of
all presynaptic outputs multiplied by the corresponding connection strengths, W_ij, from neurons
j (see Network architecture and connections):
$$
S_i = \sum_j W_{ij} s_j ,
$$
each of which decays with the appropriate (excitatory or inhibitory) synaptic gating time
constant τ_S according to:
$$
\frac{ds_i}{dt} = -\frac{s_i}{\tau_S} .
$$
Likewise, refractory conductance decays with the time constant τ_ref according to:
$$
\frac{dG_{ref}}{dt} = -\frac{G_{ref}}{\tau_{ref}}
$$
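As a minimal sketch of how the membrane equation, the spike reset, and the refractory conductance fit together, the following integrates a single model neuron with the forward Euler method. It uses the parameter values given in this section, but it is not the published Matlab code: synaptic input is omitted, and the excitatory input conductance is held constant purely for illustration (both are simplifying assumptions, as is the function name).

```python
import math

# Parameter values quoted in the Methods text, converted to SI units.
C_M = 100e-12          # membrane capacitance (F)
R_M = 100e6            # membrane resistance (ohm)
E_L = -70e-3           # leak potential (V)
E_K = -80e-3           # potassium reversal potential (V)
E_REV_E = 0.0          # excitatory reversal potential (V)
V_TH = -50e-3          # threshold of the exponential term (V)
DELTA_TH = 2e-3        # spiking range (V)
V_SPIKE = 20e-3        # level at which a spike is registered (V)
V_RESET = -80e-3       # reset potential (V)
TAU_REF = 25e-3        # refractory conductance time constant (s)
DELTA_G_REF = 12.5e-9  # refractory conductance step (S)
DT = 0.1e-3            # simulation timestep (s)


def simulate_elif(g_ext_e, duration=1.0):
    """Return spike times (s) of one neuron driven by a constant conductance.

    g_ext_e: excitatory input conductance in siemens, held constant here
    as an illustrative simplification of the Poisson-driven input.
    """
    v = E_L
    g_ref = 0.0
    spike_times = []
    for step in range(int(round(duration / DT))):
        # Leak plus exponential spike-generating current, divided by R_m
        leak = (E_L - v + DELTA_TH * math.exp((v - V_TH) / DELTA_TH)) / R_M
        current = leak + g_ref * (E_K - v) + g_ext_e * (E_REV_E - v)
        v += DT * current / C_M
        g_ref -= DT * g_ref / TAU_REF  # exponential decay between spikes
        if v > V_SPIKE:
            spike_times.append(step * DT)
            v = V_RESET
            g_ref += DELTA_G_REF
    return spike_times
```

For example, `simulate_elif(10e-9)` drives the cell with a conductance equal to its leak conductance, which is enough to produce repetitive firing, whereas `simulate_elif(0.0)` leaves the cell at rest.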
The G_ext input conductance serves as both noisy-background and stimulus inputs in the same
manner. Inputs were modeled as Poisson spike trains with rates r_noise and r_stimulus, which
produce input spikes (from all sources) at timepoints {t_sp}. Please note, the noisy-background
includes both excitatory and inhibitory spiking input (included in G_extE and G_extI,
respectively); the r_noise parameter specifies the rate for both excitatory and inhibitory
background noise. The input conductance values for a given timepoint, t, are updated as:
$$
G_{ext} \mapsto G_{ext} + \Delta G_{ext}\, \delta(t - t_{sp})
$$
where the conductance increases by ΔG_ext at the time of each input spike. The input
conductance otherwise decays with the time constant τ_ext according to:
$$
\frac{dG_{ext}}{dt} = -\frac{G_{ext}}{\tau_{ext}} .
$$
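The step-and-decay input conductance can be sketched as follows. This is an illustrative reimplementation, not the published code: the Poisson train is approximated by an independent Bernoulli draw in each timestep (reasonable only when the rate multiplied by the timestep is well below one), and the function name, seed, and default arguments are assumptions.

```python
import random


def simulate_input_conductance(rate_hz, duration=1.0, dt=1e-4,
                               tau_ext=3.5e-3, delta_g=1e-9, seed=0):
    """Return the input conductance trace (S) for one Poisson input train.

    Defaults follow the excitatory values in the text: tau_ext = 3.5 ms,
    delta_g = 1 nS, dt = 0.1 ms.
    """
    rng = random.Random(seed)
    g = 0.0
    trace = []
    for _ in range(int(round(duration / dt))):
        g -= dt * g / tau_ext              # exponential decay
        if rng.random() < rate_hz * dt:    # input spike in this timestep
            g += delta_g                   # step increase
        trace.append(g)
    return trace
```

Under these assumptions the long-run mean conductance is approximately the product of the input rate, ΔG_ext and τ_ext, which gives a quick sanity check on any chosen rate.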
The cellular parameters with values specific to excitatory neurons (i.e. that differ from
inhibitory values) are: E_revE = 0 mV, τ_s = 50 ms, and τ_ext = 3.5 ms. The complementary values
for inhibitory neurons are: E_revI = −70 mV, τ_s = 10 ms, and τ_ext = 2 ms. The remaining
parameters applicable to both excitatory and inhibitory neurons are: G_syn = 10 nS, p_R = 0.1,
τ_fast = 300 ms, τ_slow = 7 s, p_slow = 0.5, E_l = −70 mV, E_K = −80 mV, V_reset = −80 mV,
R_m = 100 MΩ, C_m = 100 pF, V_spike = 20 mV, ΔG_ext = 1 nS, V_th = −50 mV, Δ_th = 2 mV,
τ_ref = 25 ms, and ΔG_ref = 12.5 nS. The Poisson spike-train parameters r_noise and r_stimulus
are described in the next section. Neurons were simulated with a simulation timestep
dt = 0.1 ms.
Synaptic depression
We modeled synaptic depression using two separate timescales, with variables denoted D_slow
and D_fast (D_fast appears in the spike-update equations above). These two variables reflect,
respectively, the fraction of the maximum number of vesicles available in the reserve pool and
the release-ready pool. Following a spike, the variables recover to a value of one with
different timescales, because vesicles regenerate and are replenished slowly in the reserve
pool, but may dock and become release-ready much more quickly once available in the reserve
pool (Fig 7).
Specifically, D_slow represents the ratio of currently available reserve-pool vesicles out of
the maximum possible, that is D_slow = N_pool / N_max (Fig 7A). These dock quickly at empty
docking sites on the timescale τ_fast (Fig 7C), but are replaced slowly on the timescale τ_slow.
D_fast represents the ratio of docked vesicles out of total docking sites, that is
D_fast = N_docked / N_sites (Fig 7B). We also incorporate the constant parameter, f_D = 0.05,
which is equal to the ratio of the number of docking sites to the maximum size of the reserve
pool of vesicles, f_D = N_sites / N_max. Only docked vesicles can be released immediately
following a spike, such that upon each spike we update D_fast ↦ D_fast (1 − p_R), where p_R is
the vesicle release probability.
During sustained spiking, the fast docking can maintain a firing-rate dependent supply of
docked vesicles until the reserve pool (Fig 7A) depletes. Vesicles dock at empty sites (Fig 7C)
according to:
$$
\frac{dD_{fast}}{dt} = \frac{D_{slow} - D_{fast}}{\tau_{fast}}
$$
Reserve-pool vesicles fill the empty docking sites on the fast timescale τ_fast. On the other
hand, the reserve pool regenerates much more slowly according to:
$$
\frac{dD_{slow}}{dt} = \frac{1 - D_{slow}}{\tau_{slow}} - f_D \frac{D_{slow} - D_{fast}}{\tau_{fast}}
$$
The first term represents the reserve-pool vesicle regeneration on timescale τ_slow. The second
term, f_D (D_slow − D_fast) / τ_fast, accounts for the vesicles lost due to docking (Fig 7C).
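The interplay of the two pools can be sketched by integrating the two depression equations for a single synapse. This is a minimal illustration using the parameter values in the text; the presynaptic cell is assumed to fire perfectly regularly at a fixed rate, which is a simplification of the irregular spiking in the network, and the function name is hypothetical.

```python
def simulate_depression(rate_hz, duration, dt=1e-4, p_r=0.1, f_d=0.05,
                        tau_fast=0.3, tau_slow=7.0):
    """Return (D_fast, D_slow) after regular presynaptic spiking.

    rate_hz: presynaptic firing rate (Hz), assumed perfectly regular.
    duration: simulated time (s). Both pools start full (value 1).
    """
    d_fast, d_slow = 1.0, 1.0
    spike_every = int(round(1.0 / (rate_hz * dt))) if rate_hz > 0 else 0
    for step in range(1, int(round(duration / dt)) + 1):
        docking = (d_slow - d_fast) / tau_fast       # refill of docking sites
        d_fast += dt * docking
        d_slow += dt * ((1.0 - d_slow) / tau_slow - f_d * docking)
        if spike_every and step % spike_every == 0:
            d_fast *= 1.0 - p_r                      # release on each spike
    return d_fast, d_slow
```

Comparing, say, `simulate_depression(10.0, 30.0)` with `simulate_depression(40.0, 30.0)` shows the docked fraction settling at a lower level for the higher rate, while the reserve pool depletes more slowly and recovers on the timescale τ_slow.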
Our model reflects the empirical evidence showing the effects of synaptic depression at
short timescales on the order of milliseconds, and longer timescales on the order of seconds
[54,55]; depression timescales on the order of minutes have even been reported in non-mammalian
animals [56]. Additionally, recent evidence [57] directly supports our fast-depression
mechanism where available vesicles quickly refill empty docking sites. Our model provides a
coherent mechanism for both fast-acting and long-lasting synaptic depression effects.
Network architecture and connections
Each network consists of 250 individual neurons, split into two populations of 100 excitatory
cells (i.e., “stay” and “switch” populations, E_stay and E_switch) and two populations of 25
inhibitory cells (I_stay and I_switch). Fig 1A depicts the basic architecture of all networks
simulated in this paper, with connections within and between populations indicated. For each
pair of connected populations (or for self-connected excitatory populations), pairs of cells
were connected probabilistically with a probability P(connection) = 0.5. Connection strengths
were symmetric across “stay” and “switch” populations, but depended on whether presynaptic or
postsynaptic cells were excitatory or inhibitory, as indicated in Table 1.
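The probabilistic wiring can be sketched as below. This is an illustrative construction rather than the published code: the ordering of cells within the weight matrix, the treatment of self-connections (not excluded here), and the function and variable names are assumptions. The pathways follow the model summary (recurrent E-to-E and I-to-E within a pool, E-to-I across pools), and the example strengths are those tabulated for the "Circle (open)" network.

```python
import random

# Index ranges for the four populations (250 cells in total); the ordering
# is an arbitrary choice made for this sketch.
POOLS = {
    "E_stay": range(0, 100),
    "I_stay": range(100, 125),
    "E_switch": range(125, 225),
    "I_switch": range(225, 250),
}

# (presynaptic pool, postsynaptic pool, key into the strengths dictionary)
PATHWAYS = [
    ("E_stay", "E_stay", "EE"), ("E_switch", "E_switch", "EE"),
    ("I_stay", "E_stay", "IE"), ("I_switch", "E_switch", "IE"),
    ("E_stay", "I_switch", "EI"), ("E_switch", "I_stay", "EI"),
]


def build_weights(strengths, p_connect=0.5, seed=0):
    """Return a 250 x 250 matrix W with W[i][j] the strength from cell j to i."""
    rng = random.Random(seed)
    n = 250
    w = [[0.0] * n for _ in range(n)]
    for pre, post, key in PATHWAYS:
        for i in POOLS[post]:
            for j in POOLS[pre]:
                if rng.random() < p_connect:  # connect each pair with p = 0.5
                    w[i][j] = strengths[key]
    return w


# Example strengths: W_EE, plus the E-to-I and I-to-E values listed for the
# "Circle (open)" exemplar network.
weights = build_weights({"EE": 0.0405, "EI": 0.2955, "IE": 12.3747})
```

With this layout, the synaptic input to cell i is the sum over j of `weights[i][j]` multiplied by the presynaptic output s_j, computed separately for excitatory and inhibitory presynaptic cells.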
Code availability
The code used to simulate our model is freely available online at
https://github.com/johnksander/naturalistic-decision-making
Fig 7. Synaptic depression model. Vesicles are either in the reserve pool (a) or at docking sites (b). Available vesicles
dock quickly at empty docking sites (c), but take much longer to regenerate once their internal neurotransmitter is
released (d).
https://doi.org/10.1371/journal.pcbi.1009012.g007
Table 1. Model neuron parameters.
| Name | Description | Value |
|---|---|---|
| E_rev | Reversal potential | Excitatory cells: 0 mV; inhibitory cells: −70 mV |
| E_l | Leak potential | −70 mV |
| E_K | Potassium potential | −80 mV |
| R_m | Membrane resistance | 100 MΩ |
| C_m | Membrane capacity | 100 pF |
| τ_s | Synaptic gating timescale | Excitatory cells: 50 ms; inhibitory cells: 10 ms |
| V_reset | Reset membrane potential | −80 mV |
| V_spike | Spike threshold | 20 mV |
| τ_ext | Noisy-background conductance timescale | Excitatory cells: 3.5 ms; inhibitory cells: 2 ms |
| G_syn | Synaptic max conductance | 10 nS |
| τ_fast | Fast depression timescale (Fig 7C) | 300 ms |
| τ_slow | Slow depression timescale (Fig 7D) | 7 s |
| p_R | Vesicle release probability | 0.1 |
| f_D | Ratio of max docked vesicles to max pooled vesicles | 0.05 |
| D_fast | Ratio of docked vesicles out of total possible (Fig 7B) | N_docked / N_sites |
| D_slow | Ratio of reserve-pool vesicles out of the total possible (Fig 7A) | N_pool / N_max |
| ΔG_ext | Conductance step-increase to external input spike | 1 nS |
| V_th | Exponential spiking-term threshold | −50 mV |
| Δ_th | Spiking range | 2 mV |
| τ_ref | Refractory conductance timescale | 25 ms |
| ΔG_ref | Step change in refractory conductance | 12.5 nS |
| dt | Simulation timestep | 0.1 ms |
| A | Model summary |
|---|---|
| Populations | Stay: 1 excitatory, 1 inhibitory; Leave: 1 excitatory, 1 inhibitory |
| Connectivity | Within-pool (stay or leave): I-to-E and recurrent E-to-E; Cross-pool (stay-to-leave or leave-to-stay): E-to-I |
| Neuron model | Exponential Leaky Integrate and Fire (ELIF) with dynamic refractory conductance |
| Synapse model | Conductance based, step increase followed by exponential decay |
| Plasticity | Depression with two timescales |
| Input | Noisy background input: fixed-rate Poisson spike trains to all cells; Stimuli: Poisson spike trains to E-stay and E-leave cells |
| Measurements | Spike trains, activity state-durations, connection strengths |

B Populations
| Name | Elements | Size |
|---|---|---|
| E-stay | ELIF neurons | 100 |
| I-stay | ELIF neurons | 25 |
| E-leave | ELIF neurons | 100 |
| I-leave | ELIF neurons | 25 |
| Noisy background input | Poisson trains | 500 |
| Aversive stimulus | Poisson trains | 100 |
| Hedonic stimulus | Poisson trains | 100 |
C Connectivity
| Name | Source | Target | Pattern |
|---|---|---|---|
| E-to-I | E-stay, E-leave | I-leave, I-stay | Random, p = 0.5, model-dependent fixed weight: Circle (open) 0.2955; Circle (closed) 0.0833; Square (open) 0.4242; Square (closed) 0.0909; Up-triangle (open) 0.75; Up-triangle (closed) 0.75; Diamond (open) 0.4773; Diamond (closed) 0.4621; Down-triangle (open) 0.4697; Down-triangle (closed) 0.1742 |
| I-to-E | I-stay, I-leave | E-stay, E-leave | Random, p = 0.5, model-dependent weight: Circle (open) 12.3747; Circle (closed) 12.3747; Square (open) 9.4939; Square (closed) 9.6192; Up-triangle (open) 8.4919; Up-triangle (closed) 3.6071; Diamond (open) 9.4939; Diamond (closed) 3.6071; Down-triangle (open) 8.8677; Down-triangle (closed) 4.2333 |
| E-to-E | E-stay, E-leave | E-stay, E-leave | Random, p = 0.5, fixed weight, W_EE = 0.0405 |
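D Neuron and Synapse Model
| Name | Laf neuron |
|---|---|
| Type | Dynamic leaky integrate-and-fire with dynamic refractory conductance |
| Subthreshold dynamics | $C_m \frac{dV_m}{dt} = \frac{E_l - V_m + \Delta_{th} \exp\left(\frac{V_m - V_{th}}{\Delta_{th}}\right)}{R_m} + G_{syn} S_I (E_{revI} - V_m) + G_{syn} S_E (E_{revE} - V_m) + G_{ref} (E_K - V_m) + G_{extI} (E_{revI} - V_m) + G_{extE} (E_{revE} - V_m)$; $\frac{dG_{ref}}{dt} = -\frac{G_{ref}}{\tau_{ref}}$; $\frac{dG_{ext}}{dt} = -\frac{G_{ext}}{\tau_{ext}}$ |
| Spiking | If V_m > V_spike: 1. Emit spike with timestamp t; 2. G_ref ↦ G_ref + ΔG_ref; 3. V_m ↦ V_reset |
| Synapse | $S_i = \sum_j W_{ij} s_j$. Following a spike by neuron i: $s_i \mapsto s_i + p_R D_{fast,i} (1 - s_i)$; $D_{fast,i} \mapsto D_{fast,i} (1 - p_R)$. Between spikes: $\frac{ds_i}{dt} = -\frac{s_i}{\tau_S}$; $\frac{dD_{fast,i}}{dt} = \frac{D_{slow,i} - D_{fast,i}}{\tau_{fast}}$; $\frac{dD_{slow,i}}{dt} = \frac{1 - D_{slow,i}}{\tau_{slow}} - f_D \frac{D_{slow,i} - D_{fast,i}}{\tau_{fast}}$ |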
F Input
| Type | Description |
|---|---|
| All external spiking input | Input spikes increase conductance: G_ext ↦ G_ext + ΔG_ext. Conductance G_ext decays: $\frac{dG_{ext}}{dt} = -\frac{G_{ext}}{\tau_{ext}}$ |
| Background noisy input | One excitatory spike-train per neuron, and one inhibitory spike-train per neuron (all 1540 Hz Poisson spike-trains). |
| Aversive stimulus | One excitatory spike-train per E-leave neuron (fixed rate). |
| Hedonic stimulus | One excitatory spike-train per E-stay neuron (fixed rate). |

G Measurements
Active state: when mean difference between E-stay and E-leave excitatory synaptic gating exceeds 0.02 for 50 ms (consecutively).
State duration/sampling duration: time between state transitions (i.e. transitioning from E-stay to E-leave active state).
https://doi.org/10.1371/journal.pcbi.1009012.t001
Network states and stimuli
A network’s active state was evaluated by comparing the mean values of synaptic output, s_E,
averaged across all excitatory cells in each of the two excitatory populations. Specifically,
when the mean output of the previously less active excitatory population exceeded that of the
previously more active excitatory population by a threshold of 0.02 consistently for 50 ms, we
recorded a state change. The threshold of 0.02 was chosen as a value that robustly captured
the intended quasi-bistable behavior across multiple networks. If a simulation produced one
second of consecutive timepoints without an active pool (i.e., because the difference in
activity of the two excitatory populations was too small to produce above-threshold differences
in synaptic output), the simulation was terminated and the corresponding parameter set was not
used for further analysis. We only analyzed simulations of networks exhibiting quasi-bistability.
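The state-change criterion can be sketched as a simple scan over the two population-averaged synaptic output traces. This is a minimal illustration, not the published implementation: the function name, the choice of initial state, and the decision to report the timestep at which the 50 ms criterion is first satisfied are all assumptions.

```python
def detect_transitions(s_stay, s_switch, dt=1e-4, threshold=0.02,
                       hold=0.05, initial="stay"):
    """Return a list of (step index, new state) for each detected state change.

    s_stay, s_switch: mean synaptic output of the two excitatory populations
    at each timestep. A change is recorded once the previously less active
    population has exceeded the other by `threshold` for `hold` seconds
    of consecutive timesteps.
    """
    hold_steps = int(round(hold / dt))
    state = initial
    run = 0  # consecutive timesteps meeting the criterion
    transitions = []
    for k, (stay, switch) in enumerate(zip(s_stay, s_switch)):
        diff = (switch - stay) if state == "stay" else (stay - switch)
        run = run + 1 if diff > threshold else 0
        if run >= hold_steps:
            state = "switch" if state == "stay" else "stay"
            transitions.append((k, state))
            run = 0
    return transitions
```

Requiring the difference to persist for the full hold period means that brief excursions of the less active population, shorter than 50 ms, do not register as state changes.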
In our simulations of preference tests, we did not simulate the animal’s behavior in between
bouts of sampling a stimulus. Once the excitatory neurons in the “switch” population (E-
switch cells) were recorded as more active than those in the “stay” population, using the
threshold mentioned above, we removed the stimulus input to the network. 100 ms later, we
induced a subsequent transition back to the “stay” state to represent the animal initiating a
new bout of stimulus sampling. The transition back to sampling was accomplished by halving
the noisy background input to E-switch cells until the network transitioned again to the “stay”
state. At all other times in simulations, the noisy background input remained constant. Once a
transition to the “stay” state was recorded (by excitatory cells in the “stay” population being
more active than those in the “switch” population) input stimulus was applied to indicate the
next bout of sampling.
Parameter sweep and example networks
We obtained a set of exemplar networks by varying I-to-E and E-to-I connection strengths
and recording the average active-state durations. Networks were simulated without stimulus
(but with noisy-background input) for 1500 seconds. We then chose five pairings of
intrinsically fast-transitioning and intrinsically slow-transitioning networks from this
parameter space. The fast and slow networks represent the “entice-to-stay” and “repel-to-leave”
decision-making accounts, respectively. Networks were paired across the two classes by
selecting examples with connection strengths as closely matched as possible, while average
state durations were at either end of the wide range recorded. Typically, fast networks
transitioned in a few seconds, whereas slow networks transitioned in a few minutes. We
performed all subsequent simulations with 10 exemplar networks (five pairs) as represented by
the symbols in Fig 2B.
For each of the 10 exemplar networks, we determined the level of stimulus necessary to
equate the baseline sampling behavior, producing an average duration of 7.5 seconds for the
“stay” state in each case. Increasing the strength of hedonic stimuli (increasing E-stay spiking
input) slowed the fast network transitions, while increasing the strength of aversive stimuli
(increasing E-leave spiking input) quickened slow network transitions. The specific intensities
were found with Matlab’s fminbnd() optimizer function. We will refer to these stimulus values
as the networks’ “baseline stimuli”, as we used these values as reference points for subsequent
simulations.
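The search for a baseline stimulus can be sketched as a bounded one-dimensional minimization of the squared difference between the mean stay duration and the 7.5 s target. The sketch below is illustrative only: a plain golden-section search stands in for Matlab’s fminbnd(), and `toy_mean_stay_duration` is a hypothetical smooth stand-in for the mean duration that would, in practice, be estimated from a full network simulation.

```python
import math


def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:                # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:                      # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0


def toy_mean_stay_duration(stimulus_rate):
    """Hypothetical stand-in: stay duration (s) grows with hedonic input."""
    return 2.0 * math.exp(0.01 * stimulus_rate)


TARGET = 7.5  # desired mean duration of the "stay" state (s)
baseline = golden_section_minimize(
    lambda r: (toy_mean_stay_duration(r) - TARGET) ** 2, 0.0, 500.0)
```

With a stochastic network simulation in place of the toy function, the objective would be noisy, so in practice each evaluation would need a sufficiently long simulation for the mean duration to be estimated reliably.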
Taste preference task simulations
Individual taste preference task simulations lasted 1500 seconds total. Each simulation com-
pared sampling bout durations in response to two stimuli (A and B) each with a fixed value
across the session. The input representing stimulus A was equal to the network’s baseline
stimulus value, and constant across sessions. The input representing stimulus B was systemati-
cally varied across sessions, such that in each session it was fixed at a value between zero and
sixty times the input for stimulus A. For a given network the stimulus inputs targeted the same
population for all sessions. That is, both stimuli were always hedonic for intrinsically fast-
switching “entice-to-stay” networks, and both stimuli were always aversive for intrinsically
slow-switching “repel-to-leave” networks. The stimulus value (A or B) alternated after each
stay-to-leave transition. Simulations always began with stimulus A during the first stay-state,
followed by stimulus B during the second stay-state, etc. This represents the animal alternating
between the available stimuli after “leave” decisions.
Supporting information
S1 Fig. Firing rates of active cells in the “stay” state as a function of network connection
strengths. (Left) Firing rate of excitatory cells is in the vicinity of 10 Hz for all parameters lead-
ing to two quasi-stable network states. (Right) Firing rate of inhibitory cells varies strongly as a
function of parameters, but note that the range of rates is similar for entice-to-stay networks
(left edge) and repel-to-leave networks (right edge) so firing rate is not a clear indicator of type
of network.
(TIFF)
S2 Fig. State durations in an alternative network to indicate robustness of the system to
incorporation of additional connections. The results are produced for a circuit depicted in
A) with additional connections (excitatory between excitatory pools and inhibitory both
within and between inhibitory pools) all at half the strength of the excitatory within-pool con-
nections of the standard network. B) Results indicate the same network responses as in the
original network are possible given small compensatory shifts in the inhibitory-to-excitatory
and excitatory-to-inhibitory connections (compare main manuscript, Fig 2).
(EPS)
S3 Fig. Competitive interaction in taste preference tests. All network results are shown, with
marker symbols identifying the network parameters as depicted in Fig 2 panel B (main text
Fig). Results for all of the intrinsically fast-switching (entice-to-stay) networks exhibited a
stronger impact on the duration of bouts at the stimulus of fixed input, A, as the strength of
input from the alternative stimulus, B, was adjusted to reflect more hedonic input.
(TIFF)
S4 Fig. Impact of varying one stimulus input on sampling durations. In all panels, input
corresponding to stimulus B is adjusted across sessions, while input corresponding to stimulus
A is held fixed. Mean bout durations in response to stimulus B are shown as dashed lines and
increase with increasing input in entice-to-stay networks (left panels), while they decrease with
increasing input in repel-to-leave networks (right panels). Mean bout durations in response to
stimulus A (which never changes across sessions) are shown as solid lines. Competition arises
as durations for A (solid lines) decrease when durations for B (dashed lines) increase, with a
much stronger effect in all of the entice-to-stay networks (left panels).
(TIFF)
Author Contributions
Conceptualization: Donald B. Katz, Paul Miller.
Data curation: John Ksander.
Formal analysis: John Ksander.
Funding acquisition: Donald B. Katz, Paul Miller.
Methodology: Paul Miller.
Software: John Ksander.
Supervision: Donald B. Katz, Paul Miller.
Visualization: John Ksander.
Writing – original draft: John Ksander.
Writing – review & editing: Donald B. Katz, Paul Miller.
References
1. Pearson JM, Watson KK, Platt ML. Decision making: the neuroethological turn. Neuron. 2014; 82
(5):950–65. https://doi.org/10.1016/j.neuron.2014.04.037 PMID: 24908481; PubMed Central PMCID:
PMC4065420.
2. Charnov EL. Optimal Foraging, the Marginal Value Theorem. Journal of Theoretical Population Biology.
1976; 9(2):129–34. https://doi.org/10.1016/0040-5809(76)90040-x PMID: 1273796
3. Cowie RJ. Optimal foraging in great tits (Parus major). Nature. 1977; 268:137–9.
4. Hayden BY, Pearson JM, Platt ML. Neuronal basis of sequential foraging decisions in a patchy environ-
ment. Nat Neurosci. 2011; 14(7):933–9. https://doi.org/10.1038/nn.2856 PMID: 21642973; PubMed
Central PMCID: PMC3553855.
5. Miller P. Itinerancy between attractor states in neural systems. Curr Opin Neurobiol. 2016; 40:14–22.
https://doi.org/10.1016/j.conb.2016.05.005 PMID: 27318972; PubMed Central PMCID: PMC5056802.
6. Moreno-Bote R, Rinzel J, Rubin N. Noise-induced alternations in an attractor network model of percep-
tual bistability. J Neurophysiol. 2007; 98(3):1125–39. https://doi.org/10.1152/jn.00116.2007 PMID:
17615138; PubMed Central PMCID: PMC2702529.
7. Kanai R, Moradi F, Shimojo S, Verstraten FA. Perceptual alternation induced by visual transients. Per-
ception. 2005; 34(7):803–22. Epub 2005/08/30. https://doi.org/10.1068/p5245 PMID: 16124267.
8. Jones LM, Fontanini A, Sadacca BF, Miller P, Katz DB. Natural stimuli evoke dynamic sequences of
states in sensory cortical ensembles. Proc Natl Acad Sci U S A. 2007; 104(47):18772–7. https://doi.org/
10.1073/pnas.0705546104 PMID: 18000059; PubMed Central PMCID: PMC2141852.
9. Latimer KW, Yates JL, Meister ML, Huk AC, Pillow JW. NEURONAL MODELING. Single-trial spike
trains in parietal cortex reveal discrete steps during decision-making. Science. 2015; 349(6244):184–7.
https://doi.org/10.1126/science.aaa4056 PMID: 26160947.
10. Sadacca BF, Mukherjee N, Vladusich T, Li JX, Katz DB, Miller P. The Behavioral Relevance of Cortical
Neural Ensemble Responses Emerges Suddenly. J Neurosci. 2016; 36(3):655–69. https://doi.org/10.
1523/JNEUROSCI.2265-15.2016 PMID: 26791199.
11. Miller P, Katz DB. Stochastic transitions between neural states in taste processing and decision-mak-
ing. J Neurosci. 2010; 30(7):2559–70. Epub 2010/02/19. 30/7/2559 [pii] https://doi.org/10.1523/
JNEUROSCI.3047-09.2010 PMID: 20164341.
12. Miller P, Katz DB. Stochastic Transitions between States of Neural Activity. In: Ding M, Glanzman DL,
editors. The Dynamic Brain: An Exploration of Neuronal Variability and Its Functional Significance. New
York, NY: Oxford University Press; 2011. p. 29–46.
13. Miller P, Katz DB. Accuracy and response-time distributions for decision-making: linear perfect integrators
versus nonlinear attractor-based neural circuits. Journal of Computational Neuroscience. 2013;
35(3):261–94. Epub 2013/04/24. https://doi.org/10.1007/s10827-013-0452-x PMID: 23608921.
14. Wang XJ. Probabilistic decision making by slow reverberation in cortical circuits. Neuron. 2002;
36:955–68. https://doi.org/10.1016/s0896-6273(02)01092-9 PMID: 12467598
15. Davis JD. Deterministic and probabilistic control of the behavior of rats ingesting liquid diets. Am J Physiol.
1996; 270(4 Pt 2):R793–800. Epub 1996/04/01. https://doi.org/10.1152/ajpregu.1996.270.4.R793
PMID: 8967409.
16. Kramers HA. Brownian motion in a field of force and the diffusion model of chemical reactions. Physica.
1940; 7:284–304.
17. Sadacca BF, Rothwax JT, Katz DB. Sodium concentration coding gives way to evaluative coding in cortex
and amygdala. Journal of Neuroscience. 2012; 32:9999–10011. https://doi.org/10.1523/JNEUROSCI.6059-11.2012
PMID: 22815514; PubMed Central PMCID: PMC3432403.
18. Brody CD, Hernandez A, Zainos A, Lemus L, Romo R. Analysing neuronal correlates of the comparison
of two sequentially presented sensory stimuli. Philosophical transactions of the Royal Society of London
Series B, Biological sciences. 2002; 357(1428):1843–50. Epub 2003/03/11.
https://doi.org/10.1098/rstb.2002.1167 PMID: 12626017; PubMed Central PMCID: PMC1693076.
19. Machens CK, Romo R, Brody CD. Flexible control of mutual inhibition: a neural model of two-interval
discrimination. Science. 2005; 307(5712):1121–4. Epub 2005/02/19.
https://doi.org/10.1126/science.1104171 PMID: 15718474.
20. Miller P, Wang XJ. Inhibitory control by an integral feedback signal in prefrontal cortex: a model of
discrimination between sequential stimuli. Proc Natl Acad Sci U S A. 2006; 103(1):201–6.
https://doi.org/10.1073/pnas.0508072103 PMID: 16371469; PubMed Central PMCID: PMC1324991.
21. Pearson JM, Hayden BY, Raghavachari S, Platt ML. Neurons in posterior cingulate cortex signal exploratory
decisions in a dynamic multioption choice task. Curr Biol. 2009; 19(18):1532–7.
https://doi.org/10.1016/j.cub.2009.07.048 PMID: 19733074; PubMed Central PMCID: PMC3515083.
22. Hirokawa J, Vaughan A, Masset P, Ott T, Kepecs A. Frontal cortex neuron types categorically encode
single decision variables. Nature. 2019; 576(7787):446–51. Epub 2019/12/06.
https://doi.org/10.1038/s41586-019-1816-9 PMID: 31801999.
23. Wikenheiser AM, Stephens DW, Redish AD. Subjective costs drive overly patient foraging strategies in
rats on an intertemporal foraging task. Proc Natl Acad Sci U S A. 2013; 110(20):8308–13.
https://doi.org/10.1073/pnas.1220738110 PMID: 23630289; PubMed Central PMCID: PMC3657802.
24. Monk KJ, Rubin BD, Keene JC, Katz DB. Licking microstructure reveals rapid attenuation of neophobia.
Chem Senses. 2014; 39(3):203–13. Epub 2013/12/24. https://doi.org/10.1093/chemse/bjt069 PMID:
24363269; PubMed Central PMCID: PMC3921893.
25. Wang XJ. Decision making in recurrent neuronal circuits. Neuron. 2008; 60(2):215–34. Epub 2008/10/30.
https://doi.org/10.1016/j.neuron.2008.09.034 PMID: 18957215; PubMed Central PMCID: PMC2710297.
26. Bogacz R. Optimal decision-making theories: linking neurobiology with behaviour. Trends Cogn Sci.
2007; 11(3):118–25. Epub 2007/02/06. https://doi.org/10.1016/j.tics.2006.12.006 PMID: 17276130.
27. Wong KF, Huk AC. Temporal Dynamics Underlying Perceptual Decision Making: Insights from the
Interplay between an Attractor Model and Parietal Neurophysiology. Frontiers in neuroscience. 2008;
2(2):245–54. Epub 2009/02/20. https://doi.org/10.3389/neuro.01.028.2008 PMID: 19225598; PubMed
Central PMCID: PMC2622760.
28. Wong KF, Wang XJ. A recurrent network mechanism of time integration in perceptual decisions. The
Journal of neuroscience: the official journal of the Society for Neuroscience. 2006; 26(4):1314–28.
Epub 2006/01/27. https://doi.org/10.1523/JNEUROSCI.3733-05.2006 PMID: 16436619.
29. Ratcliff R, Smith PL, Brown SD, McKoon G. Diffusion Decision Model: Current Issues and History.
Trends Cogn Sci. 2016; 20(4):260–81. Epub 2016/03/10. https://doi.org/10.1016/j.tics.2016.01.007
PMID: 26952739; PubMed Central PMCID: PMC4928591.
30. Ratcliff R. A theory of memory retrieval. Psychological Review. 1978; 85:59–108.
31. Usher M, McClelland JL. The time course of perceptual choice: the leaky, competing accumulator
model. Psychol Rev. 2001; 108(3):550–92. Epub 2001/08/08.
https://doi.org/10.1037/0033-295x.108.3.550 PMID: 11488378.
32. Liu YS, Holmes P, Cohen JD. A neural network model of the Eriksen task: reduction, analysis, and data
fitting. Neural Computation. 2008; 20(2):345–73. Epub 2007/11/30.
https://doi.org/10.1162/neco.2007.08-06-313 PMID: 18045022; PubMed Central PMCID: PMC2749974.
33. Shadlen MN, Newsome WT. Motion perception: seeing and deciding. Proceedings of the National
Academy of Sciences of the United States of America. 1996; 93(2):628–33. Epub 1996/01/23.
https://doi.org/10.1073/pnas.93.2.628 PMID: 8570606; PubMed Central PMCID: PMC40102.
34. Shadlen MN, Newsome WT. Neural basis of a perceptual decision in the parietal cortex (area LIP) of
the rhesus monkey. Journal of Neurophysiology. 2001; 86(4):1916–36. Epub 2001/10/16.
https://doi.org/10.1152/jn.2001.86.4.1916 PMID: 11600651.
35. Adams GK, Watson KK, Pearson J, Platt ML. Neuroethology of decision-making. Curr Opin Neurobiol.
2012; 22(6):982–9. https://doi.org/10.1016/j.conb.2012.07.009 PMID: 22902613; PubMed Central
PMCID: PMC3510321.
36. Romo R, Brody CD, Hernandez A, Lemus L. Neuronal correlates of parametric working memory in the
prefrontal cortex. Nature. 1999; 399(6735):470–3. https://doi.org/10.1038/20939 PMID: 10365959.
37. Jun JK, Miller P, Hernandez A, Zainos A, Lemus L, Brody CD, et al. Heterogenous population coding of
a short-term memory and decision task. The Journal of neuroscience: the official journal of the Society
for Neuroscience. 2010; 30(3):916–29. Epub 2010/01/22. https://doi.org/10.1523/JNEUROSCI.2062-09.2010
PMID: 20089900; PubMed Central PMCID: PMC2941889.
38. Rankin J, Sussman E, Rinzel J. Neuromechanistic Model of Auditory Bistability. PLoS Comput Biol.
2015; 11(11):e1004555. Epub 2015/11/13. https://doi.org/10.1371/journal.pcbi.1004555
PMID: 26562507; PubMed Central PMCID: PMC4642990.
39. Daelli V, Treves A. Neural attractor dynamics in object recognition. Exp Brain Res. 2010; 203(2):241–8.
https://doi.org/10.1007/s00221-010-2243-1 PMID: 20437171.
40. Sigala N, Logothetis NK. Visual categorization shapes feature selectivity in the primate temporal cortex.
Nature. 2002; 415(6869):318–20. https://doi.org/10.1038/415318a PMID: 11797008
41. Leutgeb JK, Leutgeb S, Treves A, Meyer R, Barnes CA, McNaughton BL, et al. Progressive transformation
of hippocampal neuronal representations in "morphed" environments. Neuron. 2005; 48(2):345–58.
Epub 2005/10/26. https://doi.org/10.1016/j.neuron.2005.09.007 PMID: 16242413.
42. Abeles M, Bergman H, Gat I, Meilijson I, Seidemann E, Tishby N, et al. Cortical activity flips among
quasi-stationary states. Proc Natl Acad Sci U S A. 1995; 92(19):8616–20.
https://doi.org/10.1073/pnas.92.19.8616 PMID: 7567985
43. Seidemann E, Meilijson I, Abeles M, Bergman H, Vaadia E. Simultaneously recorded single units in the
frontal cortex go through sequences of discrete and stable states in monkeys performing a delayed
localization task. J Neurosci. 1996; 16(2):752–68. https://doi.org/10.1523/JNEUROSCI.16-02-00752.1996
PMID: 8551358.
44. Rainer G, Miller EK. Neural ensemble states in prefrontal cortex identified using a hidden Markov model
with a modified EM algorithm. Neurocomputing. 2000; 32:961–6.
https://doi.org/10.1016/S0925-2312(00)00266-6 WOS:000087897800127.
45. Wang S, Shi Y, Li BM. Neural representation of cost-benefit selections in rat anterior cingulate cortex in
self-paced decision making. Neurobiol Learn Mem. 2017; 139:1–10. Epub 2016/12/07.
https://doi.org/10.1016/j.nlm.2016.12.003 PMID: 27919831.
46. Riveros ME, Forray MI, Torrealba F, Valdes JL. Effort Displayed During Appetitive Phase of Feeding
Behavior Requires Infralimbic Cortex Activity and Histamine H1 Receptor Signaling. Front Neurosci.
2019; 13:577. Epub 2019/07/19. https://doi.org/10.3389/fnins.2019.00577 PMID: 31316329; PubMed
Central PMCID: PMC6611215.
47. Nonacs P. State dependent behavior and the Marginal Value Theorem. Behavioral Ecology. 2001;
12(1):71–83.
48. Grigson PS, Twining RC. Cocaine-induced suppression of saccharin intake: a model of drug-induced
devaluation of natural rewards. Behav Neurosci. 2002; 116(2):321–33. Epub 2002/05/09. PMID:
11996317.
49. Schroy PL, Wheeler RA, Davidson C, Scalera G, Twining RC, Grigson PS. Role of gustatory thalamus
in anticipation and comparison of rewards over time in rats. Am J Physiol Regul Integr Comp Physiol.
2005; 288(4):R966–80. Epub 2004/12/14. https://doi.org/10.1152/ajpregu.00292.2004 PMID:
15591157.
50. Grigson PS. Reward Comparison: The Achilles’ heel and hope for addiction. Drug Discov Today Dis
Models. 2008; 5(4):227–33. Epub 2008/01/01. https://doi.org/10.1016/j.ddmod.2009.03.005 PMID:
20016772; PubMed Central PMCID: PMC2794208.
51. Grigson PS, Freet CS. The suppressive effects of sucrose and cocaine, but not lithium chloride, are
greater in Lewis than in Fischer rats: evidence for the reward comparison hypothesis. Behav Neurosci.
2000; 114(2):353–63. https://doi.org/10.1037//0735-7044.114.2.353 PMID: 10832796.
52. Grigson PS. Conditioned taste aversions and drugs of abuse: a reinterpretation. Behavioral Neuroscience.
1997; 111(1):129–36. PMID: 9109631
53. Fourcaud-Trocme N, Hansel D, van Vreeswijk C, Brunel N. How spike generation mechanisms determine
the neuronal response to fluctuating inputs. J Neurosci. 2003; 23(37):11628–40. Epub 2003/12/20.
https://doi.org/10.1523/JNEUROSCI.23-37-11628.2003 PMID: 14684865.
54. Abbott LF, Varela J, Sen K, Nelson S. Synaptic depression and cortical gain control. Science. 1997;
275(5297):221–4. https://doi.org/10.1126/science.275.5297.221 PMID: 8985017
55. Varela JA, Sen K, Gibson J, Fost J, Abbott L, Nelson SB. A quantitative description of short-term plasticity
at excitatory synapses in layer 2/3 of rat primary visual cortex. Journal of Neuroscience. 1997;
17(20):7926–40. https://doi.org/10.1523/JNEUROSCI.17-20-07926.1997 PMID: 9315911
56. Tabak J, Senn W, O’Donovan MJ, Rinzel J. Modeling of spontaneous activity in developing spinal cord
using activity-dependent depression in an excitatory network. Journal of Neuroscience. 2000;
20(8):3041–56. https://doi.org/10.1523/JNEUROSCI.20-08-03041.2000 PMID: 10751456
57. Kusick GF, Chin M, Raychaudhuri S, Lippmann K, Adula KP, Hujber EJ, et al. Synaptic vesicles transiently
dock to refill release sites. Nat Neurosci. 2020; 23(11):1329–38. Epub 2020/09/30.
https://doi.org/10.1038/s41593-020-00716-1 PMID: 32989294.