Behavior Analysis and Neuroscience: Complementary Disciplines
John W. Donahoe
University of Massachusetts/Amherst
Amherst, MA 01002

In press, Journal of the Experimental Analysis of Behavior. Consult the Journal for pagination and bibliographic citation.
Abstract
Behavior analysis and neuroscience are disciplines in their own right but are united in that both
are subfields of a common overarching field—biology. What most fundamentally unites these
disciplines is a shared commitment to selectionism, the Darwinian mode of explanation. In selec-
tionism, the order and complexity observed in nature are seen as the cumulative products of se-
lection processes acting over time on a population of variants—favoring some and disfavoring
others—with the affected variants contributing to the population on which future selections oper-
ate. In the case of behavior analysis, the central selection process is selection by reinforcement;
in neuroscience it is natural selection. The two selection processes are inter-related in that selec-
tion by reinforcement is itself the product of natural selection. The present paper illustrates the
complementary nature of behavior analysis and neuroscience through considering their joint con-
tributions to three central problem areas: reinforcement—including conditioned reinforcement,
stimulus control—including equivalence classes, and memory—including reminding and re-
membering.
This article illustrates the complementary
nature of behavior analysis and neuroscience,
each a scientific discipline in its own right, but
each informed and enriched by the other. The un-
derlying premise is that the advantages of the in-
tegration of behavior analysis and neuroscience
are foreshadowed by the earlier benefits of inte-
grating the sciences of evolution and heredity,
now known as the Modern Synthesis (Dobzhan-
sky, 1937, cf. Donahoe, 2003). Of this earlier syn-
thesis, the biologist and philosopher of science
Jean Gayon wrote, “If there is a key event in the
history of Darwinism, it must be its confrontation
with Mendelism. It was through its contact with
the new science of heredity that the theory of se-
lection became truly intelligible” (Gayon, 1992,
p. 253). As it was with evolution through natural
selection and genetics, so it may well be with se-
lection by reinforcement and its neural mecha-
nisms. Efforts to integrate findings from behav-
ior analysis and neuroscience are nothing new, of
course. Edward Thorndike’s earliest writings em-
phasized the importance of not only the Law of
Effect but also of what he termed the “Law of Ac-
quired Brain Connections” (Thorndike, 1903, p.
165; cf. Donahoe, 1999), although he lamented
the absence of direct experimental knowledge of
the latter. Donald Hebb was perhaps the first
American psychologist to attempt a wide-ranging
integration of the two disciplines in his Organi-
zation of Behavior (1949), but again direct
knowledge of the neural processes remained
largely absent. Hebb’s proposal that the connec-
tions between two neurons increased in strength
when they were co-active was simply a restate-
ment in neural terms of the behaviorally estab-
lished contiguity requirement. Skinner regarded
behavior analysis as “a rigorous, extensive, and
rapidly advancing branch of biology” (1974, p.
255) but noted two prerequisites for such an inte-
gration. Of behavior: “A rigorous description at
the level of behavior is necessary for the demon-
stration of a neurological correlate.” And, of neu-
roscience: “A science of the nervous system will
someday start from the direct observation of neu-
ral processes … It is with such a science that the
neurological point of view must be concerned if
it is to offer a convincing ‘explanation’ (sic) of
behavior” (Skinner, 1938, p. 422). These two pre-
requisites are becoming increasingly approxi-
mated, if not satisfied, by the present state of de-
velopment of each discipline.
Before describing specific examples of
the salutary effects of integrating behavior analy-
sis and neuroscience, two general orienting prin-
ciples are identified. First, both evolutionary bi-
ology and behavior analysis are selectionist sci-
ences (Catania, 1987; Donahoe, 2003; Palmer &
Donahoe, 1992; Staddon, 1983; McDowell,
2013a). That is, the diverse and potentially com-
plex outcomes of evolution and reinforcement are
the cumulative products of relatively simple se-
lection processes acting over time. Any given in-
stance of selection occurs under conditions that
exist at that moment, conditions which may or
may not be shared with future moments. Strictly
speaking, selection processes “prepare” the or-
ganism for conditions that existed in the past, not
in the future except in so far as the future resem-
bles the past. Second, the cumulative products of
selection processes are the net effects of many in-
dividual instances of selection, whether reproduc-
tive fitness acting on a population of different or-
ganisms in the case of evolution or reinforcement
acting on a population of different environment-
behavior relations of a single organism in the case
of behavior (Donahoe, 2003). The nature of the
population on which the selecting events act—
different organisms in the case of evolution or
different behaviors of the same organism in the
case of behavior analysis—is what most funda-
mentally distinguishes much of psychology from
behavior analysis. Psychology often seeks its
principles through averaging steady-state obser-
vations from different organisms, a method ap-
propriate for population genetics but not neces-
sarily for a science of behavior (e.g., Estes, 1956;
Sidman, 1960).
In subsequent sections, the relevant be-
havioral findings are first briefly summarized fol-
lowed by the neural mechanisms that mediate that
behavior. Skinner’s admonition that a “rigorous
description at the level of behavior” must precede
“the demonstration of a neurological correlate” is
amply illustrated. The first findings considered
are those related to the reinforcement process it-
self. Just as natural selection informs all of evo-
lutionary biology, so a properly formulated con-
ception of selection by reinforcement should illu-
minate the full range of behavior from simple
conditioning to complex human behavior. The
discussion of selection by reinforcement, includ-
ing conditioned reinforcement, is followed by
sections on stimulus control—including equivalence classes, and memory.[1]
Selection by Reinforcement
The approach to reinforcement described
here is a logical extension of Skinner’s view that
selection occurs at the moment when a reinforc-
ing stimulus (that is, a reinforcer) appears in close
proximity to a stimulus in the Pavlovian proce-
dure or to a response in the operant procedure.
Figure 1 depicts these two temporal relations.
Note, however, that in the Pavlovian procedure
some response necessarily occurs in proximity to
the reinforcer whereas in the operant procedure
some stimulus necessarily occurs in proximity to
the reinforcer. Thus, what truly distinguishes be-
tween the two procedures is not whether a stimu-
lus or a response precedes the reinforcer, but the
reliability with which that event precedes the re-
inforcer. The classical and operant procedures
clearly differ, but whether the procedural differ-
ence portends a fundamental difference in the
conditioning process is a separate matter.
[1] The body of the article is self-contained, but supplementary technical information is occasionally introduced by means of endnotes.
Figure 1. Diagram illustrating the similarity and difference between the Pavlovian and operant
procedures. In both procedures, a reinforcing stimulus (SR) is introduced that elicits a response (Relicited).
In the Pavlovian procedure SR most frequently follows a specific stimulus whereas in the operant procedure
SR most frequently follows a specific response (Rj). Relicited typically goes unmeasured in the operant pro-
cedure. Note, however, that stimuli and responses necessarily precede the reinforcer in both procedures.
Skinner implicitly appreciated this point
in his demonstrations of superstitious condition-
ing. Superstitions of the first kind occurred when
the frequency of a response changed if it occurred
by chance before a reinforcing stimulus (Skinner,
1948; cf. Staddon & Simmelhag, 1971). Supersti-
tions of the second kind occurred when a rein-
forced response occurred by chance during a
stimulus and the response then came under the
control of that stimulus (Morse & Skinner, 1957).
Viewed in this way, the distinction between Pav-
lovian and operant conditioning is procedural,
and not necessarily a difference in the condition-
ing process itself. Skinner was open to this possi-
bility. However, he chose not to pursue the com-
monalities in the conditioning process in the two
procedures but, instead, to explore the profound
differences in the consequences of the procedural
difference. In the Pavlovian procedures, an elic-
ited response (the respondent) comes under the
control of arbitrary stimuli[2] whereas in the oper-
ant procedure the entire behavioral repertoire of
the organism potentially comes under the control
of arbitrary stimuli, thereby opening the path to
the emergence of complex behavior. Skinner acknowledged, without dissent, the possible unity of the selection process in his very early writings (Skinner, 1938, p. 111): “Hil-
gard (1937) ... points out that ... reinforcement is
essentially the same process in both [proce-
dures]” and “Mowrer (1937) holds out that ... the
two processes may eventually be reduced to a sin-
gle formula.”
Behavioral Findings
Contiguity. The first variable identified
as critical to selection by reinforcement was the
temporal relation between the reinforcer and the
environment (the conditioned stimulus, or CS) in
the Pavlovian procedure or the behavioral event
(the operant) in the operant procedure. In the Pav-
lovian procedure, an effective interval is typically
described as no more than a few seconds (e.g.,
Smith, Coleman, & Gormezano, 1969). The ef-
fect of varying the interval between the operant
and the reinforcer is more difficult to analyze ex-
perimentally because the onset of the operant
cannot be controlled and the events that intervene
between the operant and the reinforcer may in-
clude other instances of the operant, as in inter-
mittent schedules of reinforcement. In addition,
stimuli that function as conditioned reinforcers
may bridge the gap between the operant and the
reinforcer (e.g., Catania, 1971; Lattal & Gleeson,
1990). In his initial operant procedure, Skinner
sought to minimize these intrusions by immedi-
ately following the operant by the “click” of a
feeder, an auditory stimulus that had previously
immediately preceded the delivery of food ac-
cording to a Pavlovian procedure (Skinner, 1938
p. 53). A commitment to temporal contiguity was
also expressed in Ferster and Skinner’s treatment
of schedules of reinforcement: “The only contact
between [the schedule] and the organism occurs
2
at the moment (emphasis added) of reinforce-
ment.... Under a given schedule of reinforcement,
it can be shown that at the moment of reinforce-
ment a given set of stimuli will usually prevail. A
schedule is simply a convenient way of arranging
this. (Ferster & Skinner, 1957, pp. 2-3).
The general conclusion is that a reinforc-
ing stimulus produces changes in stimulus control
in the Pavlovian procedure and changes in re-
sponse frequency in the operant procedure only
over relatively brief intervals of time.[3] As Skin-
ner put it, “To say that a reinforcement is contin-
gent upon a response may mean nothing more
than that it follows the response” (Skinner, 1948,
p. 168).
Behavioral discrepancy. With the ad-
vent of the blocking procedure, it became clear
that, although contiguity was necessary, it was
not sufficient to produce selection by reinforce-
ment. As demonstrated initially with a Pavlovian
procedure (Kamin, 1968; 1969) and subsequently
with an operant procedure (vom Saal & Jenkins,
1970), something in addition to temporal contigu-
ity was required.[4] Studies showed that if a stimu-
lus or a response was followed by a putative rein-
forcer with a temporal relation known to be effec-
tive for conditioning, but that the stimulus or re-
sponse was accompanied by another stimulus that
had previously preceded the same reinforcer, then
the putative reinforcer would not be effective.
Prior conditioning blocked the acquisition of a
new environment-behavior (E-B) relation if the
same reinforcer had previously occurred in that
context. Blocking has also been shown to occur
with conditioned reinforcers. Using a concurrent
schedule in which one response produced a visual
stimulus that had previously been paired with a
reinforcer and a second response produced a dif-
ferent visual stimulus that had been paired
equally often with the same reinforcer but only in
the presence of a previously conditioned stimu-
lus, responding was maintained for only the first
response (Palmer, 1986).
What is the nature of the additional vari-
able? Associationist psychology proposed that a
given reinforcer would support only a fixed max-
imum association value and that, because prior
conditioning had already attained that maximum
value, no further increases in associative strength
were possible (Rescorla & Wagner, 1972). The
formalization of this proposal in the Rescorla-
Wagner model has proven exceptionally fruitful
(Siegel & Allan, 1996, cf. Papini & Bitterman,
1990), especially when implemented in real-time
computer simulations (e.g., Sutton & Barto,
1981; Donahoe et al., 1993). The model is silent,
however, with respect to the biobehavioral mech-
anisms that implement it (Donahoe, 1984). The
physical events that are available to an organism
on its first exposure to a reinforcing stimulus are
the stimulus itself and the responses that are elic-
ited by that stimulus. Subsequent research with
the blocking procedure revealed that when the re-
inforcing stimulus remained constant but,
through various means, the response elicited by
the reinforcer was changed, conditioning oc-
curred to a stimulus that would have otherwise
been blocked (Brandon, Betts, & Wagner, 1994;
Stickney & Donahoe, 1983). In short, blocking
was prevented when the unconditioned response
(UR) was changed. Based on these findings, the
second requirement for selection by reinforce-
ment may be described as follows: The reinforc-
ing stimulus must not only be contiguous but it
must also evoke a change in the behavior that
would otherwise occur at that moment (Donahoe,
Crowley, Millard, & Stickney, 1982; Donahoe,
Burgos, & Palmer, 1993). This is known as the
behavioral-discrepancy requirement.
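To make the discrepancy requirement concrete, the following minimal Python sketch applies the Rescorla-Wagner updating rule to the two phases of a blocking experiment. It is offered as an illustration only; the learning-rate and asymptote values are assumed, not drawn from the studies cited.

# Minimal Rescorla-Wagner simulation of blocking (illustrative parameters).
# For each stimulus i present on a trial: dV_i = alpha_beta * (lam - V_total).

ALPHA_BETA = 0.2   # combined learning-rate parameter (assumed value)
LAM = 1.0          # maximum associative strength supported by the reinforcer

V = {"A": 0.0, "X": 0.0}   # associative strengths of two stimuli

def trial(present, lam=LAM):
    """Update the strength of every stimulus present on a reinforced trial."""
    v_total = sum(V[s] for s in present)
    for s in present:
        V[s] += ALPHA_BETA * (lam - v_total)

# Phase 1: stimulus A alone is paired with the reinforcer.
for _ in range(50):
    trial({"A"})

# Phase 2: the compound A+X is paired with the same reinforcer.
for _ in range(50):
    trial({"A", "X"})

print(V)   # A approaches 1.0; X stays near 0.0 -- conditioning to X is blocked.

Because the summed strength of the compound already approximates the asymptote after Phase 1, the discrepancy term is near zero in Phase 2 and the added stimulus gains almost no strength, paralleling the behavioral result.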
The importance of the behavior evoked
by the reinforcing stimulus was further confirmed
by other work demonstrating that the critical tem-
poral relation in the Pavlovian procedure was not
between the CS and the US as conventionally un-
derstood, but between the CS and the behavior
evoked by that stimulus, that is, the UR (Donahoe
& Vegas, 2004). In the throat-movement preparation of the pigeon, water or air injected into the oral cavity of the restrained bird evokes
a UR of a sufficiently long latency (approxi-
mately 200 ms) and duration (approximately 2 s)
to permit the temporal relation of the CS to the
US and to the UR to be unconfounded. It was
found that whether the CS occurred before the
US-UR (the standard forward-conditioning pro-
cedure), only between the US and the UR, or only after the onsets of both the US and UR—but during the UR—the level of conditioned responding
to the CS was the same when measured on CS-
alone test trials. (Appropriate controls were insti-
tuted to rule out alternative interpretations such as
sensitization.) Thus the CS-UR relation, not the
3
CS-US relation, is the critical temporal relation
for conditioning in the classical procedure. The
apparent importance of the CS-US relation was
based largely on procedures in which the US elic-
ited a short latency, brief duration UR, thereby
confounding the CS-US and CS-UR temporal re-
lations. These findings provide an interpretation
of a number of otherwise problematic findings
(Donahoe & Vegas, 2004). Examples include:
(a) how the same CS-US temporal relation condi-
tions some responses but not others when the URs
differ in latency or duration (e.g., Betts, Brandon,
& Wagner, 1996), (b) how a backward condition-
ing arrangement (US-CS) conditions long-dura-
tion URs, such as autonomic responses, but not
short-duration URs, such as eye blinks (e.g., Al-
bert & Ayres, 1997; McNish, Betts, Brandon, &
Wagner, 1997), and (c) how a backward US-CS
arrangement conditions components of a tempo-
rally extended sequence of responses that are
contiguous with the CS (e.g., Silva, Timberlake,
& Cevik, 1998). The behavioral-discrepancy re-
quirement is also consistent with the a priori con-
jecture that the ancestral selecting environment
(that is, natural selection) would more likely pro-
duce a conditioning process that was sensitive to
the expressed behavior of an organism than to its
mere perception of a stimulus. Thomas Huxley’s
words resonate: “The great end of life is not
thought but action.”[5] The view that the same con-
ditioning process, requiring both contiguity and
discrepancy, is engaged by the Pavlovian and op-
erant procedures is summarized by the unified re-
inforcement principle (Donahoe et al., 1993).
Neuroscientific Findings
The present discussion of neural mecha-
nisms focuses on the neural systems that imple-
ment the reinforcement process (cf. Kirk, Babtie,
& Stumpf, 2015). As a result, the emphasis is
generally upon the net effects of the neuroana-
tomical and cellular components (neurons) of
these systems and not the intra-cellular processes
themselves. A neural system refers to those brain
structures and their interconnections that are most
relevant to the behavior under consideration. The
neural system of concern here implements rein-
forcement in the operant-conditioning procedure.
Figure 2 is a diagram of a lateral view of the hu-
man cerebral cortex showing (with dashed lines)
the subcortical structures and interconnections
that are central to implementing reinforcement.
The general flow of neural activity begins with
stimulation of the senses, such as vision and audi-
tion. This stimulation activates neurons in the pri-
mary sensory cortices subsequent to interactions
involving various subcortical structures (not
shown). Activity in primary sensory neurons is
propagated to sensory-association cortex whose
polymodal neurons respond to various combina-
tions of their inputs from neurons in primary sen-
sory and sensory-association cortices. Sensory
and polymodal neurons activate, in turn, neurons
in prefrontal and motor cortex (among others),
with the latter leading most directly to behavior
via the striatum and various other subcortical and
spinal structures (not shown). Neurons in pre-
frontal cortex, which includes the adjacent sup-
plementary motor and premotor areas, coordinate
neural activity in the motor cortex. This account
of the neural systems mediating E-B relations is
simplified and does not reflect other, often recip-
rocal, pathways between structures—some of
which are considered in a later section. The ac-
count is suitable for present purposes, however.
Figure 2. Lateral view of the human cerebral cortex showing the major cortical regions and, using
dashed lines, the subcortical structures—nucleus accumbens (NAC) and ventral tegmental area (VTA)—
and pathways central to the neural mechanisms of selection by reinforcement. (The divisions of the cortex
designated in the diagram are for heuristic purposes only. For example, the region designated primary
sensory cortex is largely the primary visual cortex and the region designated sensory-association cortex
also includes the primary auditory cortex of the temporal lobe.)
Given the many neurons in the human brain—
perhaps 10¹², the many potential connections to a single neuron (perhaps 10³ in the case of motor neurons), and the “spontaneous” activity of many
cortical neurons, natural selection faced a formi-
dable challenge: What mechanism could be de-
vised whereby the strengths of connections be-
tween neurons mediating a specific reinforced E-
B relation were strengthened? As with any selec-
tion process, the challenge cannot be not met on
a single occasion and is fallible (Donahoe, 2003;
Palmer & Donahoe, 1992). (This challenge is
known as the “assignment-of-credit” problem in
neural-network research.) The mechanism that
evolved has the effect of disproportionately, but
not exclusively, strengthening the “right” connec-
tions. (Personification of selection processes—
whether natural selection or selection by rein-
forcement—is an expository device only, of
course.) What is the mechanism that permits se-
lection by reinforcement of a wide range of spe-
cific E-B relations? When a response is followed
by a reinforcer it must be true that pathways suf-
ficient to enable the reinforced E-B relation are
among the many that are active at that moment.
However, the neural mechanisms implementing
the reinforcement process must not only affect
the pathways of that specific reinforced E-B rela-
tion but also potentially a wide range of other
pathways that might mediate other E-B relations
under different contingencies of reinforcement.
Such a mechanism is required if almost any stim-
ulus is to potentially control almost any operant.
Dopamine and the selection of path-
ways mediating reinforced E-B relations. The
neural capability for widespread effects of rein-
forcement is accomplished through the liberation
and subsequent diffusion of the neuromodulator
dopamine (DA) along projections from cells in
the ventral tegmental area (VTA) to the prefrontal
and motor cortices (among others). The diffusion
of DA allows a reinforcer to affect a wide range
of synapses and, thereby a wide range of potential
E-B relations mediated by the pathways involv-
ing those synapses.[6] See Figure 2. For purposes
of the current discussion, the neural inputs to the
prefrontal and motor cortices from sensory areas
are presumed to be sufficient to control the rein-
forced behavior. This matter is considered further
in a later section.
DA is generally designated a neuromod-
ulator rather than a neurotransmitter because it al-
ters the effect of other neurotransmitters on the
functioning of neurons. One may speculate that
the central role of DA in reinforcement should not
be completely surprising given that DA also plays
an important role in the digestive process where
it governs gut motility (Li, Schmauss, Cuenca,
Ratcliffe, & Gershon, 2006). This leads to the
conjecture that the digestion of nutrients and re-
inforcement were intertwined in evolutionary his-
tory. The neural activity initiated by receptors
stimulated by taste and smell and by sexual activ-
ity activates DA neurons via pathways traversing
various intermediate subcortical structures.
Drugs of abuse also typically activate DA neu-
rons (e.g., Volkow, Fowler, & Wang, 2007; Wise,
2002). The widely projecting outputs (axons) of
VTA neurons liberate DA that diffuses from var-
icosities along their lengths. DA molecules re-
main present for several seconds before being de-
graded (Yagishita, Hayashi-Takagi, Ellis-Davies,
Urakubo, Ishii, & Kasai, 2014). The relatively
short-lived presence of DA is consistent with the
contiguity requirement. DA also plays a critical
role in the discrepancy requirement as described
shortly.
Figure 3 indicates the frequency of firing
of VTA-DA neurons in our fellow primate, the
crab-eating monkey. The task was a differential
conditioning procedure in which pressing one of
two levers was reinforced with apple juice de-
pending on which of two spatially separated
lights was illuminated (Schultz, Apicella, &
Ljungberg, 1993; Schultz, Dayan, & Montague,
1997). The top panel (A) shows the frequency of
firing of DA neurons when the reinforcing stimu-
lus (SR) was presented at the outset of condition-
ing. Note that the baseline level of activity of DA
neurons increased abruptly shortly after the rein-
forcing stimulus occurred. Transition to the
“bursting” mode is required for DA to be liber-
ated along the axons of VTA neurons (Grace &
Bunney, 1984; Grace, Floresco, Goto, & Lodge,
2007; Johnson, Seutin, & North, 1992). The mid-
dle panel (B) shows the frequency of firing of DA
neurons after conditioning when the discrimina-
tive stimulus (SD) was presented and a correct re-
sponse was followed by SR. The burst of firing
now occurred at the onset of SD, not SR. This is
consistent with the long-standing behavioral find-
ing that discriminative stimuli also acquire a con-
ditioned reinforcing function (Dinsmoor, 1950;
cf. Williams, 1994a, b). Although conditioned re-
inforcers may acquire additional functions, par-
ticularly in free-operant procedures (e.g., Shahan,
2010), neuroscience indicates that both uncondi-
tioned and conditioned reinforcers cause DA to
be liberated by VTA neurons. The route whereby
VTA neurons are activated by conditioned rein-
forcers differs however. Whereas unconditioned
reinforcers activate VTA neurons by pathways
originating from receptors such as those for taste
or sexual stimulation (e.g., Smith, Liu, & Vogt,
1996; Balfour, Yu, & Coolen, 2004), conditioned
reinforcers activate VTA neurons by a less direct
route involving the prefrontal cortex (e.g., Pears,
Parkinson, Hopewell, Everitt, & Roberts, 2003;
Wilson & Bowman, 2004).
As shown in Figure 2, the neural activity
initiated by SD stimulates neurons in prefrontal
cortex and pathways originating from prefrontal
neurons activate cells in the nucleus accumbens
(NAC) that, in turn, cause VTA-DA neurons to
be activated. Thus both conditioned and uncondi-
tioned reinforcers ultimately engage the same
VTA reinforcement system, but by different
routes.[7] For experienced organisms in which
many E-B relations have been previously ac-
quired, the environment offers many opportuni-
ties for engaging the neural mechanisms of con-
ditioned reinforcement. In this way, activities that
require temporally extended sequences of behav-
ior, such as problem solving, are maintained by a
relatively continuous stream of conditioned rein-
forcers, which increases as the target behavior is
more closely approximated (Donahoe & Palmer,
1993, p. 285 ff.).[8]
Panel C of Figure 3 reveals an additional
finding that relates to the behavioral-discrepancy
requirement. The lower panel shows the activity
of DA neurons on a trial in which SD was pre-
sented, but SR was omitted. Note that DA activity
was actually inhibited shortly after the time when
the reinforcer was otherwise scheduled to occur.[9] If another stimulus had accompanied this SD in
a blocking design, then the reinforcer would not
have produced an increase in DA activity, and the accompanying stimulus would not have become a discriminative stimulus. The inhibition of VTA neurons following the
time at which the reinforcer previously occurred
is the neural basis of the discrepancy requirement
(Burgos & Donahoe, 2016; Donahoe, Burgos, & Palmer, 1993; Waelti, Dickinson, & Schultz, 2001).[10]
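One common computational reading of these data, offered here as an illustrative sketch rather than as the article's own analysis, treats DA activity as a moment-to-moment prediction error of the kind computed by temporal-difference algorithms (cf. Sutton & Barto, 1981; Schultz, Dayan, & Montague, 1997). The following minimal Python sketch, with assumed step counts, learning rate, and reward value, reproduces the three signatures shown in Figure 3.

# Illustrative temporal-difference sketch of the DA responses in Figure 3.
# Time steps within a trial: step 0 is the pre-trial baseline, SD appears at
# step 1, and SR (the reward) arrives at step 4. All values are assumptions.

ALPHA = 0.1                 # learning rate
SD_STEP, SR_STEP = 1, 4     # steps at which SD and SR occur
V = [0.0] * (SR_STEP + 2)   # learned value of each step; V[0] stays 0 because
                            # trial onset is unsignaled during the intertrial interval

def run_trial(rewarded=True, learn=True):
    """Return the prediction error (the modeled DA response) at each step.
    deltas[0] is the error registered at SD onset; deltas[SR_STEP] at SR."""
    deltas = []
    for t in range(len(V) - 1):
        r = 1.0 if (t == SR_STEP and rewarded) else 0.0
        delta = r + V[t + 1] - V[t]   # prediction error at this transition
        if learn and t >= SD_STEP:    # the baseline step is never predicted
            V[t] += ALPHA * delta
        deltas.append(round(delta, 2))
    return deltas

print("early training:", run_trial())   # burst at SR (Panel A)
for _ in range(1000):
    run_trial()                         # extended training
print("after training:", run_trial(learn=False))                  # burst at SD (Panel B)
print("reward omitted:", run_trial(rewarded=False, learn=False))  # dip at SR time (Panel C)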
Figure 3. The frequency of firing of dopaminergic (DA) neurons in the ventral tegmental area (VTA) during
a differential operant-conditioning procedure. Panel A shows the DA response to the reinforcing stimulus
(SR) at the outset of conditioning. Panel B shows the DA response to the discriminative stimulus (SD) and
SR on a reinforced trial after conditioning. Panel C shows the DA response to SD on a trial after condi-
tioning in which the reinforcer was omitted. See the text for additional information. (Data from Schultz,
1997; Schultz, Apicella, & Ljungberg, 1993)
Dopamine and the cellular mecha-
nisms of selection (LTP). The liberation of DA
from VTA neurons by reinforcers during condi-
tioning comports well with what is known from
behavioral research about unconditioned rein-
forcement, conditioned reinforcement, and block-
ing. In addition, the diffusion of DA in prefrontal
and motor cortex (Yagishita, Hayashi-Takagi, El-
lis-Davies, Urakubo, Ishii, & Kasai, 2014) al-
lows DA to simultaneously affect the functioning
of many neurons along the diverse pathways re-
quired to implement a wide range of reinforced
E-B relations. What is the cellular process that en-
ables this function?
The process that affects the strengths of
connections along pathways mediating E-B rela-
tions is long-term potentiation (LTP) (Bliss &
Lømø, 1973). Before describing LTP, several
technical terms are briefly reviewed. LTP refers
to an increased ability (potentiation) of receptors
on a neuron to be activated by a neurotransmitter
released from the axons of other neurons. Neuro-
transmitters diffuse across the synapse between
presynaptic and postsynaptic neurons. The neu-
rotransmitter released by presynaptic neurons
may either increase (excite) or decrease (inhibit)
the activity of postsynaptic neurons through its
effect on the receptors to which the neurotrans-
mitter binds on the postsynaptic neuron. Recep-
tors are typically located within the membrane of
the dendrites of postsynaptic neurons. Neurons
normally maintain a negative resting potential
across the cell membrane. This potential becomes
more positive when neurons are activated and
more negative when they are inhibited. The mem-
brane potential is governed by the net influx of
negative and positive ions across the cell mem-
brane.
Figure 4. An axon terminal of a presynaptic neuron and a dendritic spine of a postsynaptic neuron.
The excitatory neurotransmitter glutamate (Glu) released by the presynaptic neuron binds to two types of
postsynaptic Glu receptors—the fast-acting AMPA receptor and the voltage-sensitive NMDA receptor.
When the postsynaptic neuron is sufficiently depolarized by the action of Glu, the NMDA receptor allows
calcium ions (Ca2+) to enter the cell. If dopamine (DA) is simultaneously bound to DA receptors, then a
series of intracellular processes (second messengers) migrate to genetic material in the cytoplasm and/or
nucleus of the postsynaptic cell stimulating protein synthesis. The synthesized proteins diffuse down the
dendrite where they act upon those AMPA receptors to which Glu was previously bound. This produces
long-lasting changes in those Glu receptors that make them more responsive to Glu on subsequent occa-
sions. See the text for additional information.
Figure 4 is a schematic representation of
an axon terminal of a presynaptic neuron and a
juxtaposed spine on a dendrite of a postsynaptic
neuron. Single neurons have many such axon ter-
minals and dendritic spines. Molecules of the
neurotransmitter glutamate (Glu) are released
through the cell membrane of the presynaptic
neuron and diffuse into the synapse. (Glu is the
primary excitatory neurotransmitter in the brain.)
Molecules of Glu then bind to two types of Glu
receptors on the postsynaptic neuron—AMPA re-
ceptors and NMDA receptors. (The abbreviations
stand for the Glu analogs, α-amino-3-hydroxy-5-
methyl-4-isoxazolepropionic acid and N-methyl-
D-aspartate, respectively, that selectively bind to
the two types of Glu receptors.) If Glu binds to
AMPA receptors, entry of positive ions into the
cell increases and the membrane potential of the
neuron becomes less negative. When the mem-
brane potential becomes less negative, even if not
enough to cause the postsynaptic neuron to “fire,”
those specific AMPA receptors acquire a molec-
ular “tag” that enhances their response to Glu for
perhaps an hour or so (Frey, 1997; Frey & Morris,
1997). This brief enhancement is called early
LTP. When the concerted action of AMPA recep-
tors depolarizes the postsynaptic neuron suffi-
ciently, additional effects occur: If Glu has also
bound to NMDA receptors—which permit cal-
cium ions (Ca2+) to enter the cell—and if DA is
present, then a series of intracellular events are
initiated that result in the synthesis of new pro-
teins (see Figure 4). These proteins—which re-
quire time to synthesize—diffuse down the den-
drite of the postsynaptic cell and produce a long-
lasting enhancement in the response to Glu to
only the previously tagged AMPA receptors. This
produces late LTP, which is a long-lasting change
in the ability of Glu to activate specific receptors
on the postsynaptic neuron.[11] In this way, DA
“forges” pathways through the frontal lobes from
the inputs arriving from sensory and sensory-as-
sociation cortex to motor neurons that mediate
observable behavior (Reynolds, Hyland, & Wick-
ens, 2001).[12] The finding that DA plays an espe-
cially important role in the acquisition of effortful
responses or long sequences of responses may re-
flect that such tasks require the potentiation of re-
ceptors on large numbers of neurons (Salamone,
Correa, Nunes, Randall, & Pardo, 2013, cf. Geor-
gopoulos, Schwartz, & Kettner, 1986; Sanes &
Donoghue, 2000). DA also potentiates Glu-ergic
receptors on postsynaptic neurons in the NAC,
which subserve conditioned reinforcement. If DA
is not present when pre- and postsynaptic neurons
are coactive, then the enhanced response of
postsynaptic Glu receptors declines, producing
long-term depression (LTD). Note that at each of
the various levels of analysis—the behavioral, the
neural, and the cellular—the mode of explanation
follows the Darwinian model of selectionism:
That is, reinforcers select among E-B relations;
reinforcer-evoked DA selects among neural path-
ways that mediate the E-B relations, and DA-in-
stigated protein synthesis selects among tagged
post-synaptic receptors (Donahoe, 1997).[13]
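The tag-and-capture sequence just described can be summarized as a three-factor learning rule: pre/post co-activity sets a decaying synaptic tag (early LTP); DA arriving while the tag persists converts it into a lasting weight gain (late LTP); and co-activity without DA weakens the synapse (LTD). The Python sketch below is a schematic rendering of that logic only; the constants and the class structure are assumptions for illustration, not a model taken from the cited studies.

# Schematic three-factor plasticity rule (illustrative constants).
TAG_DECAY = 0.5   # fraction of the synaptic tag lost per time step
LTP_GAIN = 0.10   # weight increase when DA reaches a tagged synapse
LTD_LOSS = 0.02   # weight decrease for co-activity without DA

class Synapse:
    def __init__(self, weight=0.5):
        self.weight = weight   # responsiveness of AMPA receptors to Glu
        self.tag = 0.0         # molecular tag set by pre/post co-activity

    def step(self, pre_active, post_depolarized, dopamine):
        if pre_active and post_depolarized:
            self.tag = 1.0                          # co-activity tags the synapse (early LTP)
            if dopamine:
                self.weight += LTP_GAIN * self.tag  # late LTP: persistent strengthening
            else:
                self.weight -= LTD_LOSS             # no DA: long-term depression
        self.tag *= 1.0 - TAG_DECAY                 # early LTP decays with time

syn = Synapse()
for _ in range(20):
    syn.step(True, True, dopamine=True)    # reinforced co-activity
print("after reinforced co-activity:", round(syn.weight, 3))    # weight grows
for _ in range(20):
    syn.step(True, True, dopamine=False)   # co-activity without DA
print("after unreinforced co-activity:", round(syn.weight, 3))  # weight declines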
Stimulus Control
Behavioral Findings
The vast range of phenomena encompassed by
the field of stimulus control was foreshadowed
by Skinner’s early comment that “discriminative
stimuli are practically inevitable after condition-
ing” (Skinner, 1937, p. 273; cf. Donahoe,
Palmer, & Burgos, 1997). Only two sets of phenomena are considered here—
discrimination formation and equivalence clas-
ses. Other areas are omitted, notably concept
formation initiated by the pioneering work of
Richard Herrnstein (Herrnstein, Loveland, &
Cable, 1976) and continuing as an active area of
current research (e.g. Zentall, Wasserman, & Ur-
cuioli, 2013).
Discrimination formation. If a rein-
forcer occurs in an environment, whether in a
Pavlovian or an operant procedure (e.g., Gyn-
ther, 1957 and Guttman & Kalish, 1956, respec-
tively), then behavior comes under the control of
not only that environment but also other envi-
ronments to the extent that they share stimuli in
common. This is the so-called common-ele-
ments account of stimulus generalization formal-
ized by Skinner’s former student William Estes
(Estes, Burke, & Atkinson, 1957). If a differen-
tial conditioning procedure is instituted, then the
controlling stimuli are confined to those that are
specific to the environment in which the rein-
forcer occurred. Stimuli that are not shared with
the conditioning environment become the occa-
sion for whatever previously conditioned and
unconditioned responses are supported by that
environment. The behavioral effects of differen-
tial conditioning are most apparent with experi-
mental procedures in which different operants
are conditioned to the different environments.
For example, lever presses of two different dura-
tions were differentially reinforced in the pres-
ence of two light intensities and tests of stimulus
generalization were then conducted with various
intermediate light intensities (Crowley, 1979).
The result was that responding during generali-
zation tests with the other light intensities was
confined to the two previously reinforced re-
sponse durations, their proportions varying with
the proximity of the test intensity to the two
training intensities. Findings of similar import
have been obtained with a variety of other pro-
cedures (e.g., Migler, 1966; Migler & Millenson,
1969; Scheuerman, Wildemann, & Holland,
1978; Wildemann & Holland, 1972). The gen-
eral conclusion from this work is that new envi-
ronments do not control new behavior but new
mixtures of previously conditioned behavior
(Bickel & Etzel, 1985; Donahoe & Wessells,
1980, pp. 194-196). Any “creativity” arises from
variation in the environment, not from within the
organism itself. As E-B relations accumulate,
the variation increases upon which future selec-
tions by reinforcement may act, with the emer-
gent possibility of ever more complex E-B rela-
tions.
Equivalence classes. Equivalence clas-
ses (Sidman & Tailby, 1982) are typically studied
using matching-to-sample procedures with multi-
ple, arbitrary (symbolic) conditional discrimina-
tions. For example, conditional discriminations
using A-B and A-D are trained where the pairs of
letters stand for different sets of sample and com-
parison stimuli, respectively. After these condi-
tional discriminations have been acquired, unre-
inforced probe trials are occasionally introduced
using previously untrained combinations of sam-
ple and comparison stimuli, for example B-D. An
equivalence class is said to form if an organism
reliably responds to the comparison stimulus as-
sociated with a sample stimulus that was previ-
ously trained with a different—but correspond-
ing—comparison stimulus. Equivalence classes
meet the three criteria of mathematical equiva-
lence—identity (A = A), symmetry (if A = B,
then B = A), and transitivity (if A = B and B = C,
then A = C). In the preceding example of a B-D
probe trial, responding to the appropriate D com-
parison stimulus requires both symmetry (only A-
B was trained, not B-A) and transitivity (B-A-D).
Equivalence classes have important implications
for the interpretation of complex behavior be-
cause they demonstrate that the effects of rein-
forcement on the environmental guidance of be-
havior transcend the specific E-B relations that
were explicitly reinforced, even beyond those of
stimulus generalization. The controlling stimuli
in the training environment (for example, A) do
not have stimulus elements in common with those
of the test environment (for example, D).
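The emergent relations can be stated compactly: beginning with the trained pairs, an equivalence class licenses every pair derivable by identity, symmetry, and transitivity. The short Python sketch below computes that closure for the A-B and A-D training of the preceding example; the function and its set representation are illustrative conveniences, not a behavioral process model.

def equivalence_closure(trained):
    """Close a set of (sample, comparison) pairs under identity, symmetry,
    and transitivity."""
    rel = set(trained)
    rel |= {(s, s) for pair in trained for s in pair}   # identity: A = A
    changed = True
    while changed:
        changed = False
        new = {(b, a) for (a, b) in rel}                # symmetry: A=B implies B=A
        new |= {(a, d) for (a, b) in rel                # transitivity:
                for (c, d) in rel if b == c}            # A=B and B=C imply A=C
        if not new <= rel:
            rel |= new
            changed = True
    return rel

derived = equivalence_closure({("A", "B"), ("A", "D")})
print(("B", "D") in derived)   # True: the B-D probe relation is emergent

Only A-B and A-D are "trained," yet the closure contains B-D, the probe relation that requires both symmetry and transitivity.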
Neuroscientific Findings
When the discriminative stimuli in con-
ditioning and generalization-test situations are
well defined in physical terms, the findings are
relatively straightforward. For example, rabbits
trained using a classical-conditioning procedure
to respond to a particular frequency of an auditory
stimulus were then tested with other frequencies.
Generalization tests included measurement of the
responses of neurons in primary auditory cortex
as well as behavioral responses. It was found that
the greater the overlap between the number of
cells firing to the test stimuli and those firing to
the training stimulus, the greater the behavioral
response (Thompson, 1965). That is, responding
occurred to the extent that the conditioning and
test environments activated neural elements in
common.
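A small worked example can express this result: if each stimulus is represented by the set of neural elements it activates, responding at test varies with the overlap between the test and training sets. The element ranges below are invented for illustration; they stand in for the tonotopically organized populations recorded by Thompson.

# Toy common-elements computation: generalization as the proportion of
# training-activated elements shared by a test stimulus (invented values).
training = set(range(40, 60))   # elements activated by the training frequency

def generalization(test_elements):
    """Fraction of the training stimulus's elements also active at test."""
    return len(training & test_elements) / len(training)

for shift in (0, 5, 10, 20):    # test frequencies progressively farther away
    test = set(range(40 + shift, 60 + shift))
    print(f"shift {shift:2d}: shared elements = {generalization(test):.2f}")

The orderly decline in the printed values (1.00, 0.75, 0.50, 0.00) is the generalization gradient re-expressed at the level of shared elements.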
Figure 5. Lateral view of the human cerebral cortex showing the major cortical regions and sub-
cortical structures and pathways (indicated by dashed lines) that play an important role in the selection of
environment-environment relations.
But what of equivalence-class proce-
dures in which the discriminated stimuli in the conditioning environment share no readily iden-
tifiable physical similarity to those in the testing
2
environment? To address this question it is nec-
essary to consider additional neural systems, spe-
cifically the relation between sensory-association
(S-A) cortex and the hippocampus. Figure 5 de-
picts that relation. As previously noted, neurons
in S-A cortex receive their inputs from primary
sensory and other S-A neurons. Thus the activa-
tion of S-A neurons reflects the co-occurrence of
multiple environmental inputs. The axons of S-A
neurons give rise to a multi-synaptic pathway
(among others) that innervates the hippocampus,
a subcortical structure located beneath the tem-
poral lobes. After interactions involving other
neurons within the hippocampus, neurons in the
CA1 region of the hippocampus give rise to a
multi-synaptic pathway that projects back to S-A
cortex. (CA is an abbreviation for the Latin phrase cornu Ammonis, or horn of Ammon, that refers to the shape of the hippocampus.) It has been pro-
posed that the output of the hippocampus from
the CA1 region projects back to the very regions
and/or neurons of the S-A cortex that gave rise to
the hippocampal inputs (Amaral, 1987). The syn-
chronized activation of S-A neurons by their sen-
sory inputs and, very shortly thereafter, by their
hippocampal inputs depolarizes those S-A neu-
rons and, in this way, potentiates specific recep-
tors on the neuron (cf. Strycker, 1986). These re-
ceptors are then tagged and potentially undergo
LTP.[14] The net effect of this process produces S-
A neurons that respond to their multi-modal in-
puts. As a simplistic example, if the inputs to a
given S-A neuron arise from the co-occurrence of
a tone and a light, then that S-A neuron becomes
functionally a tone-light neuron. The net result is
that S-A neurons become sensitive to the con-
junctions of spatio-temporally contiguous, envi-
ronmental events, or environment-environment
(E-E) relations.
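A minimal Hebbian sketch can illustrate this outcome; the rates, threshold, and starting weights below are assumptions for illustration, not measured parameters. Correlated tone-and-light input strengthens both connections onto a model S-A unit, after which either stimulus alone activates it, the functional signature of the pair-coded neurons described below.

# Illustrative Hebbian account of a conjunction ("tone-light") S-A unit.
RATE, THRESHOLD = 0.05, 0.5

w = {"tone": 0.3, "light": 0.3}   # input weights onto one model S-A neuron

def present(inputs, learn=True):
    """Return the unit's activation; strengthen the weights of active inputs
    whenever the unit fires (pre/post co-activity)."""
    activation = sum(w[s] for s in inputs)
    if learn and activation >= THRESHOLD:
        for s in inputs:
            w[s] += RATE * (1.0 - w[s])   # bounded Hebbian increment
    return activation

for _ in range(100):
    present({"tone", "light"})   # repeated co-occurrence (an E-E relation)

print(round(present({"tone"}, learn=False), 2))    # ~1.0: tone alone now drives the unit
print(round(present({"light"}, learn=False), 2))   # ~1.0: so does light alone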
The acquisition of E-E relations between
sample and comparison stimuli has been pro-
posed to underlie the formation of equivalence re-
lations (Donahoe & Palmer, 1993, p. 148). Re-
cent neuroscientific evidence supports this pro-
posal. Rhesus monkeys that had acquired an arbi-
trary matching-to-sample task using computer-
generated, abstract visual stimuli developed sin-
gle neurons in the S-A cortex that responded
equally to either the sample or the comparison
stimulus of pairs to which correct responding had
been reinforced. Such neurons were not found for
unreinforced pairs of stimuli, although they had
occurred together equally often (Fujimichi, Naya,
& Miyashita, 2005; Sakai & Miyashita, 1991).
Neurons that responded equally to either member
of a reinforced pair were designated “pair-coded
neurons.” For such neurons, the two stimuli were
literally equivalent. The effect of reinforcement
on the acquisition of E-E relations is facilitated
by DA-ergic projections from the VTA to CA1
neurons (see Figure 5), the output neurons of the
hippocampus to S-A cortex (Duncan, Ketz, Inati,
& Davachi, 2012; Gasbarri, Verney, Innocenzi,
Campana, & Pacitti, 1994; Li, Cullen, Anwyl, &
Rowan, 2003; Swanson & Kohler, 1986). As al-
ready noted, without the contribution of DA, re-
ceptors on CA1 neurons would be potentiated for
only a few hours (early LTP). With the contribu-
tion of DA, potentiation of CA1 and S-A neurons
may endure indefinitely until “over-written” by
other E-E relations.
With nonhuman organisms, behavioral
research provides clear evidence of both identity
(generalized identity matching) and transitivity,
but little or no compelling evidence of symmetry
(but see Kastak, Schusterman, & Kastak, 2001;
Urcuioli, 2008). In humans, strong evidence ex-
ists for all three components of equivalence clas-
ses (Lionello-DeNolf, 2009). What is responsible
for this apparent inter-species difference? Re-
search in neuroscience offers some clues. In pri-
mates, unlike other mammalian orders, the bidi-
rectional pathways that loop between the S-A
cortex and hippocampus arise exclusively from
and synapse upon S-A neurons (Amaral, 1987).
Organisms that lack these looping pathways—or homologous neural structures that achieve the same end—may have a diminished or absent capacity to form the pair-coded neurons that support symmetric discriminations (cf. Sidman, Rauzin, Lazar, Cunningham, Tailby, & Carrigan, 1982). For organ-
isms whose neuroanatomy permits the formation
of pair-coded neurons, the development of neu-
rons that respond to multi-sensory inputs should
be the norm if the contingencies of reinforcement
foster their selection.[15] As Sidman proposed, equivalence-class formation may be a normative effect of reinforcement (Sidman, 2000).
Memory
Behavioral Findings
In a behaviorally informed approach to
memory, as the field is conventionally known,
phenomena may be divided into two cases—re-
minding and remembering (Donahoe & Palmer
1993; Palmer 1991). Reminding refers to those
cases in which some behavior is scheduled for re-
inforcement and the behavior takes place in an
environment in which that behavior has been pre-
viously reinforced. As an example, when asked
“Who was the President during the Civil War,”
the response “Lincoln” is scheduled for reinforce-
ment and has almost certainly been previously re-
inforced in that verbal context. This is simply a
straightforward example of stimulus control. Us-
ing language more congenial to the field of
memory, the environment may be said to remind
the organism of the appropriate behavior. By con-
trast, remembering refers to cases in which some
behavior is scheduled for reinforcement but the
contemporaneous environment does not include a
sufficient number of the stimuli to which the tar-
get response was previously conditioned. Readily
observable behavior can sometimes produce the
requisite controlling stimuli as when one consults
a smartphone to find the telephone number of
AAA after one’s car has broken down. Consult-
ing a smartphone to find a telephone number and
then keying in that number have both been previ-
ously reinforced. A greater challenge is presented
by those cases of remembering in which the be-
havior required to produce the stimuli needed to
control the appropriate response is covert. For ex-
ample, upon seeing the approach of casual ac-
quaintances, you struggle to recall their names.
You may then subvocally sound out the letters of
the alphabet hoping to hit upon letters that, to-
gether with the appearance of their faces, prompt
the correct names. The sight of the approaching
acquaintances controls covert behavior (mne-
monic behavior) that in the past has been effec-
tive in producing covert stimuli sufficient for the
emission of the overt response—in this case ut-
tering the person’s name (Donahoe & Palmer,
1993, pp. 275-277). In brief, remembering re-
quires engaging in behavior, whether overt or
covert, that produces stimuli that are sufficient to
remind oneself, so to speak, of the behavior
needed to produce the reinforcer. However, note
that when the reminding behavior is covert, this
is an interpretation of memory, not an experi-
mental analysis (cf. Skinner, 1974, p. 194;
Palmer, 2011). That is, the account is an extrapo-
lation from behavioral observations that draws
upon only processes that have been analyzed ex-
perimentally in previous contexts and that are
highly likely to have occurred in the current con-
text.
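The covert episode described above has the structure of a generate-and-test routine: emit candidate supplementary stimuli until some compound of the current environment and a self-produced stimulus controls the target response. The sketch below renders only that structure; the faces, letters, and lookup table are wholly hypothetical stand-ins for previously established stimulus control, not a claim about mechanism.

import string

# Hypothetical learned control: a (face, initial letter) compound evokes a name.
learned = {("face_1", "m"): "Meredith", ("face_2", "t"): "Thomas"}

def remember(face):
    """Cycle covert prompts (alphabet sounding) until some compound of the
    face and a self-produced letter evokes a name."""
    for letter in string.ascii_lowercase:   # covert mnemonic behavior
        name = learned.get((face, letter))
        if name:
            return name, letter              # the compound controls the response
    return None, None                        # remembering fails

print(remember("face_1"))   # ('Meredith', 'm')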
Neuroscientific Findings
The behavioral interpretation of remem-
bering requires that the behavior scheduled for re-
inforcement is already within the repertoire of the
organism, but that stimuli adequate to control that
behavior are not present. For example, the ques-
tion “Who was the Union General who won the
Battle of Gettysburg?” may not be sufficient to
control the verbal response “Meade.” However, a
response of the desired form can be emitted under
other circumstances, for example, as a textual re-
sponse to the printed word “Meade.” A biobehav-
ioral interpretation proposes that remembering is
accomplished at the neural level by exploiting the
extensive pathways (tracts) that course back and
forth between the S-A and prefrontal cortex (Do-
nahoe & Palmer, 1993; cf. Fuster, 2015). Re-
peated cycles of these neural interactions can po-
tentially activate motor neurons that produce the
target behavior.
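As a schematic rendering of such cycling, the sketch below shows how reciprocal S-A/prefrontal feedback could, over repeated cycles, raise activity that the sample alone left below the level needed to drive the target response. The gain and threshold are illustrative assumptions, not measured quantities.

# Schematic reverberation between S-A and prefrontal cortex.
FEEDBACK_GAIN = 0.6   # proportional amplification per full loop (assumed)
THRESHOLD = 2.0       # activity needed to engage motor neurons (assumed)

def recall(initial_activity, max_cycles=20):
    """Cycle activity through the S-A/prefrontal loop until it can drive the
    target response, or give up after max_cycles."""
    activity = initial_activity
    for cycle in range(1, max_cycles + 1):
        activity *= 1.0 + FEEDBACK_GAIN   # feedback re-excites the S-A neurons
        if activity >= THRESHOLD:
            return cycle, round(activity, 2)   # response can now be emitted
    return None, round(activity, 2)            # remembering fails to occur

print(recall(0.5))   # (3, 2.05): the response emerges after three feedback cycles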
Figure 6. Schematic view of the upper surface of the cerebral cortex showing the left and right
hemispheres. The right hemisphere is deflected back to reveal the corpus callosum (black), the large fiber
bundle that interconnects neurons in the two hemispheres. The curved block arrow indicates the tracts that
connect the sensory-association (S-A) area of the right hemisphere to the prefrontal cortex of that same
hemisphere. (Comparable tracts exist within the left hemisphere but are not shown.) The block arrows
indicate other tracts between the two hemispheres that course through the corpus callosum. The portion of
the corpus callosum that was first severed is indicated by A; the portion that was later severed is indicated
by B (Tomita, Ohbayashi, Nakahara, Hasegawa, & Miyashita, 1999). See the text for additional infor-
mation.
A behavioral interpretation of remember-
ing has received support from experimental analyses in neuroscience. To appreciate these
findings, additional neuroanatomical information
is required. Figure 6 is a schematic view looking
down on the upper surface of the cerebral cortex.
The right hemisphere has been deflected back to
reveal the large fiber bundle (corpus callosum,
shown in black) that interconnects neurons in the
two hemispheres. A monkey was trained with
multiple abstract conditional discriminations
using a matching-to-sample task in which visual
stimuli were presented in such a way that sample
stimuli activated only neurons in the left visual
cortex. The corresponding two comparison stim-
uli activated only neurons in the right visual cor-
tex (Tomita, Ohbayashi, Nakahara, Hasegawa, &
Miyashita, 1999). In an intact brain, stimuli that
directly activate neurons in only one hemisphere
can also activate neurons in the contralateral
hemisphere by means of pathways (shown as an
inverted V-shaped, double-headed, block arrow)
coursing through the posterior portion of the cor-
pus callosum (denoted by A in Figure 6). After
the subjects had acquired the matching-to-sample
task to a high level of proficiency, the A portion
of the corpus callosum was severed. Despite the
absence of connections between the S-A cortices
of the two hemispheres, performance remained at
a high level. Performance was maintained by two
sets of alternate pathways: One pathway (indi-
cated by a curved block arrow) connected neu-
rons in the S-A cortex to neurons in the prefrontal
cortex of the right hemisphere. The second set of
pathways (shown as a single-headed, right-an-
gled, block arrow in Figure 6) connected neurons
in the prefrontal cortex of the right hemisphere to
neurons in the S-A area of the left hemisphere via
the anterior portion of the corpus callosum (de-
noted B in Figure 6). In short, feedback from ac-
tivity in prefrontal cortex was sufficient to control
discriminative performance. This feedback is the
neural counterpart of mnemonic behavior pro-
posed in a behavioral interpretation of remember-
ing (Donahoe & Palmer, 1994): The test environ-
ment (the sample stimulus) was not sufficient to
control responding to the reinforced comparison
stimulus, but it instigated activity in the prefrontal
cortex that was sufficient to appropriately control
responding. When the anterior portion of the cor-
pus callosum was severed (indicated by B in Fig-
ure 6), performance returned to chance levels and
could not be recovered. The experiment demon-
strates that neural activity that does not evoke re-
sponses at the behavioral level of observation can
nevertheless control behavior when the contem-
porary environment fails to provide stimuli in
whose presence the response was previously re-
inforced. The behaviorally based distinction be-
tween reminding and remembering is thus con-
sistent with independent convergent evidence
from neuroscience where the two classes of
memory are known as bottom-up vs. top-down
processing, respectively (Fujimichi et al., 2005). In humans, covert stimuli—such as those pro-
duced by subvocal verbal behavior that occurred
at the time of acquisition—can facilitate these
neural processes if they occur later during re-
membering. Verbal behavior—whether overt or
covert—is not necessary for remembering, how-
ever, as shown by the aforementioned findings
with monkeys.
Conclusions
These findings and their implications are
but a few examples illustrating the complemen-
tary nature of behavior analysis and neurosci-
ence. Behavioral research indicates that contigu-
ity and discrepancy are required for selection by
reinforcement, and neuroscience is well on its
way to identifying the physiological mechanisms
that implement these requirements. The under-
standing of reinforcement arising from behav-
ioral research constrains the search for its neural
mechanisms to those that are capable of affecting
the strength of a wide range of E-B relations
while simultaneously confining the cumulative
effects of selection to the particular class of E-B
relations that precede the reinforcer. The widely
projecting DA system from the VTA to the
frontal lobes accommodates the potentially wide-
spread effects of selection by reinforcement; the
specificity of LTP to particular postsynaptic re-
ceptors satisfies the specificity requirement. In
addition, the same neural systems provide an in-
tegrated account of unconditioned and condi-
tioned reinforcement, with the latter forming a
critical element in the emergence of complex be-
havior. Our knowledge of the neural mechanisms
of reinforcement remains incomplete, of course,
particularly with regard to a detailed account of
the interactions between neurons in prefrontal
cortex, NAC, and VTA, but the discrepancy re-
quirement arising from behavioral research con-
tinues to constrain the search for these mecha-
nisms. For more complex behavior such as that
observed in the study of equivalence classes and
memory, behavioral research can offer verbal in-
terpretations (e.g., Donahoe & Palmer, 1994, pp.
139ff, 223ff) that draw upon observations of phe-
nomena that have been subjected to experimental
analysis in other contexts: The understanding of
equivalence classes can appeal to what is known
about joint stimulus control when conditioning
serial and compound discriminative stimuli; the
study of memory can appeal to covert behavior
that is consistent with what is known about overt
behavior in otherwise comparable situations. Be
that as it may, neuroscience offers the possibility
of supplementing these behavioral
interpreta-
tions with experimental analyses at the neural
level of observation: Pair-coded neurons provide
a mechanism whereby stimuli that bear no physi-
cal similarity to one another become equivalent at
the neural level and feedback from activity in pre-
frontal cortex to sensory-association cortex pro-
vides a mechanism whereby memories occur un-
der circumstances in which the contemporaneous
environment bears no specific similarity to the
environment in which the behavior was originally
acquired.
A strong case can be made that the inte-
gration of behavior analysis and neuroscience
portends a biobehavioral science whose implica-
tions for understanding complex behavior, in-
cluding human behavior, are as profound as the
earlier integration of the sciences of evolution and heredity for understanding complex structure.
Skinner explicitly acknowledged “the spatial gap
between behavior and the variables of which it is
a function” and argued that this gap “can be filled
only by neuroscience, and the sooner… the better.” Now seems a propitious time to vigorously pursue that goal (Ortu & Vaidya, 2016).
References
Aggarwal, M., Hyland, B. I., & Wickens, J. R. (2012). Neural
control of dopamine transmission: Implications for rein-
forcement learning. European Journal of Neuroscience, 35,
1115-1123.
Albert, M. & Ayres, J. J. (1997). One-trial simultaneous and
backward excitatory fear conditioning in rats: Lick suppres-
sion, freezing, and rearing to CS compounds and their ele-
ments. Animal Learning & Behavior, 25, 210-220.
Amaral, D. G. (1987). Memory: Anatomical organization of can-
didate brain regions. In V. B. Mountcastle (Ed.), Handbook
of physiology. Section I, The nervous system (Vol. V, Higher functions of the brain, pp. 211-294). Washington, D. C.:
American Physiological Society.
Arntzen, E. (2006). Delayed matching to sample and stimulus
equivalence: Probability of responding in accord with
equivalence as a function of different delays. The
Psychological Record, 56, 135–167.
Arntzen, E., Nartey, R. K., & Fields, L. (2015). Enhancing
responding in accordance with stimulus equivalence by the
delayed and relational properties of meaningful stimuli.
Journal of the Experimental Analysis of Behavior, 103, 524–
541.
Arntzen, E., Norbom, A., & Fields, L. (2015). Sorting: An
alternative measure of class formation? The Psychological
Record, 65, 615–625.
Balfour, M. E., Yu, L., & Coolen, L. M. (2004). Sexual behavior
and sex-associated environmental cues activate the meso-
limbic system in male rats. Neuropsychopharmacology, 29,
718-730.
Bauer, E. P., Schafe, G. E., & LeDoux, J. E. (2002). NMDA re-
ceptors and L-type voltage-gated calcium channels contrib-
ute to long-term potentiation and different components of
fear memory formation in the lateral amygdala. Journal of
Neuroscience, 22, 5239-5249.
Berridge, K. C. (2007). The debate over dopamine’s role in re-
ward: The case for incentive salience. Psychopharmacology,
191, 391-431.
Betts, S. L., Brandon, S. E., & Wagner, A. R. (1996). Disso-
ciation of the blocking of conditioned eyeblink and condi-
tioned fear following a shift in US locus. Animal Learning
& Behavior, 24, 459-470.
Bickel, W. K. & Etzel, B. C. (1985). The quantal nature of con-
trolling stimulus-response relations in tests of stimulus gen-
eralization. Journal of the Experimental Analysis of Behav-
ior, 44, 245-270.
Bliss, T. V. P. & Lømo, T. (1973). Long-lasting potentiation of syn-
aptic transmission in the dentate area of the anaesthetized
rabbit following stimulation of the perforant path. Journal of
Physiology, 232, 331-356.
Bradfield, L. A., Bertran-Gonzalez, J., Chieng, B., & Balleine, B.
W. (2013). The thalamostriatal pathway and cholinergic con-
trol of goal-directed action: Interlacing new with existing
learning in the striatum. Neuron, 79, 153-156.
Brandon, S. E., Betts, S. L., & Wagner, A. R. (1994). Discrimi-
nated lateralized eyeblink conditioning in the rabbit: An ex-
perimental context for separating specific and general asso-
ciative influences. Journal of Experimental Psychology: An-
imal Behavior Processes, 20, 292-307.
Burdakov, D., Liss, B., & Ashcroft, F. M. (2003). Orexin excites
GABAergic neurons of the arcuate nucleus by activating the
sodium–calcium exchanger. Journal of Neuroscience, 23,
4951-4957.
Burgos, J. E. (2005). Theoretical note: The C/T ratio in artificial
neural networks. Behavioural Processes, 69, 249-256.
Burgos, J. E. & Donahoe, J. W. (2016). Unified principle of re-
inforcement in a neural-network model: Reply to O. L. Cal-
vin and J. J. McDowell. Behavioural Processes, 126, 46-54.
Calvin, O. L. & McDowell, J. J. (2016). Extending unified-the-
ory-of-reinforcement neural networks to steady-state oper-
ant behavior. Behavioural Processes, 127, 52-61.
Catania, A. C. (1971). Reinforcement schedules: The role of re-
sponses preceding the one that produces the reinforcer. Jour-
nal of the Experimental Analysis of Behavior, 15, 271-287.
Catania, A. C. (1987). Some Darwinian lessons for behavior
analysis: A review of Bowler's The eclipse of Darwinism.
Journal of the Experimental Analysis of Behavior, 47, 249-
257.
Chambers, K. C. (1990). A neural model for conditioned taste
aversions. Annual Review of Neuroscience, 13, 373-385.
Corbit, L. H. & Balleine, B. W. (2011). The general and out-
come-specific forms of Pavlovian-instrumental transfer are
differently mediated by the nucleus accumbens core and
shell. Journal of Neuroscience, 31, 11786-11794.
Creed, M. C., Ntamati, N. R., & Tan, K. R. (2014). VTA GABA
neurons modulate specific learning behaviors through the
control of dopamine and cholinergic systems. Frontiers in
Behavioral Neuroscience, 8, 1-7.
Crowley, M. A. (1979). The allocation of time to temporally de-
fined behaviors: Responding during stimulus generalization.
Journal of the Experimental Analysis of Behavior, 32, 191-
197.
Crowley, M. A. & Donahoe, J. W. (2004). Matching: Its acqui-
sition and generalization. Journal of the Experimental Anal-
ysis of Behavior, 82, 143-159.
Dichter, G. S., Felder, J. N., Green, S. R., Rittenberg, A. M., Sas-
son, N. J., & Bodfish, J. W. (2012). Reward circuitry func-
tion in autism spectrum disorders. Social Cognitive and Af-
fective Neuroscience, 7, 160-172.
Dinsmoor, J. A. (1950). A quantitative comparison of the dis-
criminative and reinforcing functions of a stimulus. Journal
of Experimental Psychology, 40, 458-472.
Dobzhansky, T. G. (1937). Genetics and the origin of species.
New York: Columbia University Press.
Domjan, M. & Galef, B. G. (1983). Biological constraints on in-
strumental and classical conditioning: Retrospect and pro-
spect. Animal Learning & Behavior, 11, 151-161.
Donahoe, J. W. (1984). Skinner—The Darwin of ontogeny? Be-
havioral and Brain Sciences, 7, 487-488.
Donahoe, J. W. (1997). Neural plasticity. In J. W. Donahoe & V.
P. Dorsel (Eds.), Neural-network models of cognition (pp.
80-81). New York: Elsevier Science Press.
Donahoe, J. W. (1999). Edward L. Thorndike: The selectionist
connectionist. Journal of the Experimental Analysis of Be-
havior, 72, 451-454.
Donahoe, J. W. (2003). Selectionism. In K. A. Lattal & P. N.
Chase (Eds.), Behavior theory and philosophy (pp. 103-128).
Dordrecht, Netherlands: Kluwer Academic Publishers.
Donahoe, J. W. (2010). Man as machine: A review of Memory
and the computational brain: Why cognitive science will
transform neuroscience. Behavior and Philosophy, 38, 79-
97.
Donahoe, J. W. (2012). Origins of the molar-molecular debate.
European Journal of Behavior Analysis, 13, 195-200.
Donahoe, J. W. (2013). Theory and behavior analysis. Behavior
Analyst, 36, 361-371.
Donahoe, J. W., & Burgos, J. E. (2005). Selectionism: Complex
outcomes from simple processes. Behavioral and Brain
Sciences, 28, 429-430.
Donahoe, J. W., Burgos, J. E., & Palmer, D. C. (1993). A selec-
tionist approach to reinforcement. Journal of the Experi-
mental Analysis of Behavior, 60, 17-40.
Donahoe, J. W., Crowley, M. A., Millard, W. J., & Stickney, K.
A. (1982). A unified principle of reinforcement: Some im-
plications for matching. In M. L. Commons, R. J. Herrnstein,
& H. Rachlin (Eds.), Quantitative analyses of behavior: Vol.
2. Matching and maximizing accounts (pp. 493-521). Cam-
bridge, MA: Ballinger.
Donahoe, J. W. & Palmer, D. C. (1993/2005). Learning and
complex behavior. Boston, MA: Allyn & Bacon. (Reprinted
Richmond, MA: Ledgetop Publishing) http://lcb-online.org
Donahoe, J. W., Palmer, D. C., & Burgos, J. E. (1997). The S-
R issue: Its status in behavior analysis and Donahoe and
Palmer's Learning and Complex Behavior. Journal
of the Experimental Analysis of Behavior, 67, 193-211.
Donahoe, J. W., & Vegas, R. (2004). Pavlovian conditioning:
The CS-UR relation. Journal of Experimental Psychology:
Animal Behavior Processes, 30, 17-33.
Donahoe, J. W. & Vegas, R. (in press). Respondent (Pavlovian)
conditioning. In W. W. Fisher, C. C. Piazza, and H. S. Roane
(Eds.), Handbook of applied behavior analysis (2nd edition).
New York: Guilford Press.
Donahoe, J. W. & Wessells, M. G. (1980). Learning, language,
and memory. New York: Harper & Row.
Dreyer, J. K., Vander-Weele, C. M., & Lovic, V. (2016). Func-
tionally distinct dopamine signals in nucleus accumbens
core and shell in the freely moving rat. Journal of Neurosci-
ence, 36, 96-112.
Duncan, K., Ketz, N., Inati, S. J., & Davachi, L. (2012). Evi-
dence for area CA1 as a match/mismatch detector: A high-
resolution fMRI study of the human hippocampus. Hippo-
campus, 22, 389-398.
Dworkin, B. R. (1993). Learning and physiological regulation.
Chicago: University of Chicago Press.
Dworkin, B. R., & Miller, N. E. (1986). Failure to replicate vis-
ceral learning in the acute curarized rat preparation. Behav-
ioral Neuroscience, 100, 299-314.
Eikelboom, R. & Stewart, J. (1982). Conditioning of drug-in-
duced physiological responses. Psychological Review, 89,
507-528.
Emery, N. J., Lorincz, E. N., Perrett, D. I., Oram, M. W., &
Baker, C. I. (1997). Gaze following and joint attention in
rhesus monkeys (Macaca mulatta). Journal of Comparative
Psychology, 111, 286-293.
Estes, W. K. (1956). The problem of inferences from curves
based on group data. Psychological Bulletin, 53, 134-140.
Estes, W. K., Burke, C. J., & Atkinson, R. C. (1957). Probabilistic
discrimination learning. Journal of Experimental Psychol-
ogy, 54, 233-239.
Estes, W. K. & Skinner, B. F. (1941). Some quantitative proper-
ties of anxiety. Journal of Experimental Psychology, 29,
390-400.
Ferster, C. B. & Skinner, B. F. (1957). Schedules of reinforce-
ment. New York: Appleton-Century-Crofts.
Fields, L., Reeve, K. F., Varelas, A., Rosen, D., & Belanich, J.
(1997). Equivalence class formation using stimulus-pairing
and yes-no. The Psychological Record, 47, 661–686.
Frey, U. (1997). Cellular mechanisms of long-term potentiation:
Late maintenance. In Donahoe, J. W. and Dorsel, V. P.
(Eds.). Neural-network models of cognition: Biobehavioral
foundations (pp. 105-128). New York: Elsevier Science
Press.
Frey, U., & Morris, R. G. M. (1997). Synaptic tagging and long-
term potentiation. Nature, 385, 533–536.
Fujimichi, R., Naya, Y., & Miyashita, Y. (2005). Neural basis of
semantic-like memory: Representation, activation, and cog-
nitive control. In M. S. Gazzaniga (Ed.), The cognitive neuro-
sciences III (pp. 905-918).
Cambridge, MA: MIT Press. http://www.physiol.m.u-to-
kyo.ac.jp/en/review/review_en_03.html
Fuster, J. (2015). The prefrontal cortex (5th edition). New York:
Academic Press.
Gallistel, C. R. & King, A. P. (2009). Memory and the computa-
tional brain: Why cognitive science will transform neurosci-
ence. New York: Wiley.
Gao, X.-B. & Horvath, T. (2014). Function and dysfunction of
hypocretin/orexin: An energetics point of view. Annual Re-
view of Neuroscience, 37, 101-116.
Gasbarri, A., Verney, C., Innocenzi, R., Campana, E., Pacitti, C.
(1994). Mesolimbic dopaminergic neurons innervating the
hippocampal formation in the rat: a combined retrograde
tracing and immunohistochemical study. Brain Research,
668, 71–79.
Gayon, J. (1998). Darwin’s struggle for survival: Heredity and
the hypothesis of natural selection. (Translated by M. Cobb
from Darwin et l'après-Darwin: une histoire de l'hypothèse
de sélection naturelle. Paris: Editions Kimé, 1992.)
Cambridge, UK: Cambridge University Press.
Geisler, S., Derst, C., Veh, R. W., & Zahm, D. S. (2007). Glu-
tamatergic afferents of the ventral tegmental area in the rat.
Journal of Neuroscience, 27, 5730–5743.
Georgopoulos, A. P., Schwartz, A. B., & Kettner, R. E. (1986).
Neural population coding of movement direction. Science,
233, 1416-1419.
Gerfen, C. R. & Surmeier, D. J. (2011). Modulation of striatal
projection neurons by dopamine. Annual Review of Neuro-
science, 34, 441-466.
Gibbon, J. & Balsam, P. D. (1981). The spread of association in
time. In C. Locurto & H. S. Terrace (Eds.), Conditioning the-
ory. New York: Academic Press.
Gormezano, I. (1966). Classical conditioning. In J. B. Sidowski
(Ed.), Experimental methods and instrumentation in psy-
chology. New York: McGraw-Hill.
Grace, A. A. & Bunney, B. S. (1984). The control of firing pat-
tern in nigral dopamine neurons: Burst firing. Journal of
Neuroscience, 4, 2877-2890.
Grace, A. A., Floresco, S. B., Goto, Y., & Lodge, D. J. (2007).
Regulation of firing of dopaminergic neurons and control of
goal-directed behaviors. Trends in Neurosciences, 30, 220-
227.
Grisante, P. C., Galesi, F. L., Sabino, N. M., Debert, P., Arntzen,
E., & McIlvane, W. J. (2013). Go/no-go procedure with
compound stimuli: Effects of training structure on the
emergence of equivalence classes. The Psychological
Record, 63, 63–72.
Guttman, N. & Kalish, H. J. (1956). Discriminability and stimu-
lus generalization. Journal of Experimental Psychology, 51,
79-88.
Gynther, M. D. (1957). Differential eyelid conditioning as a
function of stimulus similarity and strength of response to
the CS. Journal of Experimental Psychology, 53, 408-416.
Hebb, D. O. (1949). The organization of behavior. New York:
Wiley.
Herrnstein, R. J., Loveland, D. H., & Cable, C. (1976). Natural
concepts in pigeons. Journal of Experimental Psychology:
Animal Behavior Processes, 2, 285-302.
Jensen, G., Ward, R. D., & Balsam, P. D. (2013). Information:
Theory, brain, and behavior. Journal of the Experimental
Analysis of Behavior, 100, 408-431.
Johnson, D. F. (1970). Determiners of selective stimulus control
in the pigeon. Journal of Comparative and Physiological
Psychology, 76, 298-307.
Johnson, S. W., Seutin, V., & North, R. A. (1992). Burst firing
in dopamine neurons induced by N-methyl-D-aspartate:
Role of electrogenic sodium pump. Science, 258, 665-667.
Ju, W., Morishita, W., Tsui, J., Gaietta, G., Deerinck, T. J., Ad-
ams, S. R., Garner, C. C., Tsien, R. Y., Ellisman, M. H., &
Malenka, R. C. (2004). Activity-dependent regulation of
dendritic synthesis and trafficking of AMPA receptors. Na-
ture Neuroscience,7, 244-253.
Kamin, L. J. (1968). "Attention-like" processes in classical con-
ditioning. In M. R. Jones (Ed.), Miami symposium on the
prediction of behavior (pp. 9-31). Coral Gables, FL: Univer-
sity of Miami Press.
Kamin, L. J. (1969). Predictability, surprise, attention and con-
ditioning. In B. A. Campbell & R. M. Church (Eds.), Pun-
ishment and aversive behavior (pp. 279-296). New York:
Appleton-Century-Crofts.
Kastak, C. R., Schusterman, R. J., & Kastak, D. (2001). Equiva-
lence classification by California sea lions using class-spe-
cific reinforcers. Journal of the Experimental Analysis of Be-
havior, 76, 131-156.
Killeen, P. R. (2015). The logistics of choice. Journal of the Ex-
perimental Analysis of Behavior, 104, 74-92.
Kirk, P. D. W., Babtie, A. C., & Stumpf, M. P. H. (2015). Systems
biology (un)certainties. Science, 350, 387–388.
Kumaran, D., Hassabis, D., & McClelland, J. L. (2016). What
learning systems do intelligent agents need? Complementary
learning systems theory updated. Trends in Cognitive Sci-
ences, 20, 512-534.
Lalive, A. L., Munoz, M. B., Bellone, C., Slesinger, P. A.,
Lüscher, C., & Tan, K. R. (2014). Firing modes of dopamine
neurons drive bidirectional GIRK channel plasticity.
Journal of Neuroscience, 34, 5107-5114.
Lattal, K. A. & Gleeson, S. (1990). Response acquisition with
delayed reinforcement. Journal of Experimental Psychol-
ogy: Animal Behavior Processes, 16, 27-39.
Leader, G., Barnes, D., & Smeets, P. M. (1996). Establishing
equivalence relations using a respondent-type training
procedure. The Psychological Record, 46, 685–706.
Lee, C. R. & Tepper, J. M. (2009). Basal ganglia control of
substantia nigra dopaminergic neurons. Journal of Neural
Transmission Supplement, 73, 71-90.
Leyland, C. M. & Mackintosh, N. J. (1978). Blocking of first-
and second-order autoshaping in pigeons. Animal Learning
& Behavior, 6, 392-394.
Li, S., Cullen, W. K., Anwyl, R., & Rowan, M. J. (2003). Dopa-
mine-dependent facilitation of LTP induction in hippocam-
pal CA1 by exposure to spatial novelty. Nature Neurosci-
ence, 6, 326-331.
Li, Z. S., Schmauss, C., Cuenca, A., Ratcliffe, E., & Gershon,
M. D. (2006). Physiological modulation of intestinal motil-
ity by enteric dopaminergic neurons and the D2 receptor:
Analysis of dopamine receptor expression, location, devel-
opment, and function in wild-type and knock-out mice. Jour-
nal of Neuroscience, 26, 2798 –2807.
Lionello-DeNolf, K. M. (2009). The search for symmetry.
Learning & Behavior, 37, 188-203.
MacDonall, J. S. (2009). The stay/switch model of concurrent
choice. Journal of the Experimental Analysis of Behavior,
91, 21-39.
Marr, M. J. (1992). Behavior dynamics: One perspective. Jour-
nal of the Experimental Analysis of Behavior, 57, 249-266.
Martin, S. J., Grimwood, P. D., & Morris, R. G. M. (2000). Syn-
aptic plasticity and memory: An evaluation of the hypothe-
sis. Annual Review of Neuroscience, 23, 649-711.
McDowell, J. J. (2013a). A quantitative evolutionary theory of
adaptive behavior dynamics. Psychological Review, 120,
731-750.
McDowell, J. J. (2013b). Representations of complexity: How
nature appears in our theories. Behavior Analyst, 36, 345-
359.
McNish, K. A., Betts, S. L., Brandon, S. E., & Wagner, A. R.
(1997). Divergence of conditioned eyeblink and conditioned
fear in backward Pavlovian training. Animal Learning & Be-
havior, 25, 43-62.
Migler, B. (1964). Effects of averaging data during stimulus gen-
eralization. Journal of the Experimental Analysis of Behav-
ior, 7, 303-307.
Migler, B. & Millenson, J. R. (1969). Analysis of response rates
during stimulus generalization. Journal of the Experimental
Analysis of Behavior, 12, 81-87.
Moorman, D. E. & Aston-Jones, G. (2010). Orexin/hypocretin
modulates response of ventral tegmental dopamine neurons
to prefrontal activation: Diurnal influences. Journal of Neu-
roscience, 30, 15585-15599.
Morse, W. H., & Skinner, B. F. (1957). A second type of super-
stition in the pigeon. American Journal of Psychology, 70,
308-311.
Oakley, D. A., & Russell, I. S. (1972). Neocortical lesions and
Pavlovian conditioning. Physiology & Behavior, 8, 915-926.
O’Keefe, J. & Nadel, L. (1978). The hippocampus as a cognitive
map. Oxford, UK: Oxford University Press.
Ortu, D. & Vaidya, M. (2016). The challenges of integrating be-
havioral and neural data: Bridging and breaking boundaries
across levels of analysis. The Behavior Analyst, 39, 1-16.
Palmer, D. C. (1986). The blocking of conditioned reinforce-
ment. Unpublished doctoral dissertation. University of Mas-
sachusetts, Amherst, MA.
Palmer, D. C. (1991). A behavioral interpretation of memory. In
L. J. Hayes & P. N. Chase (Eds.), Dialogues in verbal be-
havior (pp. 261-279). Reno, NV: Context Press.
Palmer, D. C. (2011). Consideration of private events is required
in a comprehensive science of behavior. Behavior Analyst,
34, 201-207.
Palmer, D. C. & Donahoe, J. W. (1992). Essentialism and selec-
tionism in cognitive science and behavior analysis. Ameri-
can Psychologist, 47, 1344-1358.
Papini, M. R. & Bitterman, M. E. (1990). The role of contin-
gency in classical conditioning. Psychological Review, 97,
396-403.
Pavlov, I. P. (1927/1960). Conditioned reflexes. New York: Dover
Press.
Pears, A., Parkinson, J. A., Hopewell, L., Everitt, B. J., & Rob-
erts, A. C. (2003). Lesions of the orbitofrontal but not medial
prefrontal cortex disrupt conditioned reinforcement in pri-
mates. Journal of Neuroscience, 23, 11189-11201.
Plenz, D. & Wickens, J. (2010). The striatal skeleton: Medium
spiny projection neurons and their lateral connections. In H.
Steiner and K. Y. Tseng (Eds.). Handbook of behavioral
neuroscience, Vol. 20 (pp. 99-112). New York: Elsevier Sci-
ence Press.
Pliskoff, S. S. (1971). Effects of symmetrical and asymmetrical
changeover delays on concurrent performances. Journal of
the Experimental Analysis of Behavior, 16, 249–256.
Rescorla, R. A. (1968). Probability of shock in the presence and
absence of CS in fear conditioning. Journal of Comparative
and Physiological Psychology, 66, 1-5.
Rescorla, R. A. & Wagner, A. R. (1972). A theory of Pavlovian
conditioning: Variations in the effectiveness of reinforce-
ment and nonreinforcement. In A. H. Black & W. F. Prokasy
(Eds.), Classical conditioning: Current research and theory
(pp. 64-99). New York: Appleton-Century-Crofts.
Reynolds, J. N. J., Hyland, B. I., & Wickens, J. R. (2001). A
cellular mechanism of reward-related learning. Nature, 413,
67-70.
Salamone, J. D., Correa, M., Nunes, E. J., Randall, P. A., &
Pardo, M. (2013). The behavioral pharmacology of effort-
related choice behavior: Dopamine, adenosine, and beyond,
Journal of the Experimental Analysis of Behavior, 97, 97-
125.
Sakai, K. & Miyashita, Y. (1991). Neural organization for the
long-term memory of paired associates. Nature, 354, 152-
155.
Sanes, J. N. & Donoghue, J. P. (2000). Plasticity and primary
motor cortex. Annual Review of Neuroscience, 23, 393-415.
Scheuerman, K. V., Wildemann, D. G., & Holland, J. G. (1978).
A clarification of continuous repertoire development. Jour-
nal of the Experimental Analysis of Behavior, 30, 197-203.
Schultz, W. (1997). Adaptive dopaminergic neurons report the
appetitive value of environmental stimuli. In J. W. Donahoe
& V. P. Dorsel (Eds.), Neural-network models of cogni-
tion (pp. 317-335). New York: Elsevier Science Press.
Schultz, W., Apicella, P., & Ljungberg, T. (1993). Conditioned
stimuli during successive steps of learning a delayed re-
sponse task. Journal of Neuroscience, 13, 900-913.
Schultz, W., Dayan, P., & Montague, P. R. (1997). A neural sub-
strate of prediction and reward. Science, 275, 1593-1599.
Seidenbecher, T., Balschun, D., & Reymann, K. G. (1995).
Drinking after water deprivation prolongs unsaturated LTP
in the dentate gyrus of rats. Physiology & Behavior, 57,
1001–1004.
Shahan, T. A. (2010). Conditioned reinforcement and response
strength. Journal of the Experimental Analysis of Behavior,
93, 269–289.
Sidman, M. (1960). Tactics of scientific research. New York:
Basic Books.
Sidman, M. (2000). Equivalence relations and the reinforcement
contingency. Journal of the Experimental Analysis of Behav-
ior, 74, 127-146.
Sidman, M. & Tailby, W. (1982). Conditional discrimination vs.
matching to sample: An expansion of the testing paradigm.
Journal of the Experimental Analysis of Behavior, 37, 5-22.
Sidman, M., Rauzin, R., Lazar, R., Cunningham, S., Tailby, W.,
& Carrigan, P. (1982). A search for symmetry in the condi-
tional discriminations of rhesus monkeys, baboons, and chil-
dren. Journal of the Experimental Analysis of Behavior, 37,
23-44.
Siegel, S. & Allan, L. G. (1996). The widespread influence of the
Rescorla-Wagner model. Psychonomic Bulletin & Review,
3, 314-321.
Silva, F. J., Timberlake, W., & Cevik, M. O. (1998). A behavior
systems approach to the expression of backward associa-
tions. Learning & Motivation, 29, 1-22.
Singer, W. (1997). Development and plasticity of neocortical
processing architectures. In J. W. Donahoe & V. P. Dorsel
(Eds.), Neural-network models of cognition (pp. 142-159).
New York: Elsevier Science Press.
Skinner, B. F. (1937). Two types of conditioned reflex: A reply
to Konorski and Miller. Journal of General Psychology, 16,
272-279.
Skinner, B. F. (1938). The behavior of organisms. New York:
Appleton-Century-Crofts.
Skinner, B. F. (1948). "Superstition" in the pigeon. Journal of
Experimental Psychology, 38, 168-172.
Skinner, B. F. (1974). About behaviorism. New York: Random
House.
Smith, D. V., Liu, H., & Vogt, M. B. (1996). Responses of gus-
tatory cells in the nucleus of the solitary tract of the hamster
after NaCl or amiloride adaptation. Journal of Neurophysiol-
ogy, 76, 47–58.
Smith, M. C., Coleman, S. P. & Gormezano, I. (1969). Classical
conditioning of the rabbit nictitating membrane response at
backward, simultaneous, and forward CS-US intervals.
Journal of Comparative and Physiological Psychology, 69,
226-231.
Solomon, R. L. & Corbit, J. D. (1974). An opponent-process the-
ory of motivation: I. Temporal dynamics of affect. Psycho-
logical Review, 81, 119-145.
Solomon, R. L. & Turner, L. H. (1962). Discriminative classical
conditioning in dogs paralyzed by curare can later control
conditioned avoidance responses in the normal state. Psy-
chological Review, 69, 202-219.
Squire, L. R. (2004). Memory systems of the brain: A brief his-
tory and current perspective. Neurobiology of Learning and
Memory, 82, 171–177.
Staddon, J. E. R. (1983). Adaptive behavior and learning. New
York: Cambridge University Press.
Staddon, J. E. R. (2014). On choice and the Law of Effect. Inter-
national Journal of Comparative Psychology, 27, 569-584.
Staddon, J. E. R. & Simmelhag, V. L. (1971). The “superstition”
experiment: A reexamination of its implications for the prin-
ciples of adaptive behavior. Psychological Review, 78, 3-43.
Stein, L., Xue, B. G., & Belluzzi, J. D. (1993). A cellular ana-
logue of operant conditioning. Journal of the Experimental
Analysis of Behavior, 60, 41-55.
Stickney, K. J., Donahoe, J. W., & Carlson, N. R. (1981). Con-
ditioning preparation for the nictitating membrane of the pi-
geon. Behavior Research Methods & Instrumentation, 13,
633-656.
Stickney, K., & Donahoe, J. W. (1983). Attenuation of blocking
by a change in US locus. Animal Learning & Behavior, 11,
60-66.
Stryker, M. P. (1986). Binocular impulse blockade prevents the
formation of ocular dominance columns in cat visual cortex.
Journal of Neuroscience, 6, 2117-2133.
Sutton, R. S. & Barto, A. G. (1981). Toward a modern theory of
adaptive networks: Expectation and prediction. Psychologi-
cal Review, 88, 135-170.
Swanson, L. W. & Kohler, C. (1986). Anatomical evidence for
direct projections from the entorhinal to the entire cortical
mantle in the rat. Journal of Neuroscience, 6, 3010-3023.
Thorndike, E. L. (1903). Elements of psychology. New York: A.
G. Seiler.
Tomita, H., Ohbayashi, M., Nakahara, K., Hasegawa, I. &
Miyashita, Y. (1999) Top-down signal from prefrontal cor-
tex in executive control of memory retrieval. Nature, 401,
699-703.
Tong, Z. Y., Overton, P. G., & Clark, D. (1996). Stimulation of the
prefrontal cortex in the rat induces patterns of activity in
midbrain dopaminergic neurons which resemble natural
burst events. Synapse, 22, 195–208.
Trowill, J. A. (1967). Instrumental conditioning of the heart rate
in the curarized rat. Journal of Comparative and Physiolog-
ical Psychology, 63, 7-11.
Urcuioli, P. J. (2008). Associative symmetry, antisymmetry, and a the-
ory of pigeons’ equivalence-class formation. Journal of the
Experimental Analysis of Behavior, 90, 257-282.
Volkow, N. D., Fowler, J. S., & Wang, G-J. (2007). Dopamine
in drug abuse and addiction. Archives of Neurology, 64,
1575-1579.
vom Saal, W. & Jenkins, H. M. (1970). Blocking the develop-
ment of stimulus control. Learning & Motivation, 1, 52-64.
Waelti, P., Dickinson, A., & Schultz, W. (2001). Dopamine re-
sponses comply with basic assumptions of formal learning
theory. Nature, 412, 43-48.
Williams, B. A. (1975). The blocking of reinforcement control.
Journal of the Experimental Analysis of Behavior, 24, 215-
226.
Williams, B. A. (1994a). Conditioned reinforcement: Neglected
or outmoded explanatory construct? Psychonomic Bulletin
& Review, 1, 457–475.
Williams, B. A. (1994b). Conditioned reinforcement: Experi-
mental and theoretical issues. Behavior Analyst, 17, 261-285.
Wilson, D. I. & Bowman, M. (2004). Nucleus accumbens neu-
rons in the rat exhibit differential activity to conditioned re-
inforcers and primary reinforcers within a second-order
schedule of saccharin reinforcement. European Journal of
Neuroscience, 20, 2777-2788.
Wise, R. A. (2002). Brain reward circuitry: Insights from un-
sensed incentives. Neuron, 36, 229-240.
Yagishita, S., Hayashi-Takagi, A., Ellis-Davies, G. C. R., Ura-
kubo, H., Ishii, S., & Kasai, H. (2014). A critical time win-
dow for dopamine actions on the structural plasticity of den-
dritic spines. Science, 345, 1616-1620.
Zentall, T. R., Wasserman, E. A., & Urcuioli, P. J. (2013). As-
sociative concept learning in animals. Journal of the Exper-
imental Analysis of Behavior, 101, 130-151.
Endnotes
1
The body of the article is self-contained, but supplementary technical information is occasionally intro-
duced by means of endnotes.
2
“Experimental analysis,” as Skinner used the term, referred to circumstances in which all of the effects
of the variables relevant to a phenomenon are either manipulated, measured, or controlled. This idealized
set of circumstances is only approximated even in the laboratory. Classical procedures in which the condi-
tioned response and the elicited response closely resemble one another—such as the salivary response of
the dog (Pavlov, 1927/1960) and the nictitating-membrane response of the rabbit (Gormezano, 1966) or
the pigeon (Stickney, Donahoe, & Carlson, 1981)—very closely approximate these requirements. How-
ever, other classical procedures reveal the more complex outcomes that can be produced by the condition-
ing process. As one example, the conditioned-suppression procedure devised by one of Skinner’s students
(Estes & Skinner, 1941) demonstrated that an unmonitored unconditioned response (the response to
shock) may interfere with the execution of a monitored operant (bar pressing), thereby providing an indi-
rect but sensitive measure of conditioning. This procedure eventually led to the important discovery of the
discrepancy requirement, which is described subsequently (Kamin, 1968; Rescorla, 1968). As a second
example of a complex outcome produced by a classical procedure, the response measured after condition-
ing may sometimes appear to be the opposite to the response to the US: For example, a conditioned stim-
ulus paired with the injection of glucose produces a decrease in circulating glucose when the stimulus is
presented alone. However, the true unconditioned response to the injection of glucose is not an increase in
circulating glucose, but a decrease the glucose liberated from glucose-storing organs when neurons detect
an increase in exogenous glucose (see Eickelboom & Stewart, 1982). In general, pre-existing homeostatic
and other mechanisms resulting from natural selection must be taken into account when interpreting the
results of any conditioning procedure (cf. Dworkin, 1993; Solomon & Corbit, 1974).
3
The statement that conditioning in the classical procedure occurs over only brief temporal intervals be-
tween the CS and the reinforcer is based on procedures in which the reinforcer-elicited response has a
short latency and brief duration. Findings reported later indicate that it is not the temporal relation of the
CS to the reinforcing stimulus per se that is critical but the relation of the CS to the behavior produced by
the reinforcer. For example, in taste aversions a novel taste (CS) followed by the ingestion of a nonlethal
poison becomes aversive even though the behavioral effects (Relicited) of ingestion occur some hours later.
In the evolutionary history of organisms, gustatory and olfactory stimuli have inevitably preceded the gas-
tric consequences of ingestion. This has promoted the selection of neural structures relating such stimuli
to gastric responses. Neural traces of these stimuli endure and are then contiguous with the gastric conse-
quences (Chambers, 1990). In respects other than the temporal disjunction between the taste CS and the
gastric Relicited to the poison US, findings from the conditioning of taste aversions are consistent with those
from other conditioning preparations (Domjan & Galef, 1983).
4
Several other studies conducted around the time of these critical experiments pointed toward the insuffi-
ciency of temporal contiguity for conditioning (e.g., Johnson, 1970), but their implications were not fully
appreciated (cf. Williams, 1975).
5
The following section of the paper takes the position that an analysis of moment-to-moment events re-
veals the dynamical processes of which relations between environmental and behavioral events observed
over more extended periods of time are the product. It is generally agreed that stable molar relations, as
useful as they may be for some purposes, are silent with respect to the more “molecular” processes of
which they are the asymptotic expression (e.g., Donahoe, 2012; Marr, 1992; Staddon, 2014). In addition,
studies indicate that at least some molar E-B relations, such as matching, may be the concerted product of
multiple discriminated operants (e.g., Crowley & Donahoe, 2004; Killeen, 2015; MacDonall, 2009; Plis-
koff, 1971). Indeed, computer simulations that implement moment-to-moment processes have yielded as-
ymptotic results that reproduce some molar relations such as matching (e.g., McDowell, 2013a; Calvin &
McDowell, 2016) and the effects of varying the C/T ratio on acquisition (Burgos, 2005), where C is the
length of the CS-US interval and T is the total session length (Gibbon & Balsam, 1981). A conceptually
independent but often conflated difference between theoretical approaches to behavior theory concerns
whether it is sufficient for a theory to serve as a guide for the behavior of the theorist (e.g., Jensen, Ward,
& Balsam, 2013; Gallistel & King, 2009; cf. Donahoe, 2010) or whether theory should also embody the
biobehavioral processes that underlie the functional relations captured by the theory (McDowell, 2013b;
cf., Donahoe, 2013).
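To make the moment-to-moment claim concrete, the following toy simulation in Python (a sketch in the spirit of, though far simpler than, the models cited above; all parameter values are arbitrary) adjusts a momentary choice probability only at moments of reinforcement on two concurrent variable-interval schedules.

import random
random.seed(1)

arm_prob = {"left": 0.05, "right": 0.15}     # per-step VI arming probabilities
armed = {"left": False, "right": False}
p_left, step_size = 0.5, 0.005               # momentary probability of a left response
responses = {"left": 0, "right": 0}
reinforcers = {"left": 0, "right": 0}

for _ in range(200000):
    for k in armed:                          # each schedule arms independently and
        if not armed[k] and random.random() < arm_prob[k]:
            armed[k] = True                  # holds its reinforcer until collected
    choice = "left" if random.random() < p_left else "right"
    responses[choice] += 1
    if armed[choice]:                        # the response is reinforced
        armed[choice] = False
        reinforcers[choice] += 1
        target = 1.0 if choice == "left" else 0.0
        p_left += step_size * (target - p_left)

print(responses["left"] / sum(responses.values()))      # response proportion
print(reinforcers["left"] / sum(reinforcers.values()))  # reinforcer proportion

The update rule is at equilibrium only when the probability of a left response equals the proportion of reinforcers obtained on the left, so the two printed proportions approximately coincide; a molar matching relation thus emerges as the asymptotic product of a purely momentary process.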
6
Although dopamine plays a critical role in reinforcement, not all instances of long-lasting changes in
synaptic plasticity are dopamine-dependent (e.g., Bauer, Schafe, & LeDoux, 2002).
7
The conditioning of autonomic responses with operant procedures has been notoriously difficult to
achieve (Dworkin & Miller, 1986). Autonomic responses are mediated at the level of the midbrain and
need not involve cortical mechanisms, including the circuits involved with conditioned reinforcement de-
scribed here (e.g., Oakley & Russell, 1972). Computer simulations indicate that conditioned reinforce-
ment is needed to bridge the temporal gap between the operant and the reinforcer (Donahoe & Burgos,
2005). These circuits are not available at the level of the midbrain. It is of interest that the few studies that
have successfully demonstrated operant conditioning of autonomic responses have minimized delay of re-
inforcement by using immediate electrical stimulation of the brain as a reinforcer (e.g., Trowill, 1967).
8
Deficiencies in the neural mechanisms of conditioned reinforcement have been proposed as an important
factor in the behavioral deficits found along the autism spectrum (Donahoe & Vegas, in press). In one study,
diminished activity in the NAC to a visual conditioned reinforcer was found with autistic subjects (Dichter,
Felder, Green, Rittenberg, Sasson, & Bodfish, 2012). If the numbers and sources of connections from regions
in prefrontal cortex to NAC and, possibly, VTA are reduced for those with autism, then the potential for
stimuli that activate those regions to serve as conditioned reinforcers would be impaired. As an example,
suppose that neurons in the region of the prefrontal cortex that receive inputs from the sensory areas in-
volved in face perception do not have connections to NAC. Under these circumstances, seeing human faces
would not serve as a source of conditioned reinforcement. Instead, eye contact with another person would
be viewed as a threat gesture, the typical reaction in other primates (Emery, Lorincz, Perrett, Oram, & Baker,
1997).
9
The present discussion of neural mechanisms describes the major systems underlying reinforcement.
Although much is known of these systems and their underlying processes, much remains to be known
(e.g., Gerfen & Surmeier, 2011). As an example relating to the DA-ergic mechanisms of the discrepancy
requirement, DA neurons in the VTA are maintained at their low baseline (tonic) rates of firing by inhibi-
tory neurons acting on VTA neurons from the output of NAC (Lalive, Munoz, Bellone, & Slesinger,
2014). Unconditioned reinforcers drive the VTA neurons sufficiently to overcome this tonic inhibition.
The onset of conditioned reinforcers activates Glu pathways from prefrontal cortex to neurons in NAC, and
these increase DA activity by stimulation of DA VTA neurons (Geisler, Derst, Veh, & Zahm, 2007)
and/or by inhibiting inhibitory neurons from NAC to VTA, thereby briefly liberating DA from VTA neu-
rons before tonic inhibition is reinstated (Aggarwal, Hyland, & Wickens, 2012; Creed, Ntamati, & Tan,
2014; Moorman & Aston-Jones, 2010; Tong, Overton, & Clark, 1996). As an additional complication,
inputs from the hypothalamus to the region of the VTA co-release the neuropeptide orexin/hypocretin
which provides a mechanism whereby the state of deprivation affects DA VTA activity (cf. Burdakov,
Liss, & Ashcroft, 2003; Gao & Horvath, 2014; Seidenbecher, Balschun, & Reymann, 1995). Finally,
NAC is itself not a homogeneous structure; the core and shell of the nucleus are differently innervated
(Dreyer, Vander-Weele, & Lovic, 2016) and have somewhat different functions (Corbit & Balleine,
2011). (For a different view of the place of DA in learned behavior see Berridge, 2007).
10
The behavioral expression of the UR is not necessary for conditioning as demonstrated by cases in
which the behavioral UR is prevented (as when transmission at the neuromuscular junction is blocked;
e.g., Solomon & Turner, 1962) or by cases in which the environment does not support expression of the
UR (as when an auditory CS is paired with food for the pecking response of the pigeon in an autoshaping
experiment; e.g., Leyland & Mackintosh, 1978). The response elicited by the reinforcer is the behavioral
event that is most closely synchronized with the occurrence of the DA-ergic neural mechanisms of rein-
forcement (Donahoe & Vegas, 2004). However, once these neural mechanisms have been naturally se-
lected, their behavioral expression is not necessary for the environment to engage them.
11
DA receptors of the type involved here are coupled to a particular G protein that catalyzes the synthesis
of cAMP (cyclic adenosine monophosphate), which is the starting point for a series of intracellular second
messengers that stimulate protein synthesis. Studies have shown that the injection of cAMP into a cell
with tagged Glu receptors produces LTP of those receptors even though the cell was not depolarized suf-
ficiently to otherwise cause LTP. LTP may result from prolonging the opening of ion channels of Glu re-
ceptors or by producing additional Glu receptors (so called “latent” receptors) in the postsynaptic mem-
brane (Ju, Morishita, Tsui, Gaietta, Deerinck, Adams, Garner, Tsien, Ellisman, & Malenka, 2004). See
Frey (1997) and Martin, Grimwood, and Morris (2000) for additional information. As indicated later, neuro-
modulators other than DA can enable LTP.
12
The presentation has focused on the effect of DA arising from the VTA that acts on neurons in the pre-
frontal cortex and NAC. These projections make up the mesocortical and mesolimbic DA systems, re-
spectively. Another system of DA projections arises from the substantia nigra pars compacta (SN), a
midbrain nucleus adjacent to the VTA. The SN system projects to a region called the striatum, the region
referenced in this footnote. DA-ergic neurons in the SN are also activated by reinforcing stimuli (Lee &
Tepper, 2009). In addition to these DA-ergic inputs, neurons in the striatum receive converging inputs
from motor regions of the prefrontal cortex and from the thalamus (Plenz & Wickens, 2010). Striatal neu-
rons are thus well positioned to integrate activity from their sensory thalamic inputs with their reinforced
prefrontal motor inputs (cf. Bradfield, Bertran-Gonzalez, Chieng, & Balleine, 2013). More "downstream"
motor systems then lead to observable behavior. A provocative speculation is that DA from the VTA acts
upon neural circuits in prefrontal cortex that specify those striatal neurons that are most active prior to re-
inforcement and that DA from SN then strengthens connections from sensory inputs to those co-active
striatal neurons. As the result of such a process, reinforced E-B relations could ultimately be mediated be-
fore fully engaging prefrontal activity, for example the activity mediating subvocal verbal behavior (i.e.,
consciousness). Through this “short-circuiting” process, such behavior would become “automatic” (Do-
nahoe, 1997, p. 354).
13
A wide variety of conditioning phenomena have been simulated in neural-network research that imple-
ments the foregoing mechanisms of reinforcement using networks whose architecture is consistent with
the neural systems described here (see Burgos & Donahoe, 2016).
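For readers unfamiliar with such simulations, the following fragment sketches the general form of a discrepancy-modulated Hebbian update of the kind such networks implement; it is illustrative only, not the published model, and its dimensions and parameter values are arbitrary.

import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, (3, 5))       # modifiable connection weights
alpha = 0.5                            # illustrative learning-rate parameter
x = rng.random(5)                      # presynaptic (CS-driven) activations

for _ in range(100):
    y = 1.0 / (1.0 + np.exp(-(w @ x)))          # postsynaptic activations
    discrepancy = 1.0 - y.max()                 # DA-like signal: large only while
                                                # the reinforcer is poorly predicted
    w += alpha * discrepancy * np.outer(y, x)   # strengthen co-active synapses

y = 1.0 / (1.0 + np.exp(-(w @ x)))
print(round(float(y.max()), 3))                 # approaches 1.0 with training

Each connection is strengthened in proportion to the co-activity of its pre- and postsynaptic units, gated by a single diffusely broadcast discrepancy signal; acquisition is therefore negatively accelerated and ceases when the discrepancy reaches zero.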
14
LTP was first demonstrated in the hippocampus (Bliss & Lømo, 1973). While DA enables
LTP at CA1 synapses, other agents play a critical role at synapses within the trisynaptic circuit of the hip-
pocampus (e.g., Stein, Xue, & Belluzzi, 1993). In S-A cortex, neurotransmitters such as noradrenaline,
acetylcholine, and/or serotonin play a role that is functionally similar to that of DA in fostering LTP in
prefrontal cortex and CA1 synapses (Singer, 1997). Early studies of the hippocampus documented its role
in spatial discriminations (O’Keefe & Nadel, 1978). Spatial discriminations require integrating multiple
sensory inputs to specify a particular location within the physical environment. However, the role of the
hippocampus is much more general in that it integrates multisensory inputs in a wide variety of situations,
with spatial discrimination being but one example.
15 A considerable array of procedures other than standard matching-to-sample procedures has provided
evidence of the formation of equivalence classes (e.g., Arntzen, 2006; Arntzen, Nartey, & Fields, 2015,
Arntzen, Norbom & Fields, 2015; Fields, Reeve, Varelas, Rosen, & Belanich, 1997; Grisante, Galesi,
Sabino, Debert, & Arntzen, 2013; Leader, Barnes, & Smeets, 1996). I thank Prof. Erik Arntzen for
providing some of these references.