All content in this area was uploaded by Lee Hogarth on Mar 21, 2018
Ann. N.Y. Acad. Sci. ISSN 0077-8923
ANNALS OF THE NEW YORK ACADEMY OF SCIENCES
Issue: Addiction Reviews
Associative learning mechanisms underpinning the
transition from recreational drug use to addiction
Lee Hogarth,1 Bernard W. Balleine,2 Laura H. Corbit,3 and Simon Killcross1
1School of Psychology, University of New South Wales, Sydney, Australia. 2Brain and Mind Research Institute, University of
Sydney, Sydney, Australia. 3School of Psychology, University of Sydney, Sydney, Australia
Address for correspondence: Lee Hogarth, School of Psychology, University of New South Wales, Sydney, NSW 2052,
Australia. l.hogarth@unsw.edu.au
Learning theory proposes that drug seeking is a synthesis of multiple controllers. Whereas goal-directed drug seeking
is determined by the anticipated incentive value of the drug, habitual drug seeking is elicited by stimuli that have
formed a direct association with the response. Moreover, drug-paired stimuli can transfer control over separately
trained drug seeking responses by retrieving an expectation of the drug’s identity (specific transfer) or incentive
value (general transfer). This review covers outcome devaluation and transfer of stimulus-control procedures in
humans and animals, which isolate the differential governance of drug seeking by these four controllers following
various degrees of contingent and noncontingent drug exposure. The neural mechanisms underpinning these four
controllers are also reviewed. These studies suggest that although initial drug seeking is goal-directed, chronic drug
exposure confers a progressive loss of control over action selection by specific outcome representations (impaired
outcome devaluation and specific transfer), and a concomitant increase in control over action selection by antecedent
stimuli (enhanced habit and general transfer). The prefrontal cortex and mediodorsal thalamus may play a role in
this drug-induced transition to behavioral autonomy.
Keywords: addiction; learning theory; goal; cue-reactivity; habit
Introduction
A recurrent theme in addiction theory is that drug
seeking has multiple determinants. Wikler1 argued
that the euphoric effects of the drug maintained
initial drug use, whereas addiction itself stemmed
from the emergence of a withdrawal syndrome.
Tolerance2 and opponent-process theories3 elaborated
this notion of a shift from positive to negative
reinforcement. Subsequently, the importance
of negative reinforcement was questioned by the
observation that drug self-administration engages
dopamine, the brain substrate of reward,4 and by the
lawful relationship between the frequency of drug
seeking and the magnitude of drug reward.5 But
denying the importance of negative reinforcement
(but see Ref. 6) put positive reinforcement theorists
at pains to explain the transition from recreational
drug use to addiction. Tiffany7 answered this question
from a cognitive viewpoint, arguing that drug
seeking may be mediated by desire, or elicited au-
tomatically by drug cues, and the latter controller
predominates in addiction. Robinson and Berridge8
made a similar argument from a behavioral neuro-
science perspective, stating that drug seeking may be
driven by hedonic anticipation of the drug (liking),
or autonomous cue-locked conditioned behavior
(wanting), thus accounting for addicts’ paradoxi-
cal continuation of drug use despite intentions to
quit.
Contemporary addiction theories have elabo-
rated these themes. The behavioral economists have
garnered evidence that human drug dependence
is a choice recruited by the reinforcement value
of the drug,9 but is also accompanied by an inability
to use knowledge of abstract future consequences
in decision making.10 Similarly, animal
learning theorists have substantiated evidence that
drug self-administration is a function of the rein-
forcement value of the drug11,12 but also undergoes a
doi: 10.1111/j.1749-6632.2012.06768.x
Ann. N.Y. Acad. Sci. 1282 (2013) 12–24 © 2012 New York Academy of Sciences.
Hogarth et al. Abnormal learning underpinning dependence
transition to automatic control by drug-paired stim-
uli.13 Finally, cognitive neuroscientists have shown
that drug liking is associated with drug-induced
dopamine activation14 and that clinically diagnosed
addiction is accompanied by hypofrontality and ex-
ecutive dysfunction.15 The common theme in all
of these frameworks, therefore, is that initial drug
use is mediated by the drug acting as a positive rein-
forcer, whereas the transition to clinical dependence
is linked to a loss of intentional regulation and con-
comitant emergence of automatic control over drug
seeking.
Learning theory and addiction
The current review aims to detail this transitional
theory of addiction by inspecting human and ani-
mal learning research that has tested the differential
governance of behavior at various stages of drug
exposure. The ideas developed here were first in-
troduced by Norman White who drew a link be-
tween the role of the striatum in memory and ad-
dictive behavior.16,17 The formal associative learning
account was then outlined by Anthony Dickin-
son during symposium proceedings from empiri-
cal work with natural rewards.18 These ideas were
then translated to behavioral neuroscience research
with addictive drugs in collaboration with Trevor
Robbins and Barry Everitt.19–21 Simultaneously, be-
havioral neuroscience research continued with nat-
ural rewards that clarified the associative mecha-
nisms outlined here,22–24 and which are depicted
schematically in Figure 1. According to this per-
spective, experience of the drug outcome is encoded
separately in terms of its specific sensory correlates
or perceptual identity (Oi) and its consummatory,
postingestive or incentive value (Ov), and these two
representations of the drug can differentially enter
into associations.25,26 As a consequence, the agent
(person or animal) acquires four forms of associa-
tive knowledge.
Figure 1. Experience of the drug outcome is separately encoded
in terms of its perceptual identity (Oi) and incentive
value (Ov) and establishes learning about (1) the goal-directed
instrumental contingency between drug seeking response and
the drug (R–Oiv); (2) the habitual contingency between drug
stimuli and the drug seeking response (S–R); and (3) the Pavlovian
contingency between drug stimuli and the drug (S–Oiv).
It is argued that chronic drug exposure generates a progressive
impairment in capacity to retrieve or utilize the specific identity
of outcomes (Oi), which causes a transition in behavioral
control from the R–Oiv and S–Oi associations to the S–R and
S–Ov associations. That is, addiction reflects a loss of control
over behavior by knowledge of the consequences indexed by
outcome devaluation and specific transfer, in favor of control
by antecedent stimuli indexed by devaluation insensitivity and
general transfer.
(1) Goal-directed learning. The agent acquires
knowledge of the instrumental contingency
between the drug seeking response and the
drug’s identity and value (R–Oiv). Moreover,
the representation of the drug’s value is updated
by internal states, such as deprivation
or satiety, which predict the current value of
the drug. Consequently, retrieval of the representation
of the drug and its current value
(Oiv) determines the propensity to select the
associated drug seeking responses from among
competing outcome choices based on a comparison
of their relative values.27 Thus, a higher
value drug produces a greater proportion of intentional
choice of that outcome from among
alternative rewards.28
(2) Habit learning. The agent forms an associ-
ation between external stimuli (S) and the
drug seeking response (R) in proportion to
the contingent co-occurrence of these two
events before drug reinforcement and the re-
inforcement value of that outcome (Ov).29
This S–R/reinforcement process enables the
drug stimulus, when reencountered, to elicit
the drug seeking response directly without re-
trieving any representation of the drug out-
come. Such habitual drug seeking accords with
the clinical characterization of addiction as
reflecting a loss of intentional regulation of
behavior.
(3) Specific transfer. External stimuli also acquire
an association with the drug outcome in accor-
dance with the predictive contingency between
these events, enabling stimuli to retrieve a rep-
resentation of the drug’s identity and/or value.
Retrieval of the outcome’s identity (S–Oi) can,
in turn, elicit separately trained instrumental
responses that are associated with that same
outcome via a bidirectional O–R, or ideomo-
tor, connection (S–Oi–R).30
(4) General transfer. By contrast, retrieval of the
outcome’s affective value (S–Ov) elicits a mo-
tivational state akin to the drug itself, which
exerts a general excitatory effect on prevailing
responses controlled by the other associations
([S–Ov]–R).31
This paper claims that these various forms
of behavioral control interact to determine
the propensity to engage in drug seeking at any
given moment. Specifically, continuing drug
exposure impairs retrieval or utilization of the
representation of specific outcome identities (Oi), thus
shifting control of action away from knowledge of specific
outcomes (R–Oiv and S–Oi–R) and toward more general
control by antecedent stimuli (S–R and
[S–Ov]–R). We now turn to empirical evidence for
this psychological account of addiction.
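The dependence of each controller on the outcome representations can be laid out as a small lookup structure. The following is only an illustrative sketch of the taxonomy described above, not a computational model from the literature; the class name, field names, and the helper `available_controllers` are hypothetical, and the ASCII labels stand in for the R–Oiv, S–R, S–Oi–R, and [S–Ov]–R notation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Controller:
    """One of the four associative controllers of drug seeking (illustrative)."""
    name: str
    association: str          # ASCII rendering of the paper's notation
    retrieves_identity: bool  # performance relies on outcome identity (Oi)
    retrieves_value: bool     # performance relies on current outcome value (Ov)


# Assumption: flags follow the verbal descriptions in the text above.
CONTROLLERS = (
    Controller("goal-directed", "R-Oiv", True, True),
    Controller("habit", "S-R", False, False),
    Controller("specific transfer", "S-Oi-R", True, False),
    Controller("general transfer", "[S-Ov]-R", False, True),
)


def available_controllers(identity_intact: bool) -> List[str]:
    """Return the controllers that can still operate.

    If retrieval or utilization of outcome identity (Oi) is impaired,
    only controllers that do not rely on Oi remain.
    """
    return [
        c.name
        for c in CONTROLLERS
        if identity_intact or not c.retrieves_identity
    ]


if __name__ == "__main__":
    print(available_controllers(identity_intact=True))
    print(available_controllers(identity_intact=False))
```

With `identity_intact=False`, only the habit and general transfer controllers remain, which is the shift in behavioral control that the account proposes.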
Goal-directed drug seeking
The outcome devaluation procedure provides the
principal method for identifying goal-directed con-
trol.32 A version of this procedure is presented in
Table 1. In this procedure, rats learn that two dif-
ferent lever press responses (R) produce different
rewarding outcomes (O). For example, one lever
may produce drug reward such as alcohol or cocaine
(O1), whereas the other lever produces an alterna-
tive natural reward such as sucrose (O2). The drug
is then devalued by pairing it with lithium chloride-
induced gastrointestinal sickness, specific satiety, or
related manipulation, such that the value of the drug
is diminished. The critical test then comes when
the animal is again given the opportunity to press
the two levers in an extinction test where the re-
sponses no longer produce their respective rewards.
The question at stake is whether the animal will reduce
responding for the drug outcome (R1 < R2).
Because the outcomes are not presented in the extinction
test, any such devaluation effect cannot be
attributed to S–R/reinforcement (habit) learning,
that is, by experience of the drug outcome modulating
the capacity of contextual cues to elicit the drug
seeking response. Furthermore, because the procedure
contains no stimuli that differentially signal
the two outcomes, a devaluation effect cannot be
attributed to a change in capacity of such cues to
elicit responding for their associated outcomes (S–Oiv–R).
Instead, a reduction in drug choice in the
extinction test must be mediated by animals’ integration
of knowledge of the R–Oiv contingencies
acquired during instrumental training, with knowledge
of current low value of the drug outcome (Ov)
acquired during the devaluation treatment, which
together determine the propensity to select that response.
In other words, a devaluation effect in the
extinction test demonstrates that drug seeking is
goal-directed in that it is determined by the anticipated
reward value of the drug.
Table 1. The outcome devaluation procedure used to
demonstrate goal-directed versus habitual control of action
selection. The agent learns that two responses (R1
and R2) earn different rewarding outcomes (O1 and O2).
One outcome is then devalued (O1 or O2) before an extinction
test in which the agent can again perform the
two responses (R1/R2) without feedback from the outcomes.
A reduction in choice of the response that earned
the devalued outcome (R1 ≠ R2) must be goal-directed in
that it is controlled by knowledge of the R–O contingency
and the current incentive value of the O. By contrast, a
null effect of the devaluation treatment on responding in
the extinction test (R1 = R2) suggests responding is habitual
in that it is elicited directly by the stimulus context
without engaging knowledge of the consequences (S–R).
Instrumental training    Devaluation treatment    Extinction test
R1–O1                    O1 or O2                 R1/R2
R2–O2
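The inference drawn from the extinction test can be sketched as a simple decision rule. This is a minimal illustration only: the function names, the proportional index, and the `margin` threshold are assumptions introduced here, and published studies rely on inferential statistics across groups rather than a fixed cutoff.

```python
def devaluation_index(devalued_rate: float, nondevalued_rate: float) -> float:
    """Proportion of extinction-test responding directed at the
    response whose outcome was NOT devalued (0.5 means no preference)."""
    total = devalued_rate + nondevalued_rate
    if total <= 0:
        raise ValueError("no responding recorded in the extinction test")
    return nondevalued_rate / total


def classify_control(devalued_rate: float, nondevalued_rate: float,
                     margin: float = 0.1) -> str:
    """Label responding as goal-directed or habitual.

    margin is a hypothetical tolerance around indifference (0.5);
    it is not a value taken from any cited study.
    """
    index = devaluation_index(devalued_rate, nondevalued_rate)
    if index > 0.5 + margin:
        # Responding shifted away from the devalued outcome.
        return "goal-directed"
    if index < 0.5 - margin:
        # More responding for the devalued outcome: not predicted
        # by either account, so flag it rather than force a label.
        return "reversed"
    # Devaluation left choice unchanged.
    return "habitual"


if __name__ == "__main__":
    # Hypothetical response rates (presses per minute) in extinction.
    print(classify_control(devalued_rate=5, nondevalued_rate=15))
    print(classify_control(devalued_rate=10, nondevalued_rate=10))
```

Here the devalued response plays the role of R1 when O1 is devalued, so a "goal-directed" label corresponds to R1 < R2 and a "habitual" label to R1 = R2 in the notation of the procedure.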
Two studies illustrate the outcome devaluation
procedure in demonstrating goal-directed control
of drug seeking. In a study by Olmstead et al.,33 rats
were trained on a seeking–taking chain in which
they had to press a seeking lever to gain access to
a taking lever, which in turn delivered intravenous
cocaine. To test whether the seeking response was
goal-directed, the taking lever was extinguished by
terminating cocaine delivery. The seeking lever was
not present during this extinction training. The fact
that this extinction training led to an immediate re-
duction in rats’ performance of the seeking response
in extinction indicated that this response was medi-
ated by knowledge of its consequences, that is, the
low current value of the taking lever.
Hutcheson et al.34 employed a similar design.
Training on a seeking–taking chain for heroin was
followed by a revaluation treatment in which self-
administration via the taking response was expe-
rienced in a withdrawal state to establish the high
value of heroin in this state. Rats were then again
given access to the seeking lever in extinction, and
the finding that withdrawal produced an increase
in performance of the seeking response indicated
that it was goal-directed in that it was mediated by
knowledge of the current high value of the heroin
outcome.
The outcome devaluation procedure has also
been modified for humans.35,36 In the concurrent
training stage of these experiments, smokers learned
two key press responses, where R1 produced tobacco
points and R2 produced chocolate points. Tobacco
was then devalued by smoking to satiety, evaluation
of smoking health warnings, for example, “smok-
ing causes cancer,”36 or by administration of nico-
tine nasal spray.35 The finding that tobacco choice
in the extinction test was sensitive to these devaluation
treatments (R1 < R2) indicated that it was
goal-directed in being mediated by knowledge of the
current value of the drug outcome.
A key observation replicated in these human ex-
periments was that individual variation in level of
tobacco dependence was associated with a prefer-
ential selection of the tobacco versus the chocolate
response. Similar preferences have been established
in animals11 and human cocaine users37 and confirm
the economic theorists’ main contention that
drug dependence reflects individual differences in
the reinforcement value of the drug.9 The outcome-devaluation
procedure qualifies this notion by distinguishing
the contribution of goal-directed (R–Oiv)
and habitual (S–R) drug seeking to this drug
preference. We know that choice of the drug seeking
response was goal-directed as it was sensitive
to devaluation in the extinction test. Any resid-
ual contribution of S–R learning to this drug pref-
erence would be marked by variation in sensitiv-
ity to devaluation treatment in the extinction test.
As there was no systematic variation across levels
of nicotine dependence in sensitivity to devalua-
tion, it may be concluded that preferential tobacco
choice across dependence level was mediated en-
tirely by valuation of the drug as a goal, and not by
differential S–R formation. The conclusion, there-
fore, is that drug seeking within these parameters
is goal-directed, and that level of dependence, at
least at this early stage of drug exposure, reflects the
valuation of the drug as a specific goal (see Refs.
38–42).
Habitual drug seeking
As noted, the outcome devaluation procedure can
evaluate the habitual status of instrumental perfor-
mance32 (see Table 1). Whereas sensitivity of drug
seeking to devaluation in the extinction test (R1 < R2)
signifies goal-directed control, insensitivity to
devaluation in the extinction test (R1 = R2) demonstrates
that retrieval of the current value of the drug
plays no role in drug seeking. Instead, drug seeking
is deemed to have become habitual, being elicited by
contextual stimuli that have acquired a direct S–R
association with drug seeking during instrumental
training, without retrieving a representation of cur-
rent value of the drug.
Two studies illustrate the use of the outcome-
devaluation procedure to demonstrate the habitual
status of drug seeking. In the first study, Dickinson
et al.13 trained rats to acquire two instrumental re-
sponses, one for alcohol and one for food pellets,
before one of these outcomes was devalued by pair-
ing it with lithium chloride-induced sickness. When
the rats were again given the opportunity to respond
for these outcomes in extinction, it was found that
performance of the food-seeking response was re-
duced by the devaluation treatment, indicating that
food seeking was goal-directed. By contrast, perfor-
mance of the alcohol-seeking response was insensi-
tive to devaluation, suggesting that alcohol seeking
had become an S–R habit. A second study used a
similar design to confirm that cocaine seeking was
similarly prone to habitual control compared to nat-
ural reward seeking.43
A question arises as to why habitual drug seek-
ing was established by these two procedures,13,43
whereas goal-directed drug seeking was found in the
earlier designs.33–36 In explaining these divergent re-
sults, one might appeal to a number of variables that
have been demonstrated to modulate the balance between
goal-directed and habitual control, including
position of the response within an instrumental se-
quence or chain,44–46 amount of training,47,48 num-
ber of available responses,49 and/or reinforcement
value of the outcome.50 The important point made
by this literature is that goal-directed and habitual
actions exist in a dynamic balance that can be bi-
ased in one direction or the other by conditions of
training or testing that favor acquisition/expression
of the R–O versus S–R association. Our basic argu-
ment is that within this complex system, drugs exert
a constant pressure in favor of the S–R association
by impairing retrieval or utilization of the specific
identity of outcomes.
Corbit et al.51 have recently mapped the progressive
transition to habitual control of drug seeking
with extended training. In this study, rats acquired
a self-administration response for alcohol, before
alcohol was devalued by ad libitum consumption
(satiety). Alcohol seeking was then tested in extinc-
tion to evaluate goal-directed control of this be-
havior. Following two weeks of self-administration
training, the response remained sensitive to deval-
uation, but by eight weeks of training, the response
was insensitive to devaluation, suggesting a transi-
tion from goal-directed to habitual control had oc-
curred with training (cocaine seeking shows a sim-
ilar transition to habit with extended training52).
An important additional finding of this study was
that noncontingent administration of alcohol was
sufficient to accelerate habitual control over natural
reward-seeking responses. Thus, not only do drug
self-administration responses become habitual, but
also noncontingent drug exposure renders contem-
poraneously acquired naturally rewarded instru-
mental actions habitual.
In humans, a comparable effect of noncontingent
alcohol exposure on habitual control has recently
been demonstrated.53 Participants were administered
0.4 g/kg of alcohol or placebo before
instrumental training with R1 and R2 for chocolate
and water, respectively. Chocolate was then devalued
by ad libitum consumption before choice between
R1 and R2 was tested in extinction. The finding
that alcohol attenuated goal-directed control over
chocolate choice in the extinction test supports the
translational relevance of animal models, and sug-
gests that accelerated habit learning can be demon-
strated with acute drug dosing.
A key study by Nelson and Killcross54 re-
vealed that noncontingent drug exposure enhanced
habit formation during instrumental training rather
than at the extinction test. They preexposed rats
to amphetamine for seven days before a seven-
day injection-free period. Instrumental training
for sucrose was then undertaken before this out-
come was devalued by specific satiety or lithium
chloride–induced sickness. The results from the
extinction test indicated that both devaluation
treatments failed to modify sucrose seeking in
the amphetamine-exposed rats, suggesting this re-
sponse had become habitual, whereas placebo rats
showed goal-directed control (see also Refs. 50 and
55). Importantly, chronic amphetamine only accel-
erated habit formation if administered before in-
strumental training, but not if administered after
training. Consistent with this, all the aforemen-
tioned studies that have shown effects of contin-
gent13,43,51 and noncontingent51 drug exposure on
habit learning have undertaken drug administration
contemporaneously with instrumental training.
The implication, therefore, is that during instru-
mental training, the ability of the outcome represen-
tation to enter into new learning may be impaired by
drug exposure, favoring acquisition of the S–R over
the R–Oiv contingencies, but once R–Oiv learning is
acquired drug free, deployment of this knowledge
at test is not impaired by drug exposure.
In reconciling the aforementioned studies, one
can propose a transitional model of addiction
wherein initial drug seeking is goal-directed,33–36
but following extended training comes under ha-
bitual S–R control,13,43 and contemporaneously ac-
quired natural reward seeking also comes under
habitual control.51,53,54 Ultimately, the agent’s be-
havioral repertoire comes to be dominated by S–R
habits.
Specific transfer of stimulus control over drug
seeking
The Pavlovian to instrumental transfer procedure
is the principal method for demonstrating control
over responding by stimuli retrieving a representa-
tion of the specific identity of their paired outcome
(Oi) (e.g., Ref. 56; see Table 2). In this design, rats
are given Pavlovian training in which one stimulus
(S1) signals drug availability (O1), whereas a sec-
ond stimulus (S2) signals the availability of an alter-
native reward, for example sucrose (O2). Separate
instrumental training is then undertaken wherein
rats learn that one lever produces the drug (O1)
whereas the other lever produces sucrose (O2). Finally,
in the Pavlovian to instrumental transfer test,
each stimulus is presented for the first time while
the two instrumental responses are available in extinction.
The question at stake is whether each stimulus
will enhance performance of the response with
which it shares the same outcome (i.e., S1:R1 > R2,
S2:R1 < R2). Such an outcome-specific transfer
effect demonstrates that each stimulus retrieved
a representation of its associated outcome, which
in turn retrieved the response that was associated
with that outcome (S–Oi–R). The effect cannot be
attributed to the formation of an S–R association
because the stimuli and the responses were never
contingently reinforced during training; furthermore,
the transfer test was conducted
in extinction, so no S–R association can form across
that period either.
Table 2. The Pavlovian to instrumental transfer procedure
used to demonstrate transfer of stimulus control
over action selection. The agent learns that two stimuli
(S1 and S2) predict different rewarding outcomes (O1
and O2). They separately learn that two responses (R1
and R2) differently earn those same rewarding outcomes
(O1 and O2). In the transfer test, the stimuli (S1 and S2)
are presented whilst the responses (R1 and R2) are available.
A specific transfer effect is demonstrated when each
stimulus selectively enhances responding for the same
outcome (S1:R1 > R2, S2:R1 < R2). This specific transfer
effect suggests that each stimulus elicited a representation
of the identity of its associated outcome (Oi), which
in turn elicited its associated response (S–Oi–R). By contrast,
a general transfer effect is demonstrated when each
stimulus enhances both responses equally above a pre-
or no-stimulus baseline (S1/S2: > R1/R2). This general
transfer effect suggests that each stimulus elicited a representation
of the value (Ov) but not identity (Oi) of its
associated outcome which produced a general enhancement
of responding ([S–Ov]–R).
Pavlovian training    Instrumental training    Extinction test
S1–O1                 R1–O1                    S1:R1/R2
S2–O2                 R2–O2                    S2:R1/R2
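The distinction between specific and general transfer in the test phase can be sketched as a comparison of three response rates. The sketch below is illustrative only: the function name, argument names, and the absolute tolerance `tol` are assumptions made here, and real analyses compare conditions statistically rather than against a fixed tolerance.

```python
def classify_transfer(same_rate: float, different_rate: float,
                      baseline_rate: float, tol: float = 1.0) -> str:
    """Label the effect of one stimulus in a transfer test.

    same_rate      -- rate of the response that shares the stimulus's outcome
    different_rate -- rate of the response trained with the other outcome
    baseline_rate  -- rate of responding without the stimulus (per response)
    tol            -- hypothetical tolerance, in the same units as the rates
    """
    selectivity = same_rate - different_rate
    elevation = (same_rate + different_rate) / 2 - baseline_rate

    if selectivity > tol:
        # Stimulus biases choice toward its own outcome (S-Oi-R).
        return "specific"
    if abs(selectivity) <= tol and elevation > tol:
        # Both responses raised about equally above baseline ([S-Ov]-R).
        return "general"
    return "none"


if __name__ == "__main__":
    # Hypothetical rates (responses per minute) for a single stimulus.
    print(classify_transfer(same_rate=12, different_rate=6, baseline_rate=6))
    print(classify_transfer(same_rate=10, different_rate=10, baseline_rate=6))
    print(classify_transfer(same_rate=6, different_rate=6, baseline_rate=6))
```

For stimulus S1, `same_rate` corresponds to R1 and `different_rate` to R2; for S2 the roles are swapped. Separating selectivity from overall elevation mirrors the way the two transfer effects are defined in the procedure.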
There is currently only one demonstration of
outcome-specific transfer of stimulus control over
drug seeking per se57 (although there are many
demonstrations in natural reward learning58). In
this study, smokers first learned that two arbitrary
stimuli (S1 and S2) predicted tobacco points or
money, respectively, before learning that two re-
sponses (R1 and R2) earned tobacco points and
money, respectively. In the transfer test, the two
stimuli were found to selectively enhance perfor-
mance of the response that had earned the same
outcome. Thus, each stimulus must have retrieved
a representation of its associated outcome (points),
which in turn elicited the response that had pro-
duced that outcome (S–Oi–R).
General transfer of stimulus control over drug
seeking
By contrast, in a related animal procedure, Cor-
bit and Janak59 paired S1 and S2 with ethanol or
sucrose, respectively, and then trained R1 and R2
with these same outcomes, respectively. The results
showed that the ethanol stimulus enhanced the rate
of both R1 and R2 equally above a no-stimulus base-
line, indicating that this stimulus exerted a general
excitatory effect on instrumental reward seeking by
retrieving the value (S–Ov) rather than identity (S–
Oi) of the outcome. By contrast, the sucrose stimu-
lus produced a specific transfer effect, selectively en-
hancing instrumental responding for sucrose over
ethanol, indicating that it had retrieved the out-
come’s identity (S–Oi). These data are consistent
with the view that drug-associated cues favor gen-
eral facilitatory effects on appetitive instrumental
responses compared to natural reward-paired cues
(see also Refs. 60 and 61).
The divergent results of these human and an-
imal transfer studies may be resolved by ap-
pealing to Konorski’s view25 that outcomes are
encoded separately in terms of their perceptual iden-
tity (sensory correlates) and consummatory or in-
centive value.26 On this view, the tobacco points
outcome used by Hogarth et al.57 was largely per-
ceptual and minimally consummatory, and so the
stimulus paired with this outcome favored a specific
transfer effect that relied on the retrieval of this out-
come’s perceptual identity (S–Oi–R). By contrast
the ethanol consummatory outcome employed by
Corbit and Janak59 possessed a substantial pharma-
cological/consummatory effect, and so the stimulus
paired with this event favored a general motivational
enhancement based upon retrieval of the outcome
value ([S–Ov]–R).
Other studies substantiate this characterization of
the specific and general forms of stimulus control.62
First, the magnitude of the specific transfer effect is
determined by the reliability of the S–O contingency
in training,63–65 but is insensitive to outcome deval-
uation.31,66–68 Importantly, specific transfer effects
by drug cues on drug seeking in humans are similarly
insensitive to devaluation achieved by drug sati-
ety, health warnings,36,69 and pharmacotherapy.35
Moreover, the finding that drug cue effects on sub-
jective craving70,71 and drug taking69 are similarly
autonomous of devaluation by satiety and pharma-
cotherapy, supports the validity of specific (S–Oi–
R) transfer effects in addiction. By contrast, general
transfer effects are modulated by devaluation of the
outcome,31,72 and cross over to other reinforcers of
the same hedonic category.73 Thus, general transfer
effects are deemed to be mediated by the stimu-
lus retrieving a representation of the current value
(Ov) but not identity (Oi) of the outcome, and as
a consequence, the effect is sensitive to changes in
motivational state but is not response selective ([S–
Ov]–R).
Not only does contingent drug exposure cause
drug cues to favor general over specific transfer,59
but noncontingent drug exposure may also cause
natural reward cues to undergo this same transi-
tion. In a recent study, Shiflett et al.74 found that
noncontingent exposure to chronic amphetamine
administered following Pavlovian and instrumen-
tal training (i.e., before the transfer test) abolished
the specific transfer effect and enhanced the general
transfer effect. Specifically, rats received Pavlovian
training in which S1 predicted chocolate and S2
predicted grain. Instrumental training was then un-
dertaken in which two responses, R1 and R2, earned
these same outcomes, respectively. Then, half of the rats
were given seven days of amphetamine administra-
tions and the remainder placebo (similar to Ref.
54). Finally, in the transfer test, the two stimuli
were tested for a specific transfer effect in which
each stimulus selectively enhanced responding for
the same outcome, or a general transfer effect in
which each stimulus enhanced responding for both
outcomes equally above a prestimulus baseline. The
remarkable finding was that amphetamine expo-
sure before test abolished the specific transfer effect
and enhanced the general transfer effect. A simi-
lar enhancement of the general transfer effect pro-
duced by natural reward cues on reward seeking has
been found following acute75 and chronic76 am-
phetamine administered before the testing phase,
although in these latter studies no attempt was made
to assess specific transfer.
Overall, these studies favor a transitional model
wherein early in training, drug cues retrieve the
drug’s identity and thus produce specific trans-
fer effects.35,36,57 Extended drug exposure, how-
ever, causes stimuli to lose contact with the drug’s
identity and instead make contact with the drug’s
value, thus causing a transition from specific to
general transfer.59–61 Moreover, contemporaneously
acquired natural reward cues also shift contact
from their outcome’s identity to its value, caus-
ing a comparable transition from specific to general
transfer.61,74–76
Synthesis of psychological studies
The transition to behavioral autonomy depicted
across the studies reported here is consistent with
a singular impairment in the capacity to retrieve or
utilize the specific identity of outcomes as a conse-
quence of drug exposure. This impairment can ex-
plain why drug seeking is initially goal-directed (R–
Oiv) and under specific stimulus control (S–Oi–R),
but then becomes habitual (S–R) and under general
stimulus control ([S–Ov]–R). Whereas the former
two controllers require a representation of the spe-
cific identity of the drug, the latter two controllers
do not. Moreover, the finding of the same transition
in natural reward-seeking responses acquired con-
temporaneously with drug exposure suggests that
the impairment in capacity to represent the specific
outcomes applies to the entire class of appetitive
rewards (it remains to be seen whether aversive out-
comes are similarly affected). Finally, this account
suggests that stress,77 trait impulsivity,78,79 con-
flict,80,81 hypofrontality,82–84 and schizophrenia85,86
may be linked with drug dependence and relapse be-
cause they exacerbate this impairment in capacity to
represent/utilize specific outcome identities.
The claim that a single impairment underpins
both the loss of goal-directed control and the loss of
specific transfer is challenged by a dissociation be-
tween these two effects. Specifically, goal-directed
control was abolished by chronic amphetamine ad-
ministered before training but not administered
before test, suggesting that chronic amphetamine
impairs acquisition of response-outcome knowl-
edge during instrumental training but does not
directly impair the retrieval/utilization of outcome
identity required for goal-directed control at test.54
By contrast, chronic amphetamine abolished spe-
cific transfer when administered before test,74 sug-
gesting that chronic amphetamine can directly abol-
ish the retrieval/utilization of outcome identity re-
quired for the specific transfer effect. Identifying a
common learning mechanism that operates during
both instrumental training and the transfer test to
produce the observed transition to behavioral au-
tonomy is arguably crucial for isolating the core
associative pathology in addiction.
Neural basis of action control
The following section reviews animal studies that
have examined the neural basis of the four con-
trollers underpinning natural reward and drug seek-
ing. The purpose of this section is to identify sub-
strates upon which chronic drug exposure might
act to produce the transition to autonomy depicted
earlier, that is, reduce goal-directed learning and
specific transfer, and/or enhance habit learning and
general transfer.
Neural basis of goal-directed action
Lesions of the prelimbic (PL) region of the pre-
frontal cortex have been shown to produce precisely
the same deficit in goal-directed control as chronic
amphetamine.54 That is, lesions of the PL abolish
goal-directed control of natural reward seeking if
they occur before instrumental training48,87,88 but
not if they occur before test.89 Comparable effects
have been found following lesions of the mediodor-
sal thalamus, which also abolish acquisition90 but
not expression91 of goal-directed action. As the
mediodorsal thalamus provides the major thalamic
input to the PL, it is believed that these two regions
form a functional circuit. The correspondence between the effects of
PL lesions, mediodorsal thalamic lesions, and chronic amphetamine
exposure on the acquisition of goal-directed control implicates
these two brain regions in mediating the effect of drug exposure on
the transition to behavioral autonomy.
By contrast, the dorsomedial striatum (DMS)
has been shown to be essential for both the ac-
quisition92,93 and the expression94 of goal-directed
learning. Importantly, posttraining DMS inactiva-
tion has been shown to abolish goal-directed con-
trol of alcohol seeking, suggesting common neural
mechanisms underpinning both natural and drug
reward goal-directed learning.51 In addition, lesions
of the basolateral amygdala (BLA) abolish sensitivity
to outcome devaluation whether given before95,96
or after instrumental training.91,97 Thus the DMS
and BLA, in failing to mimic the selective effect
of amphetamine on loss of goal-directed learning
at acquisition, may not play a direct role in drug
sensitization-induced transition to behavioral au-
tonomy.
Neural basis of habitual action
Habitual action, by contrast, is mediated by the dor-
solateral striatum (DLS) and infralimbic cortex. As
noted earlier, overtraining instrumental contingen-
cies favors a transition from R–Oiv to S–R con-
trol, that is, progressive loss of sensitivity to de-
valuation in the extinction test.47 However, rats
with lesions to the DLS either pre- or posttrain-
ing fail to develop habitual control and remain
sensitive to devaluation irrespective of training,
indicating that the DLS is required for the acquisition
and expression of habit learning.51,98,99 Importantly,
posttraining inactivation of the DLS has also
been shown to abolish expression of habitual co-
caine seeking52 and alcohol seeking51 following ex-
tended training, rendering these behaviors once
again goal-directed, and confirming the common
neural mechanisms underpinning both natural and
drug reward habit learning. In addition, pretraining
functional disconnection of DLS and the amygdala
central nucleus (CN) has also recently been shown to
abolish habitual control of action and restore goal-
directed control.29 Finally, lesions of the infralim-
bic cortex made before instrumental training abol-
ish the transition to habit following overtraining.48
Thus, chronic drug exposure might act on any of
these regions to promote the dominance of S–R
habits.
Neural basis of specific transfer of stimulus control
The ability of stimuli to transfer selective control
over separately trained instrumental responding
for the same outcome is abolished by pretrain-
ing62 and posttraining lesions of the orbitofrontal
cortex (OFC),100 by pretraining lesions and post-
training inactivation of the nucleus accumbens
(NAC) shell,101,102 by pretraining96,103,104 and post-
training91 lesions of the BLA, and pretrain-
ing functional disconnection between these two
structures.105 In addition, posttraining inactiva-
tion of the DMS106 and posttraining lesions of the
mediodorsal thalamus91 also eliminate the selec-
tive transfer effect. Thus, to impair specific trans-
fer, chronic drug exposure may act on any of these
structures.
Neural basis of general transfer of stimulus control
The ability of conditioned stimuli to produce a gen-
eral excitatory effect on separately trained instru-
mental responses is abolished by posttraining inac-
tivation of the DLS,106 posttraining inactivation of
the ventral tegmental area (VTA),31,107 pre- or posttraining
inactivation of the NAC core,102,108 and pretraining
lesions of the amygdala CN.96 Thus, chronic
drug exposure may influence any of these regions to
enhance general transfer effects.
Synthesis of behavioral neuroscience studies
PL and mediodorsal thalamic lesions show a striking
correspondence with chronic drug exposure
in producing behavioral autonomy. Specifically,
these lesions abolish acquisition but not expression
of goal-directed control,48,87–91 which matches ex-
actly the effect of chronic amphetamine54 (see also
Ref. 51 for related effect with alcohol). However, le-
sions to the PL do not modify the specific transfer
effect,88 but posttraining lesions of the mediodorsal
thalamus do,91 matching the impact of chronic am-
phetamine.74 Thus, lesions of the mediodorsal tha-
lamus produce precisely the same effect as chronic
drug exposure. It is also noteworthy that lesions
of the OFC abolish specific transfer,62,100 but not
outcome devaluation,100 indicating that damage at
this region alone could not produce the exact pat-
tern of chronic drug exposure. Thus, although the
effect of chronic drug exposure on transition to be-
havioral autonomy could be produced by a combi-
nation of PL and OFC damage—a view strength-
ened by the observation of hypofunction in these
regions in addicts109,110—damage to the mediodor-
sal thalamus alone could impair both forms of
behavior control, and so has the advantage of
parsimony.
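The parsimony argument above can be laid out as a simple set comparison. The sketch below is illustrative only: the labels and the helper function are hypothetical, the profiles encode just the effects summarized in this section, and it assumes that the effects of combined damage are the union of the effects of each lesion alone (an additivity assumption that the text does not test). The OFC is coded as sparing goal-directed control at both acquisition and expression, which is an interpretation of the statement that OFC lesions do not abolish outcome devaluation.

```python
from itertools import combinations

# Effects abolished by each manipulation, as summarized in the text.
PROFILES = {
    "chronic_amphetamine": {"gd_acquisition", "specific_transfer"},
    "PL": {"gd_acquisition"},
    "MD_thalamus": {"gd_acquisition", "specific_transfer"},
    "OFC": {"specific_transfer"},
}


def minimal_matching_lesions(target, max_size=2):
    """Return the smallest lesion sets whose combined profile equals the target's."""
    lesions = sorted(name for name in PROFILES if name != target)
    matches = []
    for size in range(1, max_size + 1):
        for combo in combinations(lesions, size):
            # Skip supersets of a smaller set that already matches.
            if any(set(found) <= set(combo) for found in matches):
                continue
            # Assumption: combined damage abolishes the union of effects.
            combined = set().union(*(PROFILES[region] for region in combo))
            if combined == PROFILES[target]:
                matches.append(combo)
    return matches


candidates = minimal_matching_lesions("chronic_amphetamine")
```

With these profiles, the single-structure candidate is the mediodorsal thalamus and the two-structure candidate is PL plus OFC, mirroring the two accounts weighed in the paragraph above.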
Conclusions
To conclude, we propose that initial drug seeking
is goal-directed, tracks the anticipated value of the
drug (R–Oiv), and is responsive to specific transfer
(S–Oi–R) effects by drug cues. Chronic drug expo-
sure, however, impairs the capacity to retrieve or
utilize the specific identity of outcomes, and
so produces a transition of behavioral control from
goal-directed learning (R–Oiv) and specific transfer
(S–Oi–R) to habit (S–R) and general transfer ([S–
Ov]–R). This transition occurs in relation to both
drug outcomes and natural reward outcomes, resulting
in a narrowing of the addict’s behavioral repertoire
to general cue excitation of dominant S–R drug
habits, with restricted capacity for intentional selec-
tion of alternative actions. This associative frame-
work captures the cardinal diagnostic characteristics
of heightened drug reinforcement, loss of willed reg-
ulation of drug seeking, and restricted engagement
with alternative activities. Future research needs to
clarify precisely how this transition to autonomy is
accelerated by drugs of abuse compared to natural
rewards, whether by differences in reward value,50
kinetics,111 neuroadaptations,112,113 or neurotoxicity,114
and precisely how this alters the balance between
corticostriatal circuits underpinning the four
controllers.48,115
There are several implications concerning treatment
strategy. Consistent with Tiffany’s7 insight, we
have argued that goal-directed action and habit ex-
ist in a dynamic balance which may be additive,32
competitive,46 or hierarchical,45 but switching be-
tween the two modes apparently can occur within
the span of a single response sequence and/or lon-
gitudinally with training. If addiction does reflect
a progressive weakening of the role of outcome
retrieval/utilization in the execution of action se-
quences, allowing drug habits to dominate, then
treatments such as expectancy challenge116 and extinction
training,117 which arguably work by changing
the specific representation of the drug, may not
provide the optimal strategy. Instead, treatments
that enhance the capacity to engage representations
of the future such as working memory training118
combined with provisions of alternative reward con-
tingencies119 may be more efficacious in redirect-
ing addicts from their established habits. Moreover,
the finding that the capacity for goal-directed control can be
reinstated by manipulations of brain function48,51,52
and by uncertainty, which has definable neural sub-
strates,46 suggests that neuropharmacology could
complement such learning approaches to install new
intentional action choices.
Acknowledgments
This work was supported by MRC Grant #G0701456
to L.H., NHMRC Grant #633268 to B.B. and S.K.,
and NHMRC Grant #568872 to S.K. and B.B.
Conflicts of interest
The authors declare no conflicts of interest.
References
1. Wikler, A. 1984. Conditioning factors in opiate addiction
and relapse. J. Subst. Abuse Treat. 1: 279–285.
2. Siegel, S. 1989. Pharmacological conditioning and drug effects.
In Psychoactive Drugs. Tolerance and Sensitisation. A.
Goudie & M. Emmett-Oglesby, Eds.: 115–180. Humana
Press. Clifton, New Jersey.
3. Solomon, R.L. & J.D. Corbit. 1974. An opponent-process
theory of motivation: I. Temporal dynamics of affect. Psy-
chol. Rev. 81: 119–145.
4. Stewart, J., H. de Wit & R. Eikelboom. 1984. Role
of conditioned and unconditioned drug effects in self-
administration of opiates and stimulants. Psychol. Rev. 63:
251–268.
5. Bickel, W.K. et al. 1991. Behavioral economics of drug self-
administration: II. A unit-price analysis of cigarette smok-
ing. J. Exp. Anal. Behav. 55: 145–154.
6. Koob, G.F. & M. Le Moal. 2001. Drug addiction, dysregu-
lation of reward, and allostasis. Neuropsychopharmacology
24: 97–129.
7. Tiffany, S.T. 1990. A cognitive model of drug urges and
drug-use behavior: role of automatic and nonautomatic
processes. Psychol. Rev. 97: 147–168.
8. Robinson, T.E. & K.C. Berridge. 1993. The neural basis
of drug craving: an incentive-sensitization theory of drug
addiction. Brain Res. Rev. 18: 247–291.
9. Heyman, G.M. 2009. Addiction: A Disorder of Choice. Harvard
University Press. Cambridge, MA.
10. MacKillop, J. et al. 2011. Delayed reward discounting and
addictive behavior: a meta-analysis. Psychopharmacology
216: 305–321.
11. Ahmed, S.H. 2010. Validation crisis in animal models of
drug addiction: beyond non-disordered drug use toward
drug addiction. Neurosci. Biobehav. Rev. 35: 172–184.
12. Ahmed, S.H. & G.F. Koob. 1998. Transition from moderate
to excessive drug intake: change in hedonic set point. Science
282: 298–300.
13. Dickinson, A., N. Wood & J.W. Smith. 2002. Alcohol seeking
by rats: action or habit? Q. J. Exp. Psychol. B. 55: 331–348.
14. Drevets, W.C. et al. 2001. Amphetamine-induced dopamine
release in human ventral striatum correlates with euphoria.
Biol. Psychiatry 49: 81–96.
15. Volkow, N.D., J.S. Fowler & G.J. Wang. 2004. The addicted
human brain viewed in the light of imaging studies: brain
circuits and treatment strategies. Neuropharmacology 47:
3–13.
16. White, N.M. 1989. A functional hypothesis concerning the
striatal matrix and patches: mediation of S-R memory and
reward. Life Sci. 45: 1943–1957.
17. White, N.M. 1996. Addictive drugs as reinforcers: multiple
partial actions on memory systems. Addiction 91: 921–950.
18. Altman, J. et al. 1996. The biological, social and clinical
bases of drug addiction: commentary and debate. Psy-
chopharmacology 125: 285–345.
19. Robbins, T.W. & B.J. Everitt. 1999. Drug addiction: bad
habits add up. Nature 398: 567–570.
20. Everitt, B.J., A. Dickinson & T.W. Robbins. 2001. The neu-
ropsychological basis of addictive behavior. Brain Res. Rev.
36: 129–138.
21. Everitt, B.J. & T.W. Robbins. 2005. Neural systems of re-
inforcement for drug addiction: from actions to habits to
compulsion. Nat. Neurosci. 8: 1481–1489.
22. Balleine, B.W. & S.B. Ostlund. 2007. Still at the choice-point:
action selection and initiation in instrumental conditioning.
Ann. N.Y. Acad. Sci. 1104: 147–171.
23. de Wit, S. & A. Dickinson. 2009. Associative theories of goal-directed
behavior: a case for animal–human translational
models. Psychol. Res. 73: 463–476.
24. Killcross, S. & P. Blundell. 2002. Associative representations
of emotionally significant outcomes. In Emotional Cogni-
tion: From Brain to Behavior. S.C.M.M. Oaksford, Ed.: 35–
73. John Benjamins Publishing Company. Amsterdam, the
Netherlands.
25. Konorski, J. 1967. Integrative Activity of the Brain. University
of Chicago Press. Chicago.
26. Balleine, B.W. & S. Killcross. 2006. Parallel incentive processing:
an integrated view of amygdala function. Trends
Neurosci. 29: 272–279.
27. Vlaev, I. et al. 2011. Does the brain calculate value? Trends
Cogn. Sci. 15: 546–554.
28. Hursh, S.R. & A. Silberberg. 2008. Economic demand and
essential value. Psychol. Rev. 115: 186–198.
29. Lingawi, N.W. & B.W. Balleine. 2012. Amygdala central
nucleus interacts with dorsolateral striatum to regulate the
acquisition of habits. J. Neurosci. 32: 1073–1081.
30. Hommel, B. 2013. Ideomotor action control: On the per-
ceptual grounding of voluntary actions and agents. In W.
Prinz, M. Beisert & A. Herwig (Eds.), Action science: Foun-
dations of an emerging discipline (pp. 113–136). Cambridge,
MA: MIT Press.
31. Corbit, L.H., P.H. Janak & B.W. Balleine. 2007. General and
outcome-specific forms of Pavlovian-instrumental transfer:
the effect of shifts in motivational state and inactivation
of the ventral tegmental area. Eur. J. Neurosci. 26: 3141–
3149.
32. Dickinson, A. 1985. Actions and habits—the development
of behavioral autonomy. Philos. Trans. R. Soc. Lond. Ser. B
Biol. Sci. 308: 67–78.
33. Olmstead, M.C. et al. 2001. Cocaine seeking by rats is a
goal-directed action. Behav. Neurosci. 115: 394–402.
34. Hutcheson, D.M. et al. 2001. The role of withdrawal in
heroin addiction: enhances reward or promotes avoidance?
Nat. Neurosci. 4: 943–947.
35. Hogarth, L. 2012. Goal-directed and transfer-cue-elicited
drug-seeking are dissociated by pharmacotherapy: Evi-
dence for independent additive controllers. J. Exp. Psychol.
Anim. Behav. Process. 38: 266–278.
36. Hogarth, L. & H.W. Chase. 2011. Parallel goal-directed and
habitual control of human drug seeking: implications for
dependence vulnerability. J. Exp. Psychol. Anim. Behav. Pro-
cess. 37: 261–276.
37. Moeller, S.J. et al. 2009. Enhanced choice for viewing co-
caine pictures in cocaine addiction. Biol. Psychiatry. 66:
169–176.
38. Fergusson, D.M. et al. 2003. Early reactions to cannabis
predict later dependence. Arch. Gen. Psychiatry 60: 1033–1039.
39. de Wit, H., E.H. Uhlenhuth & C.E. Johanson. 1986. Indi-
vidual differences in the reinforcing and subjective effects
of amphetamine and diazepam. Drug Alcohol Depend. 16:
341–360.
40. Scherrer, J.F. et al. 2009. Subjective effects to cannabis are
associated with use, abuse and dependence after adjust-
ing for genetic and environmental influences. Drug Alcohol
Depend. 105: 76–82.
41. Stoops, W.W. et al. 2007. The reinforcing, subject-rated,
performance, and cardiovascular effects of d-amphetamine:
influence of sensation-seeking status. Addict. Behav. 32:
1177–1188.
42. Pomerleau, O. 1995. Individual differences in sensitivity
to nicotine: implications for genetic research on nicotine
dependence. Behav. Genet. 25: 161–177.
43. Miles, F.J., B.J. Everitt & A. Dickinson. 2003. Oral cocaine
seeking by rats: action or habit? Behav. Neurosci. 117: 927–
938.
44. Balleine, B.W. et al. 1995. Motivational control of hetero-
geneous instrumental chains. J. Exp. Psychol. Anim. Behav.
Process. 21: 203–217.
45. Dezfouli, A. & B.W. Balleine. 2012. Habits, action sequences
and reinforcement learning. Eur. J. Neurosci. 35: 1036–1051.
46. Daw, N.D., Y. Niv & P. Dayan. 2005. Uncertainty-based
competition between prefrontal and dorsolateral striatal
systems for behavioral control. Nat. Neurosci. 8: 1704–1711.
47. Dickinson, A. et al. 1995. Motivational control after extended
instrumental training. Anim. Learn. Behav. 23: 197–206.
48. Killcross, S. & E. Coutureau. 2003. Coordination of actions
and habits in the medial prefrontal cortex of rats. Cereb.
Cortex. 13: 400–408.
49. Kosaki, Y. & A. Dickinson. 2010. Choice and contingency
in the development of behavioral autonomy during instru-
mental conditioning. J. Exp. Psychol. Anim. Behav. Process.
36: 334–342.
50. Nordquist, R.E. et al. 2007. Augmented reinforcer value
and accelerated habit formation after repeated amphetamine
treatment. Eur. Neuropsychopharmacol. 17: 532–540.
51. Corbit, L.H., H. Nie & P.H. Janak. Habitual alcohol seeking:
time course and the contribution of subregions of the dorsal
striatum. Biol. Psychiatry 72: 389–395. In press.
52. Zapata, A., V.L. Minney & T.S. Shippenberg. 2010. Shift
from goal-directed to habitual cocaine seeking after pro-
longed experience in rats. J. Neurosci. 30: 15457–15463.
53. Hogarth, L. et al. 2012. Acute alcohol impairs human goal-
directed action. Biol. Psychol. 90: 154–160.
54. Nelson, A. & S. Killcross. 2006. Amphetamine exposure
enhances habit formation. J. Neurosci. 26: 3805–3812.
55. Schoenbaum, G. & B. Setlow. 2005. Cocaine makes actions
insensitive to outcomes but not extinction: implications for
altered orbitofrontal-amygdalar function. Cereb. Cortex. 15:
1162–1169.
56. Colwill, R.M. & R.A. Rescorla. 1988. Associations between
the discriminative stimulus and the reinforcer in instru-
mental learning. J. Exp. Psychol. Anim. Behav. Process. 14:
155–164.
57. Hogarth, L. et al. 2007. The role of drug expectancy in the
control of human drug seeking. J. Exp. Psychol. Anim. Behav.
Process. 33: 484–496.
58. Holmes, N.M., A.R. Marchand & E. Coutureau. 2010.
Pavlovian to instrumental transfer: a neurobehavioral per-
spective. Neurosci. Biobehav. Rev. 34: 1277–1295.
59. Corbit, L.H. & P.H. Janak. 2007. Ethanol-associated cues
produce general Pavlovian-instrumental transfer. Alcohol.
Clin. Exp. Res. 31: 766–774.
60. Krank, M.D. 2003. Pavlovian conditioning with ethanol:
sign-tracking (autoshaping), conditioned incentive, and
ethanol self-administration. Alcohol. Clin. Exp. Res. 27:
1592–1598.
61. Glasner, S.V., J.B. Overmier & B.W. Balleine. 2005. The
role of Pavlovian cues in alcohol seeking in dependent and
nondependent rats. J. Stud. Alcohol. 66: 53–61.
62. Balleine, B.W., B.K. Leung & S.B. Ostlund. 2011. The or-
bitofrontal cortex, predicted value, and choice. Ann. N. Y.
Acad. Sci. 1239: 43–50.
63. Delamater, A.R. 1995. Outcome-selective effects of inter-
trial reinforcement in Pavlovian appetitive conditioning
with rats. Anim. Learn. Behav. 23: 31–39.
64. Gámez, A.M. & J.M. Rosas. 2005. Transfer of stimulus control
across instrumental responses is attenuated by extinction
in human instrumental conditioning. Int. J. Psychol.
Psychol. Ther. 5: 207–222.
65. Trick, L., L. Hogarth & T. Duka. 2011. Prediction and
uncertainty in human Pavlovian to instrumental transfer.
J. Exp. Psychol. Learn. Mem. Cogn. 37: 757–765.
66. Rescorla, R.A. 1994. Transfer of instrumental control mediated
by a devalued outcome. Anim. Learn. Behav. 22: 27–33.
67. Holland, P.C. 2004. Relations between Pavlovian-
instrumental transfer and reinforcer devaluation. J. Exp.
Psychol. Anim. Behav. Process. 30: 258–258.
68. Colwill, R.M. & R.A. Rescorla. 1990. Effects of reinforcer
devaluation on discriminative control of instrumental be-
havior. J. Exp. Psychol. Anim. Behav. Process. 16: 40–47.
69. Hogarth, L., A. Dickinson & T. Duka. 2010. The associative
basis of cue elicited drug taking in humans. Psychopharma-
cology 208: 337–351.
70. Ferguson, S.G. & S. Shiffman. 2009. The relevance and
treatment of cue-induced cravings in tobacco dependence.
J. Subst. Abuse Treat. 36: 235–243.
71. Hitsman, B. et al. Dissociable effect of acute varenicline
on tonic versus cue-provoked craving in non-treatment
motivated heavy smokers. Drug Alcohol Depend. In press.
72. Dickinson, A. & G.R. Dawson. 1987. Pavlovian processes
in the motivational control of instrumental performance.
Q. J. Exp. Psychol. B 39: 201–213.
73. Mitchell, J.B. & J. Stewart. 1990. Facilitation of sexual be-
haviors in the male rat in the presence of stimuli previously
paired with systemic injections of morphine. Pharmacol.
Biochem. Behav. 35: 367–372.
74. Shiflett, M. 2012. The effects of amphetamine exposure on
outcome-selective Pavlovian-instrumental transfer in rats.
Psychopharmacology 223: 361–370.
75. Wyvell, C.L. & K.C. Berridge. 2000. Intra-accumbens am-
phetamine increases the conditioned incentive salience of
sucrose reward: enhancement of reward “wanting” without
enhanced “liking” or response reinforcement. J. Neurosci.
20: 8122–8130.
76. Wyvell, C.L. & K.C. Berridge. 2001. Incentive sensitiza-
tion by previous amphetamine exposure: increased cue-
triggered “wanting” for sucrose reward. J. Neurosci. 21:
7831–7840.
77. Schwabe, L., A. Dickinson & O.T. Wolf. 2011. Stress, habits
and drug addiction: a psychoneuroendocrinological per-
spective. Exp. Clin. Psychopharmacol. 19: 53–63.
78. Hogarth, L. 2011. The role of impulsivity in the aetiology
of drug dependence: reward sensitivity versus automaticity.
Psychopharmacology 215: 567–580.
79. Hogarth, L., H.W. Chase & K. Baess. 2012. Impaired goal-
directed behavioral control in human impulsivity. Q. J. Exp.
Psychol. 65: 305–316.
80. de Wit, S. et al. 2006. Dorsomedial prefrontal cortex resolves
response conflict in rats. J. Neurosci. 26: 5224–5229.
81. Ostlund, S.B., N.T. Maidment & B.W. Balleine. 2010.
Alcohol-paired contextual cues produce an immediate and
selective loss of goal-directed action in rats. Front. Integr.
Neurosci. 4: 19.
82. Gillan, C.M. et al. 2011. Disruption in the balance between
goal-directed behavior and habit learning in obsessive-compulsive
disorder. Am. J. Psychiatry 168: 718–726.
83. Valentin, V., A. Dickinson & J.P. O’Doherty. 2007. Deter-
mining the neural substrates of goal-directed learning in
the human brain. J. Neurosci. 27: 4019–4026.
84. Klossek, U.M., J. Russell & A. Dickinson. 2008. The control
of instrumental action following outcome devaluation in
young children aged between 1 and 4 years. J. Exp. Psychol.
Gen. 137: 39–51.
85. Haddon, J.E. et al. 2010. Impaired conditional task perfor-
mance in a high schizotypy population: relation to cognitive
deficits. Q. J. Exp. Psychol. 64: 1–9.
86. Barch, D.M. & A. Ceaser. 2012. Cognition in schizophrenia:
core psychological and neural mechanisms. Trends Cogn.
Sci. 16: 27–34.
87. Balleine, B.W. & A. Dickinson. 1998. Goal-directed in-
strumental action: contingency and incentive learning
and their cortical substrates. Neuropharmacology 37: 407–419.
88. Corbit, L.H. & B.W. Balleine. 2003. The role of prelimbic
cortex in instrumental conditioning. Behav. Brain Res. 146:
145–157.
89. Ostlund, S.B. & B.W. Balleine. 2005. Lesions of medial pre-
frontal cortex disrupt the acquisition but not the expression
of goal-directed learning. J. Neurosci. 25: 7763–7770.
90. Corbit, L.H., J.L. Muir & B.W. Balleine. 2003. Lesions of
mediodorsal thalamus and anterior thalamic nuclei pro-
duce dissociable effects on instrumental conditioning in
rats. Eur. J. Neurosci. 18: 1286–1294.
91. Ostlund, S.B. & B.W. Balleine. 2008. Differential involve-
ment of the basolateral amygdala and mediodorsal thala-
mus in instrumental action selection. J. Neurosci. 28: 4398–
4405.
92. Yin, H.H., B.J. Knowlton & B.W. Balleine. 2005. Blockade
of NMDA receptors in the dorsomedial striatum prevents
action-outcome learning in instrumental conditioning. Eur.
J. Neurosci. 22: 505–512.
93. Corbit, L.H. & P.H. Janak. 2010. Posterior dorsomedial
striatum is critical for both selective instrumental and
Pavlovian reward learning. Eur. J. Neurosci. 31: 1312–1321.
94. Yin, H.H. et al. 2005. The role of the dorsomedial striatum
in instrumental conditioning. Eur. J. Neurosci. 22: 513–523.
95. Balleine, B.W., A.S. Killcross & A. Dickinson. 2003. The
effect of lesions of the basolateral amygdala on instrumental
conditioning. J. Neurosci. 23: 666–675.
96. Corbit, L.H. & B.W. Balleine. 2005. Double dissociation of
basolateral and central amygdala lesions on the general and
outcome-specific forms of pavlovian-instrumental transfer.
J. Neurosci. 25: 962–970.
97. Johnson, A.W., M. Gallagher & P.C. Holland. 2009. The ba-
solateral amygdala is critical to the expression of Pavlovian
and instrumental outcome-specific reinforcer devaluation
effects. J. Neurosci. 29: 696–704.
98. Yin, H.H., B.J. Knowlton & B.W. Balleine. 2004. Lesions
of dorsolateral striatum preserve outcome expectancy but
disrupt habit formation in instrumental learning. Eur. J.
Neurosci. 19: 181–189.
99. Yin, H.H., B.J. Knowlton & B.W. Balleine. 2006. Inactiva-
tion of dorsolateral striatum enhances sensitivity to changes
in the action-outcome contingency in instrumental condi-
tioning. Behav. Brain Res. 166: 189–196.
100. Ostlund, S.B. & B.W. Balleine. 2007. Orbitofrontal cortex
mediates outcome encoding in pavlovian but not instru-
mental conditioning. J. Neurosci. 27: 4819–4825.
101. Corbit, L.H., J.L. Muir & B.W. Balleine. 2001. The role of the
nucleus accumbens in instrumental conditioning: evidence
of a functional dissociation between accumbens core and
shell. J. Neurosci. 21: 3251–3260.
102. Corbit, L. & B. Balleine. 2011. The general and outcome-
specific forms of Pavlovian-Instrumental transfer are dif-
ferentially mediated by the nucleus accumbens core and
shell. J. Neurosci. 31: 11786–11794.
103. Blundell, P., G. Hall & S. Killcross. 2001. Lesions of the
basolateral amygdala disrupt selective aspects of reinforcer
representation in rats. J. Neurosci. 21: 9018–9026.
104. Holland, P.C. & M. Gallagher. 2003. Double dissociation of
the effects of lesions of basolateral and central amygdala on
conditioned stimulus-potentiated feeding and Pavlovian-
instrumental transfer. Eur. J. Neurosci. 17: 1680–1694.
105. Shiflett, M.W. & B.W. Balleine. 2010. At the limbic–motor
interface: disconnection of basolateral amygdala from nu-
cleus accumbens core and shell reveals dissociable compo-
nents of incentive motivation. Eur. J. Neurosci. 32: 1735–
1743.
106. Corbit, L.H. & P.H. Janak. 2007. Inactivation of the lateral
but not medial dorsal striatum eliminates the excitatory
impact of Pavlovian stimuli on instrumental responding.
J. Neurosci. 27: 13977–13981.
107. Murschall, A. & W. Hauber. 2006. Inactivation of the ventral
tegmental area abolished the general excitatory influence of
Pavlovian cues on instrumental performance. Learn. Mem.
13: 123–126.
108. Hall, J. et al. 2001. Involvement of the central nucleus of
the amygdala and nucleus accumbens core in mediating Pavlovian
influences on instrumental behavior. Eur. J. Neurosci. 13:
1984–1992.
109. Chase, H.W. et al. 2008. The role of the orbitofrontal cortex
in human discrimination learning. Neuropsychologia 46:
1326–1337.
110. Wilson, S.J., M.A. Sayette & J.A. Fiez. 2004. Prefrontal re-
sponses to drug cues: a neurocognitive analysis. Nat. Neu-
rosci. 7: 211–214.
111. Farré, M. & J. Camí. 1991. Pharmacokinetic considerations
in abuse liability evaluation. Addiction 86: 1601–1606.
112. Wickens, J.R. et al. 2007. Dopaminergic mechanisms in
actions and habits. J. Neurosci. 27: 8181–8183.
113. Jedynak, J.P. et al. 2007. Methamphetamine-induced structural
plasticity in the dorsal striatum. Eur. J. Neurosci. 25:
847–853.
114. Cunha-Oliveira, T., A.C. Rego & C.R. Oliveira. 2008. Cellular
and molecular mechanisms involved in the neurotoxicity
of opioid and psychostimulant drugs. Brain Res. Rev.
58: 192–208.
115. Balleine, B.W. & J.P. O’Doherty. 2010. Human and rodent
homologies in action control: corticostriatal determinants
of goal-directed and habitual action. Neuropsychopharma-
cology 35: 48–69.
116. Jones, B.T. & R.M. Young. 2011. Changing alcohol ex-
pectancies and self-efficacy expectations. In Handbook of
Motivational Counseling: Goal-Based Approaches to Assess-
ment and Intervention with Addiction and Other Problems.
W.M. Cox & E. Klinger, Eds.: 489–504. John Wiley & Sons,
Ltd.
117. Bouton, M.E. 2002. Context, ambiguity, and unlearning:
sources of relapse after behavioral extinction. Biol. Psychiatry
52: 976–986.
118. Bickel, W.K. et al. 2011. Remember the future: working
memory training decreases delay discounting among stim-
ulant addicts. Biol. Psychiatry 69: 260–265.
119. Quick, S.L. et al. 2011. Loss of alternative non-drug rein-
forcement induces relapse of cocaine-seeking in rats: role
of dopamine D1 receptors. Neuropsychopharmacology 36:
1015–1020.