Ann. N.Y. Acad. Sci. ISSN 0077-8923
ANNALS OF THE NEW YORK ACADEMY OF SCIENCES
Issue: Addiction Reviews
Associative learning mechanisms underpinning the
transition from recreational drug use to addiction
Lee Hogarth,1 Bernard W. Balleine,2 Laura H. Corbit,3 and Simon Killcross1
1 School of Psychology, University of New South Wales, Sydney, Australia. 2 Brain and Mind Research Institute, University of
Sydney, Sydney, Australia. 3 School of Psychology, University of Sydney, Sydney, Australia
Address for correspondence: Lee Hogarth, School of Psychology, University of New South Wales, Sydney, NSW 2052,
Australia. l.hogarth@unsw.edu.au
Learning theory proposes that drug seeking is a synthesis of multiple controllers. Whereas goal-directed drug seeking
is determined by the anticipated incentive value of the drug, habitual drug seeking is elicited by stimuli that have
formed a direct association with the response. Moreover, drug-paired stimuli can transfer control over separately
trained drug seeking responses by retrieving an expectation of the drug’s identity (specific transfer) or incentive
value (general transfer). This review covers outcome devaluation and transfer of stimulus-control procedures in
humans and animals, which isolate the differential governance of drug seeking by these four controllers following
various degrees of contingent and noncontingent drug exposure. The neural mechanisms underpinning these four
controllers are also reviewed. These studies suggest that although initial drug seeking is goal-directed, chronic drug
exposure confers a progressive loss of control over action selection by specific outcome representations (impaired
outcome devaluation and specific transfer), and a concomitant increase in control over action selection by antecedent
stimuli (enhanced habit and general transfer). The prefrontal cortex and mediodorsal thalamus may play a role in
this drug-induced transition to behavioral autonomy.
Keywords: addiction; learning theory; goal; cue-reactivity; habit
Introduction
A recurrent theme in addiction theory is that drug
seeking has multiple determinants. Wikler1 argued
that the euphoric effects of the drug maintained
initial drug use, whereas addiction itself stemmed
from the emergence of a withdrawal syndrome.
Tolerance2 and opponent-process theories3 elaborated
this notion of a shift from positive to negative
reinforcement. Subsequently, the importance
of negative reinforcement was questioned by the
observation that drug self-administration engages
dopamine, the brain substrate of reward,4 and by the
lawful relationship between the frequency of drug
seeking and the magnitude of drug reward.5 But
denying the importance of negative reinforcement
(but see Ref. 6) put positive reinforcement theorists
at pains to explain the transition from recreational
drug use to addiction. Tiffany7 answered this question
from a cognitive viewpoint, arguing that drug
seeking may be mediated by desire, or elicited au-
tomatically by drug cues, and the latter controller
predominates in addiction. Robinson and Berridge8
made a similar argument from a behavioral neuro-
science perspective, stating that drug seeking may be
driven by hedonic anticipation of the drug (liking),
or autonomous cue-locked conditioned behavior
(wanting), thus accounting for addicts’ paradoxi-
cal continuation of drug use despite intentions to
quit.
Contemporary addiction theories have elabo-
rated these themes. The behavioral economists have
garnered evidence that human drug dependence
is a choice recruited by the reinforcement value
of the drug,9 but is also accompanied by an inability
to use knowledge of abstract future consequences
in decision making.10 Similarly, animal
learning theorists have substantiated evidence that
drug self-administration is a function of the rein-
forcement value of the drug11,12 but also undergoes a
doi: 10.1111/j.1749-6632.2012.06768.x
Ann. N.Y. Acad. Sci. 1282 (2013) 12–24 © 2012 New York Academy of Sciences.
Hogarth et al. Abnormal learning underpinning dependence
transition to automatic control by drug-paired stim-
uli.13 Finally, cognitive neuroscientists have shown
that drug liking is associated with drug-induced
dopamine activation14 and that clinically diagnosed
addiction is accompanied by hypofrontality and ex-
ecutive dysfunction.15 The common theme in all
of these frameworks, therefore, is that initial drug
use is mediated by the drug acting as a positive rein-
forcer, whereas the transition to clinical dependence
is linked to a loss of intentional regulation and con-
comitant emergence of automatic control over drug
seeking.
Learning theory and addiction
The current review aims to detail this transitional
theory of addiction by inspecting human and ani-
mal learning research that has tested the differential
governance of behavior at various stages of drug
exposure. The ideas developed here were first in-
troduced by Norman White who drew a link be-
tween the role of the striatum in memory and ad-
dictive behavior.16,17 The formal associative learning
account was then outlined by Anthony Dickin-
son during symposium proceedings from empiri-
cal work with natural rewards.18 These ideas were
then translated to behavioral neuroscience research
with addictive drugs in collaboration with Trevor
Robbins and Barry Everitt.19–21 Simultaneously, be-
havioral neuroscience research continued with nat-
ural rewards that clarified the associative mecha-
nisms outlined here,22–24 and which are depicted
schematically in Figure 1. According to this per-
spective, experience of the drug outcome is encoded
separately in terms of its specific sensory correlates
or perceptual identity (Oi) and its consummatory,
postingestive or incentive value (Ov), and these two
representations of the drug can differentially enter
into associations.25,26 As a consequence, the agent
(person or animal) acquires four forms of associa-
tive knowledge.
Figure 1. Experience of the drug outcome is separately encoded
in terms of its perceptual identity (Oi) and incentive
value (Ov) and establishes learning about (1) the goal-directed
instrumental contingency between drug seeking response and
the drug (R–Oiv); (2) the habitual contingency between drug
stimuli and the drug seeking response (S–R); and (3) the Pavlovian
contingency between drug stimuli and the drug (S–Oiv).
It is argued that chronic drug exposure generates a progressive
impairment in capacity to retrieve or utilize the specific identity
of outcomes (Oi), which causes a transition in behavioral
control from the R–Oiv and S–Oi associations to the S–R and
S–Ov associations. That is, addiction reflects a loss of control
over behavior by knowledge of the consequences indexed by
outcome devaluation and specific transfer, in favor of control
by antecedent stimuli indexed by devaluation insensitivity and
general transfer.
(1) Goal-directed learning. The agent acquires
knowledge of the instrumental contingency
between the drug seeking response and the
drug’s identity and value (R–Oiv). Moreover,
the representation of the drug’s value is updated
by internal states, such as deprivation
or satiety, which predict the current value of
the drug. Consequently, retrieval of the representation
of the drug and its current value
(Oiv) determines the propensity to select the
associated drug seeking responses from among
competing outcome choices based on a comparison
of their relative values.27 Thus, a higher
value drug produces a greater proportion of intentional
choice of that outcome from among
alternative rewards.28
(2) Habit learning. The agent forms an associ-
ation between external stimuli (S) and the
drug seeking response (R) in proportion to
the contingent co-occurrence of these two
events before drug reinforcement and the re-
inforcement value of that outcome (Ov).29
This S–R/reinforcement process enables the
drug stimulus, when reencountered, to elicit
the drug seeking response directly without re-
trieving any representation of the drug out-
come. Such habitual drug seeking accords with
the clinical characterization of addiction as
reflecting a loss of intentional regulation of
behavior.
(3) Specific transfer. External stimuli also acquire
an association with the drug outcome in accor-
dance with the predictive contingency between
these events, enabling stimuli to retrieve a rep-
resentation of the drug’s identity and/or value.
Retrieval of the outcome’s identity (S–Oi) can,
in turn, elicit separately trained instrumental
responses that are associated with that same
outcome via a bidirectional O–R, or ideomo-
tor, connection (S–Oi–R).30
(4) General transfer. By contrast, retrieval of the
outcome’s affective value (S–Ov) elicits a mo-
tivational state akin to the drug itself, which
exerts a general excitatory effect on prevailing
responses controlled by the other associations
([S–Ov]–R).31
The claim made in this paper is that these var-
ious forms of behavioral control interact to deter-
mine the propensity to engage in drug seeking at any
given moment. Our claim is that continuing drug
exposure impairs retrieval or utilization of the representation
of specific outcome identities (Oi), thus
shifting control of action away from knowledge of specific
outcomes (R–Oiv and S–Oi–R) and toward more general
control by antecedent stimuli (S–R and
[S–Ov]–R). We now turn to empirical evidence for
this psychological account of addiction.
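As a purely illustrative aid (not a model proposed in the work reviewed here), the four controllers and their dependence on the outcome's identity (Oi) can be written down as a small lookup. The names `Controller`, `CONTROLLERS`, and `available_controllers` are hypothetical; the only content taken from the text is which associations rely on Oi.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Controller:
    name: str             # label used in the text
    association: str      # associative structure, in the text's notation
    needs_identity: bool  # True if control depends on retrieving Oi


# The four controllers described above. Which ones need Oi follows the text;
# everything else about this representation is an illustrative assumption.
CONTROLLERS = (
    Controller("goal-directed", "R–Oiv", True),
    Controller("specific transfer", "S–Oi–R", True),
    Controller("habit", "S–R", False),
    Controller("general transfer", "[S–Ov]–R", False),
)


def available_controllers(identity_retrievable: bool) -> list[str]:
    """Return the names of controllers that can still operate.

    If the outcome's identity (Oi) cannot be retrieved, only the
    controllers that do not depend on it remain.
    """
    return [
        c.name
        for c in CONTROLLERS
        if identity_retrievable or not c.needs_identity
    ]


print(available_controllers(True))   # all four controllers
print(available_controllers(False))  # only habit and general transfer
```

The all-or-none flag is a simplification: the account describes a progressive impairment, which a graded weight would capture more faithfully.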
Goal-directed drug seeking
The outcome devaluation procedure provides the
principal method for identifying goal-directed con-
trol.32 A version of this procedure is presented in
Table 1. In this procedure, rats learn that two dif-
ferent lever press responses (R) produce different
rewarding outcomes (O). For example, one lever
may produce drug reward such as alcohol or cocaine
(O1), whereas the other lever produces an alterna-
tive natural reward such as sucrose (O2). The drug
is then devalued by pairing it with lithium chloride-
induced gastrointestinal sickness, specific satiety, or
related manipulation, such that the value of the drug
is diminished. The critical test then comes when
the animal is again given the opportunity to press
the two levers in an extinction test where the re-
sponses no longer produce their respective rewards.
The question at stake is whether the animal will reduce
responding for the drug outcome (R1 < R2).
Because the outcomes are not presented in the ex-
tinction test, any such devaluation effect cannot be
attributed to S–R/reinforcement (habit) learning,
that is, by experience of the drug outcome modulating
the capacity of contextual cues to elicit the drug
seeking response. Furthermore, because the procedure
contains no stimuli that differentially signal
the two outcomes, a devaluation effect cannot be
attributed to a change in capacity of such cues to
elicit responding for their associated outcomes (S–Oiv–R).
Instead, a reduction in drug choice in the
extinction test must be mediated by animals’ integration
of knowledge of the R–Oiv contingencies
acquired during instrumental training, with knowledge
of current low value of the drug outcome (Ov)
acquired during the devaluation treatment, which
together determine the propensity to select that response.
In other words, a devaluation effect in the
extinction test demonstrates that drug seeking is
goal-directed in that it is determined by the anticipated
reward value of the drug.
Table 1. The outcome devaluation procedure used to
demonstrate goal-directed versus habitual control of action
selection. The agent learns that two responses (R1
and R2) earn different rewarding outcomes (O1 and O2).
One outcome is then devalued (O1 or O2) before an extinction
test in which the agent can again perform the
two responses (R1/R2) without feedback from the outcomes.
A reduction in choice of the response that earned
the devalued outcome (R1 ≠ R2) must be goal-directed in
that it is controlled by knowledge of the R–O contingency
and the current incentive value of the O. By contrast, a
null effect of the devaluation treatment on responding in
the extinction test (R1 = R2) suggests responding is habitual
in that it is elicited directly by the stimulus context
without engaging knowledge of the consequences (S–R).
Instrumental training    Devaluation treatment    Extinction test
R1–O1                    O1 or O2                 R1/R2
R2–O2
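The logic of the devaluation test can be illustrated with a toy choice model. This is a minimal sketch under stated assumptions, not the authors' model: choice is assumed to follow a softmax rule, goal-directed and habitual control are assumed to mix linearly through a hypothetical weight `w_goal`, and all numerical values are invented for illustration.

```python
import math


def choice_probability(strength_r1: float, strength_r2: float, beta: float = 3.0) -> float:
    """Softmax probability of choosing R1 over R2 given two scalar strengths."""
    return 1.0 / (1.0 + math.exp(-beta * (strength_r1 - strength_r2)))


def p_choose_r1(outcome_values, habit_strengths, w_goal):
    """Probability of choosing R1 in the extinction test.

    outcome_values  : (Ov of O1, Ov of O2) after the devaluation treatment
    habit_strengths : (S-R strength for R1, S-R strength for R2) from training
    w_goal          : assumed weight on goal-directed control, between 0 and 1
    """
    p_goal = choice_probability(*outcome_values)    # uses current outcome values
    p_habit = choice_probability(*habit_strengths)  # ignores current outcome values
    return w_goal * p_goal + (1.0 - w_goal) * p_habit


# O1 devalued (value 0.0), O2 intact (value 1.0); equal S-R strengths.
devalued = (0.0, 1.0)
habits = (1.0, 1.0)

goal_directed = p_choose_r1(devalued, habits, w_goal=1.0)
habitual = p_choose_r1(devalued, habits, w_goal=0.0)
print(f"goal-directed agent: P(R1) = {goal_directed:.2f}")
print(f"habitual agent:      P(R1) = {habitual:.2f}")
```

Under these assumptions the fully goal-directed agent chooses R1 less often than R2 (a devaluation effect), whereas the fully habitual agent is indifferent because its S–R strengths were fixed before the outcome was devalued.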
Two studies illustrate the outcome devaluation
procedure in demonstrating goal-directed control
of drug seeking. In a study by Olmstead et al.,33 rats
were trained on a seeking–taking chain in which
they had to press a seeking lever to gain access to
a taking lever, which in turn delivered intravenous
cocaine. To test whether the seeking response was
goal-directed, the taking lever was extinguished by
terminating cocaine delivery. The seeking lever was
not present during this extinction training. The fact
that this extinction training led to an immediate re-
duction in rats’ performance of the seeking response
in extinction indicated that this response was medi-
ated by knowledge of its consequences, that is, the
low current value of the taking lever.
Hutcheson et al.34 employed a similar design.
Training on a seeking–taking chain for heroin was
followed by a revaluation treatment in which self-
administration via the taking response was expe-
rienced in a withdrawal state to establish the high
value of heroin in this state. Rats were then again
given access to the seeking lever in extinction, and
the finding that withdrawal produced an increase
in performance of the seeking response indicated
that it was goal-directed in that it was mediated by
knowledge of the current high value of the heroin
outcome.
The outcome devaluation procedure has also
been modified for humans.35,36 In the concurrent
training stage of these experiments, smokers learned
two key press responses, where R1 produced tobacco
points and R2 produced chocolate points. Tobacco
was then devalued by smoking to satiety, evaluation
of smoking health warnings, for example, “smok-
ing causes cancer,”36 or by administration of nico-
tine nasal spray.35 The finding that tobacco choice
in the extinction test was sensitive to these devaluation
treatments (R1 < R2) indicated that it was
goal-directed in being mediated by knowledge of the
current value of the drug outcome.
A key observation replicated in these human ex-
periments was that individual variation in level of
tobacco dependence was associated with a prefer-
ential selection of the tobacco versus the chocolate
response. Similar preferences have been established
in animals11 and human cocaine users37 and confirm
the economic theorists’ main contention that
drug dependence reflects individual differences in
the reinforcement value of the drug.9 The outcome-devaluation
procedure qualifies this notion by distinguishing
the contribution of goal-directed (R–Oiv)
and habitual (S–R) drug seeking to this drug
preference. We know that choice of the drug seek-
ing response was goal-directed as it was sensitive
to devaluation in the extinction test. Any resid-
ual contribution of S–R learning to this drug pref-
erence would be marked by variation in sensitiv-
ity to devaluation treatment in the extinction test.
As there was no systematic variation across levels
of nicotine dependence in sensitivity to devalua-
tion, it may be concluded that preferential tobacco
choice across dependence level was mediated en-
tirely by valuation of the drug as a goal, and not by
differential S–R formation. The conclusion, there-
fore, is that drug seeking within these parameters
is goal-directed, and that level of dependence, at
least at this early stage of drug exposure, reflects the
valuation of the drug as a specific goal (see Refs.
38–42).
Habitual drug seeking
As noted, the outcome devaluation procedure can
evaluate the habitual status of instrumental perfor-
mance32 (see Table 1). Whereas sensitivity of drug
seeking to devaluation in the extinction test (R1 < R2)
signifies goal-directed control, insensitivity to
devaluation in the extinction test (R1 = R2) demonstrates
that retrieval of the current value of the drug
plays no role in drug seeking. Instead, drug seeking
is deemed to have become habitual, being elicited by
contextual stimuli that have acquired a direct S–R
association with drug seeking during instrumental
training, without retrieving a representation of cur-
rent value of the drug.
Two studies illustrate the use of the outcome-
devaluation procedure to demonstrate the habitual
status of drug seeking. In the first study, Dickinson
et al.13 trained rats to acquire two instrumental re-
sponses, one for alcohol and one for food pellets,
before one of these outcomes was devalued by pair-
ing it with lithium chloride-induced sickness. When
the rats were again given the opportunity to respond
for these outcomes in extinction, it was found that
performance of the food-seeking response was re-
duced by the devaluation treatment, indicating that
food seeking was goal-directed. By contrast, perfor-
mance of the alcohol-seeking response was insensi-
tive to devaluation, suggesting that alcohol seeking
had become an S–R habit. A second study used a
similar design to confirm that cocaine seeking was
similarly prone to habitual control compared to nat-
ural reward seeking.43
A question arises as to why habitual drug seek-
ing was established by these two procedures,13,43
whereas goal-directed drug seeking was found in the
earlier designs.33–36 In explaining these divergent re-
sults, one might appeal to a number of variables that
have been demonstrated to modulate the balance
between goal-directed and habitual control, including
position of the response within an instrumental se-
quence or chain,44–46 amount of training,47,48 num-
ber of available responses,49 and/or reinforcement
value of the outcome.50 The important point made
by this literature is that goal-directed and habitual
actions exist in a dynamic balance that can be bi-
ased in one direction or the other by conditions of
training or testing that favor acquisition/expression
of the R–O versus S–R association. Our basic argu-
ment is that within this complex system, drugs exert
a constant pressure in favor of the S–R association
by impairing retrieval or utilization of the specific
identity of outcomes.
Corbit et al.51 have recently mapped the progressive
transition to habitual control of drug seeking
with extended training. In this study, rats acquired
a self-administration response for alcohol, before
alcohol was devalued by ad libitum consumption
(satiety). Alcohol seeking was then tested in extinc-
tion to evaluate goal-directed control of this be-
havior. Following two weeks of self-administration
training, the response remained sensitive to deval-
uation, but by eight weeks of training, the response
was insensitive to devaluation, suggesting a transi-
tion from goal-directed to habitual control had oc-
curred with training (cocaine seeking shows a sim-
ilar transition to habit with extended training52).
An important additional finding of this study was
that noncontingent administration of alcohol was
sufficient to accelerate habitual control over natural
reward-seeking responses. Thus, not only do drug
self-administration responses become habitual, but
also noncontingent drug exposure renders contem-
poraneously acquired naturally rewarded instru-
mental actions habitual.
In humans, a comparable effect of noncontingent
alcohol exposure on habitual control has recently
been demonstrated.53 Participants were administered
0.4 g/kg of alcohol or placebo before
instrumental training with R1 and R2 for chocolate
and water, respectively. Chocolate was then devalued
by ad libitum consumption before choice between
R1 and R2 was tested in extinction. The finding
that alcohol attenuated goal-directed control over
chocolate choice in the extinction test supports the
translational relevance of animal models, and sug-
gests that accelerated habit learning can be demon-
strated with acute drug dosing.
A key study by Nelson and Killcross54 re-
vealed that noncontingent drug exposure enhanced
habit formation during instrumental training rather
than at the extinction test. They preexposed rats
to amphetamine for seven days before a seven-
day injection-free period. Instrumental training
for sucrose was then undertaken before this out-
come was devalued by specific satiety or lithium
chloride–induced sickness. The results from the
extinction test indicated that both devaluation
treatments failed to modify sucrose seeking in
the amphetamine-exposed rats, suggesting this re-
sponse had become habitual, whereas placebo rats
showed goal-directed control (see also Refs. 50 and
55). Importantly, chronic amphetamine only accel-
erated habit formation if administered before in-
strumental training, but not if administered after
training. Consistent with this, all the aforemen-
tioned studies that have shown effects of contin-
gent13,43,51 and noncontingent51 drug exposure on
habit learning have undertaken drug administration
contemporaneously with instrumental training.
The implication, therefore, is that during instru-
mental training, the ability of the outcome represen-
tation to enter into new learning may be impaired by
drug exposure, favoring acquisition of the S–R over
the R–Oiv contingencies, but once R–Oiv learning is
acquired drug free, deployment of this knowledge
at test is not impaired by drug exposure.
In reconciling the aforementioned studies, one
can propose a transitional model of addiction
wherein initial drug seeking is goal-directed,33–36
but following extended training comes under ha-
bitual S–R control,13,43 and contemporaneously ac-
quired natural reward seeking also comes under
habitual control.51,53,54 Ultimately, the agent’s be-
havioral repertoire comes to be dominated by S–R
habits.
Specific transfer of stimulus control over drug
seeking
The Pavlovian to instrumental transfer procedure
is the principal method for demonstrating control
over responding by stimuli retrieving a representa-
tion of the specific identity of their paired outcome
(Oi) (e.g., Ref. 56; see Table 2). In this design, rats
are given Pavlovian training in which one stimulus
(S1) signals drug availability (O1), whereas a sec-
ond stimulus (S2) signals the availability of an alter-
native reward, for example sucrose (O2). Separate
instrumental training is then undertaken wherein
rats learn that one lever produces the drug (O1)
whereas the other lever produces sucrose (O2). Finally,
in the Pavlovian to instrumental transfer test,
each stimulus is presented for the first time while
the two instrumental responses are available in extinction.
The question at stake is whether each stimulus
will enhance performance of the response with
which it shares the same outcome (i.e., S1:R1 > R2,
S2:R1 < R2). Such an outcome-specific transfer
effect demonstrates that each stimulus retrieved
a representation of its associated outcome, which
in turn retrieved the response that was associated
with that outcome (S–Oi–R). The effect cannot be
attributed to the formation of an S–R association
because the stimuli and the responses were never
contingently reinforced during training, and furthermore,
because the transfer test was conducted
in extinction, so no S–R association can form across
that period either.
Table 2. The Pavlovian to instrumental transfer procedure
used to demonstrate transfer of stimulus control
over action selection. The agent learns that two stimuli
(S1 and S2) predict different rewarding outcomes (O1
and O2). They separately learn that two responses (R1
and R2) differently earn those same rewarding outcomes
(O1 and O2). In the transfer test, the stimuli (S1 and S2)
are presented whilst the responses (R1 and R2) are available.
A specific transfer effect is demonstrated when each
stimulus selectively enhances responding for the same
outcome (S1:R1 > R2, S2:R1 < R2). This specific transfer
effect suggests that each stimulus elicited a representation
of the identity of its associated outcome (Oi), which
in turn elicited its associated response (S–Oi–R). By contrast,
a general transfer effect is demonstrated when each
stimulus enhances both responses equally above a pre-
or no-stimulus baseline (S1/S2: > R1/R2). This general
transfer effect suggests that each stimulus elicited a representation
of the value (Ov) but not identity (Oi) of its
associated outcome, which produced a general enhancement
of responding ([S–Ov]–R).
Pavlovian training    Instrumental training    Extinction test
S1–O1                 R1–O1                    S1:R1/R2
S2–O2                 R2–O2                    S2:R1/R2
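The decision rule separating specific from general transfer can be sketched in code. This is an illustrative sketch only: the function `classify_transfer`, the tolerance `tol`, and the response rates are hypothetical, and a real analysis would rest on statistical comparison rather than a fixed threshold.

```python
def classify_transfer(rates, baseline, tol=0.1):
    """Classify each stimulus's effect in the transfer test.

    rates    : response rates under each stimulus, e.g.
               {"S1": {"R1": 9.0, "R2": 9.0}, "S2": {"R1": 5.0, "R2": 10.0}}
    baseline : no-stimulus response rates, {"R1": ..., "R2": ...}
    tol      : hypothetical tolerance for treating two elevations as equal
    """
    same = {"S1": "R1", "S2": "R2"}    # response sharing the stimulus's outcome
    other = {"S1": "R2", "S2": "R1"}   # response earning the other outcome
    result = {}
    for s in ("S1", "S2"):
        up_same = rates[s][same[s]] - baseline[same[s]]
        up_other = rates[s][other[s]] - baseline[other[s]]
        if up_same > tol and up_same - up_other > tol:
            # Selective enhancement of the same-outcome response (S-Oi-R).
            result[s] = "specific"
        elif up_same > tol and up_other > tol and abs(up_same - up_other) <= tol:
            # Both responses raised equally above baseline ([S-Ov]-R).
            result[s] = "general"
        else:
            result[s] = "none/unclear"
    return result


# Invented rates: S1 raises both responses equally, S2 raises only R2.
baseline = {"R1": 5.0, "R2": 5.0}
rates = {"S1": {"R1": 9.0, "R2": 9.0}, "S2": {"R1": 5.0, "R2": 10.0}}
result = classify_transfer(rates, baseline)
print(result)
```

With these invented rates, S1 is classified as producing general transfer and S2 as producing specific transfer.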
There is currently only one demonstration of
outcome-specific transfer of stimulus control over
drug seeking per se57 (although there are many
demonstrations in natural reward learning58). In
this study, smokers first learned that two arbitrary
stimuli (S1 and S2) predicted tobacco points or
money, respectively, before learning that two re-
sponses (R1 and R2) earned tobacco points and
money, respectively. In the transfer test, the two
stimuli were found to selectively enhance perfor-
mance of the response that had earned the same
outcome. Thus, each stimulus must have retrieved
a representation of its associated outcome (points),
which in turn elicited the response that had pro-
duced that outcome (S–Oi–R).
General transfer of stimulus control over drug
seeking
By contrast, in a related animal procedure, Cor-
bit and Janak59 paired S1 and S2 with ethanol or
sucrose, respectively, and then trained R1 and R2
with these same outcomes, respectively. The results
showed that the ethanol stimulus enhanced the rate
of both R1 and R2 equally above a no-stimulus base-
line, indicating that this stimulus exerted a general
excitatory effect on instrumental reward seeking by
retrieving the value (S–Ov) rather than identity (S–
Oi) of the outcome. By contrast, the sucrose stimu-
lus produced a specific transfer effect, selectively en-
hancing instrumental responding for sucrose over
ethanol, indicating that it had retrieved the out-
come’s identity (S–Oi). These data are consistent
with the view that drug-associated cues favor gen-
eral facilitatory effects on appetitive instrumental
responses compared to natural reward-paired cues
(see also Refs. 60 and 61).
The divergent results of these human and an-
imal transfer studies may be resolved by ap-
pealing to Konorski’s view25 that outcomes are
encoded separately in terms of their perceptual iden-
tity (sensory correlates) and consummatory or in-
centive value.26 On this view, the tobacco points
outcome used by Hogarth et al.57 was largely per-
ceptual and minimally consummatory, and so the
stimulus paired with this outcome favored a specific
transfer effect that relied on the retrieval of this outcome’s
perceptual identity (S–Oi–R). By contrast,
the ethanol consummatory outcome employed by
Corbit and Janak59 possessed a substantial pharma-
cological/consummatory effect, and so the stimulus
paired with this event favored a general motivational
enhancement based upon retrieval of the outcome
value ([S–Ov]–R).
Other studies substantiate this characterization of
the specific and general forms of stimulus control.62
First, the magnitude of the specific transfer effect is
determined by the reliability of the S–O contingency
in training,63–65 but is insensitive to outcome deval-
uation.31,66–68 Importantly, specific transfer effects
by drug cues on drug seeking in humans are similarly
insensitive to devaluation achieved by drug sati-
ety, health warnings,36,69 and pharmacotherapy.35
Moreover, the finding that drug cue effects on sub-
jective craving70,71 and drug taking69 are similarly
autonomous of devaluation by satiety and pharmacotherapy
supports the validity of specific (S–Oi–R)
transfer effects in addiction. By contrast, general
transfer effects are modulated by devaluation of the
outcome,31,72 and cross over to other reinforcers of
the same hedonic category.73 Thus, general transfer
effects are deemed to be mediated by the stimu-
lus retrieving a representation of the current value
(Ov) but not identity (Oi) of the outcome, and as
a consequence, the effect is sensitive to changes in
motivational state but is not response selective ([S–
Ov]–R).
Not only does contingent drug exposure cause
drug cues to favor general over specific transfer,59
but noncontingent drug exposure may also cause
natural reward cues to undergo this same transi-
tion. In a recent study, Shiflett et al.74 found that
noncontingent exposure to chronic amphetamine
administered following Pavlovian and instrumen-
tal training (i.e., before the transfer test) abolished
the specific transfer effect and enhanced the general
transfer effect. Specifically, rats received Pavlovian
training in which S1 predicted chocolate and S2
predicted grain. Instrumental training was then un-
dertaken in which two responses, R1 and R2, earned
these same outcomes, respectively. Then, half of the rats
were given seven days of amphetamine administra-
tions and the remainder placebo (similar to Ref.
54). Finally, in the transfer test, the two stimuli
were tested for a specific transfer effect in which
each stimulus selectively enhanced responding for
the same outcome, or a general transfer effect in
which each stimulus enhanced responding for both
outcomes equally above a prestimulus baseline. The
remarkable finding was that amphetamine expo-
sure before test abolished the specific transfer effect
and enhanced the general transfer effect. A simi-
lar enhancement of the general transfer effect pro-
duced by natural reward cues on reward seeking has
been found following acute75 and chronic76 am-
phetamine administered before the testing phase,
although in these latter studies no attempt was made
to assess specific transfer.
Overall, these studies favor a transitional model
wherein early in training, drug cues retrieve the
drug’s identity and thus produce specific trans-
fer effects.35,36,57 Extended drug exposure, how-
ever, causes stimuli to lose contact with the drug’s
identity and instead make contact with the drug’s
value, thus causing a transition from specific to
general transfer.59–61 Moreover, contemporaneously
acquired natural reward cues also shift contact
from their outcome’s identity to its value, caus-
ing a comparable transition from specific to general
transfer.61,74–76
Synthesis of psychological studies
The transition to behavioral autonomy depicted
across the studies reported here is consistent with
a singular impairment in the capacity to retrieve or
utilize the specific identity of outcomes as a conse-
quence of drug exposure. This impairment can ex-
plain why drug seeking is initially goal-directed (R–
Oiv) and under specific stimulus control (S–Oi–R),
but then becomes habitual (S–R) and under general
stimulus control ([S–Ov]–R). Whereas the former
two controllers require a representation of the spe-
cific identity of the drug, the latter two controllers
do not. Moreover, the finding of the same transition
in natural reward-seeking responses acquired con-
temporaneously with drug exposure suggests that
the impairment in capacity to represent the specific
outcomes applies to the entire class of appetitive
rewards (it remains to be seen whether aversive out-
comes are similarly affected). Finally, this account
suggests that stress,77 trait impulsivity,78,79 con-
flict,80,81 hypofrontality,82–84 and schizophrenia85,86
may be linked with drug dependence and relapse be-
cause they exacerbate this impairment in capacity to
represent/utilize specific outcome identities.
The claim that a single impairment underpins
both the loss of goal-directed control and the loss of
specific transfer is challenged by a dissociation be-
tween these two effects. Specifically, goal-directed
control was abolished by chronic amphetamine ad-
ministered before training but not administered
before test, suggesting that chronic amphetamine
impairs acquisition of response-outcome knowl-
edge during instrumental training but does not
directly impair the retrieval/utilization of outcome
identity required for goal-directed control at test.54
By contrast, chronic amphetamine abolished spe-
cific transfer when administered before test,74 sug-
gesting that chronic amphetamine can directly abol-
ish the retrieval/utilization of outcome identity re-
quired for the specific transfer effect. Identifying a
common learning mechanism that operates during
both instrumental training and the transfer test to
produce the observed transition to behavioral au-
tonomy is arguably crucial for isolating the core
associative pathology in addiction.
Neural basis of action control
The following section reviews animal studies that
have examined the neural basis of the four con-
trollers underpinning natural reward and drug seek-
ing. The purpose of this section is to identify sub-
strates upon which chronic drug exposure might
act to produce the transition to autonomy depicted
earlier, that is, reduce goal-directed learning and
specific transfer, and/or enhance habit learning and
general transfer.
Neural basis of goal-directed action
Lesions of the prelimbic (PL) region of the pre-
frontal cortex have been shown to produce precisely
the same deficit in goal-directed control as chronic
amphetamine.54 That is, lesions of the PL abolish
goal-directed control of natural reward seeking if
they occur before instrumental training48,87,88 but
not if they occur before test.89 Comparable effects
have been found following lesions of the mediodor-
sal thalamus, which also abolish acquisition90 but
not expression91 of goal-directed action. As the
mediodorsal thalamus provides the major thalamic
input to the PL, it is believed that these two regions
form a functional circuit. The correspondence between
the effects of PL lesions, mediodorsal thalamic lesions,
and chronic amphetamine exposure on the acquisition
of goal-directed control supports a role for these two
brain regions in mediating the effect of drug exposure
on the transition to behavioral autonomy.
By contrast, the dorsomedial striatum (DMS)
has been shown to be essential for both the ac-
quisition92,93 and the expression94 of goal-directed
learning. Importantly, posttraining DMS inactiva-
tion has been shown to abolish goal-directed con-
trol of alcohol seeking, suggesting common neural
mechanisms underpinning both natural and drug
reward goal-directed learning.51 In addition, lesions
of the basolateral amygdala (BLA) abolish sensitivity
to outcome devaluation whether given before95,96
or after instrumental training.91,97 Thus the DMS
and BLA, in failing to mimic the selective effect
of amphetamine on loss of goal-directed learning
at acquisition, may not play a direct role in drug
sensitization-induced transition to behavioral au-
tonomy.
Neural basis of habitual action
Habitual action, by contrast, is mediated by the dor-
solateral striatum (DLS) and infralimbic cortex. As
noted earlier, overtraining instrumental contingen-
cies favors a transition from R–Oiv to S–R con-
trol, that is, progressive loss of sensitivity to de-
valuation in the extinction test.47 However, rats
with lesions to the DLS either pre- or posttrain-
ing fail to develop habitual control and remain
sensitive to devaluation irrespective of training,
indicating that the DLS is required for the acquisition
and expression of habit learning.51,98,99 Importantly,
posttraining inactivation of the DLS has also
been shown to abolish expression of habitual co-
caine seeking52 and alcohol seeking51 following ex-
tended training, rendering these behaviors once
again goal-directed, and confirming the common
neural mechanisms underpinning both natural and
drug reward habit learning. In addition, pretraining
functional disconnection of DLS and the amygdala
central nucleus (CN) has also recently been shown to
abolish habitual control of action and restore goal-
directed control.29 Finally, lesions of the infralim-
bic cortex made before instrumental training abol-
ish the transition to habit following overtraining.48
Thus, chronic drug exposure might act on any of
these regions to promote the dominance of S–R
habits.
Neural basis of specific transfer
of stimulus control
The ability of stimuli to transfer selective control
over separately trained instrumental responding
for the same outcome is abolished by pretrain-
ing62 and posttraining lesions of the orbitofrontal
cortex (OFC),100 by pretraining lesions and post-
training inactivation of the nucleus accumbens
(NAC) shell,101,102 by pretraining96,103,104 and post-
training91 lesions of the BLA, and pretrain-
ing functional disconnection between these two
structures.105 In addition, posttraining inactiva-
tion of the DMS106 and posttraining lesions of the
mediodorsal thalamus91 also eliminate the selec-
tive transfer effect. Thus, to impair specific trans-
fer, chronic drug exposure may act on any of these
structures.
Neural basis of general transfer
of stimulus control
The ability of conditioned stimuli to produce a gen-
eral excitatory effect on separately trained instru-
mental responses is abolished by posttraining inac-
tivation of the DLS,106 posttraining inactivation of
the ventral tegmental area (VTA),31,107 pre- or posttraining
inactivation of the NAC core,102,108 and pretraining
lesions of the amygdala CN.96 Thus, chronic
drug exposure may influence any of these regions to
enhance general transfer effects.
Synthesis of behavioral neuroscience
studies
PL and mediodorsal thalamic lesions show a striking
correspondence with chronic drug exposure
in producing behavioral autonomy. Specifically,
these lesions abolish acquisition but not expression
of goal-directed control,48,87–91 which matches ex-
actly the effect of chronic amphetamine54 (see also
Ref. 51 for related effect with alcohol). However, le-
sions to the PL do not modify the specific transfer
effect,88 but posttraining lesions of the mediodorsal
thalamus do,91 matching the impact of chronic am-
phetamine.74 Thus, lesions of the mediodorsal tha-
lamus produce precisely the same effect as chronic
drug exposure. It is also noteworthy that lesions
of the OFC abolish specific transfer,62,100 but not
outcome devaluation,100 indicating that damage at
this region alone could not produce the exact pat-
tern of chronic drug exposure. Thus, although the
effect of chronic drug exposure on transition to be-
havioral autonomy could be produced by a combi-
nation of PL and OFC damage—a view strength-
ened by the observation of hypofunction in these
regions in addicts109,110—damage to the mediodor-
sal thalamus alone could impair both forms of
behavior control, and so has the advantage of
parsimony.
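The parsimony argument can be expressed as a small set comparison. The sketch below encodes only the deficits summarized in this section (abolished acquisition of goal-directed control, abolished specific transfer) and asks which combinations of regions reproduce the chronic amphetamine profile. The labels, the data structure, and the assumption that deficits from combined lesions simply pool are simplifications introduced here for illustration.

```python
from itertools import combinations

# Effects abolished by each manipulation, as summarized in the text above.
DEFICITS = {
    "PL": {"goal_directed_acquisition"},
    "MD_thalamus": {"goal_directed_acquisition", "specific_transfer"},
    "OFC": {"specific_transfer"},
}
CHRONIC_AMPHETAMINE = {"goal_directed_acquisition", "specific_transfer"}


def matching_lesion_sets(deficits, target):
    """List region combinations whose pooled deficits equal the target, smallest first."""
    regions = sorted(deficits)
    matches = []
    for size in range(1, len(regions) + 1):
        for combo in combinations(regions, size):
            # Assumption: combined lesions produce the union of individual deficits.
            pooled = set().union(*(deficits[r] for r in combo))
            if pooled == target:
                matches.append(combo)
    return matches


matches = matching_lesion_sets(DEFICITS, CHRONIC_AMPHETAMINE)
most_parsimonious = matches[0]
```

Under these assumptions the smallest matching set is the mediodorsal thalamus alone, with the combination of PL and OFC among the larger sets that also match.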
Conclusions
To conclude, we propose that initial drug seeking
is goal-directed, tracks the anticipated value of the
drug (R–Oiv), and is responsive to specific transfer
(S–Oi–R) effects by drug cues. Chronic drug exposure,
however, impairs the capacity to retrieve or
utilize the specific identity of outcomes, and
so produces a transition of behavioral control from
goal-directed learning (R–Oiv) and specific transfer
(S–Oi–R) to habit (S–R) and general transfer ([S–
Ov]–R). This transition occurs in relation to both
drug outcomes and natural reward outcomes, resulting
in a narrowing of the addict’s behavioral repertoire
to general cue excitation of dominant S–R drug
habits, with restricted capacity for intentional selec-
tion of alternative actions. This associative frame-
work captures the cardinal diagnostic characteristics
of heightened drug reinforcement, loss of willed reg-
ulation of drug seeking, and restricted engagement
with alternative activities. Future research needs to
clarify precisely how this transition to autonomy is
accelerated by drugs of abuse compared to natural
rewards, whether by differences in reward value,50
kinetics,111 neuroadaptations,112,113 or neurotoxicity,114
and precisely how this alters the balance between
corticostriatal circuits underpinning the four
controllers.48,115
There are several implications concerning treatment
strategy. Consistent with Tiffany’s7 insight, we
have argued that goal-directed action and habit ex-
ist in a dynamic balance which may be additive,32
competitive,46 or hierarchical,45 but switching be-
tween the two modes apparently can occur within
the span of a single response sequence and/or lon-
gitudinally with training. If addiction does reflect
a progressive weakening of the role of outcome
retrieval/utilization in the execution of action se-
quences, allowing drug habits to dominate, then
treatments such as expectancy challenge116 and extinction
training,117 which arguably work by changing
the specific representation of the drug, may not
provide the optimal strategy. Instead, treatments
that enhance the capacity to engage representations
of the future such as working memory training118
combined with provisions of alternative reward contingencies119
may be more efficacious in redirecting
addicts from their established habits. Moreover,
the finding that capacity for goal-directed control can be
reinstated by manipulations of brain function48,51,52
and by uncertainty, which has definable neural substrates,46
suggests that neuropharmacology could
complement such learning approaches to install new
intentional action choices.
Acknowledgments
This work was supported by MRC Grant #G0701456
to L.H., NHMRC Grant #633268 to B.B. and S.K.,
and NHMRC Grant #568872 to S.K. and B.B.
Conflicts of interest
The authors declare no conflicts of interest.
References
1. Wikler, A. 1984. Conditioning factors in opiate addiction
and relapse. J. Subst. Abuse Treat. 1: 279–285.
2. Siegel, S. 1989. Pharmacological conditioning and drug effects.
In Psychoactive Drugs. Tolerance and Sensitisation. A.
Goudie & M. Emmett-Oglesby, Eds.: 115–180. Humana
Press. Clifton, New Jersey.
3. Solomon, R.L. & J.D. Corbit. 1974. An opponent-process
theory of motivation: I. Temporal dynamics of affect. Psy-
chol. Rev. 81: 119–145.
4. Stewart, J., H. de Wit & R. Eikelboom. 1984. Role
of conditioned and unconditioned drug effects in self-
administration of opiates and stimulants. Psychol. Rev. 63:
251–268.
5. Bickel, W.K. et al. 1991. Behavioral economics of drug self-
administration: II. A unit-price analysis of cigarette smok-
ing. J. Exp. Anal. Behav. 55: 145–154.
6. Koob, G.F. & M. Le Moal. 2001. Drug addiction, dysregu-
lation of reward, and allostasis. Neuropsychopharmacology
24: 97–129.
7. Tiffany, S.T. 1990. A cognitive model of drug urges and
drug-use behavior: role of automatic and nonautomatic
processes. Psychol. Rev. 97: 147–168.
8. Robinson, T.E. & K.C. Berridge. 1993. The neural basis
of drug craving: an incentive-sensitization theory of drug
addiction. Brain Res. Rev. 18: 247–291.
9. Heyman, G.M. 2009. Addiction: A Disorder of Choice.
Harvard University Press. Cambridge, MA.
10. MacKillop, J. et al. 2011. Delayed reward discounting and
addictive behavior: a meta-analysis. Psychopharmacology
216: 305–321.
11. Ahmed, S.H. 2010. Validation crisis in animal models of
drug addiction: beyond non-disordered drug use toward
drug addiction. Neurosci. Biobehav. Rev. 35: 172–184.
12. Ahmed, S.H. & G.F. Koob. 1998. Transition from moderate
to excessive drug intake: change in hedonic set point. Science
282: 298–300.
13. Dickinson, A., N. Wood & J.W. Smith. 2002. Alcohol seeking
by rats: action or habit? Q. J. Exp. Psychol. B. 55: 331–348.
14. Drevets, W.C. et al. 2001. Amphetamine-induced dopamine
release in human ventral striatum correlates with euphoria.
Biol. Psychiatry 49: 81–96.
15. Volkow, N.D., J.S. Fowler & G.J. Wang. 2004. The addicted
human brain viewed in the light of imaging studies: brain
circuits and treatment strategies. Neuropharmacology 47:
3–13.
16. White, N.M. 1989. A functional hypothesis concerning the
striatal matrix and patches: mediation of S-R memory and
reward. Life Sci. 45: 1943–1957.
17. White, N.M. 1996. Addictive drugs as reinforcers: multiple
partial actions on memory systems. Addiction 91: 921–950.
18. Altman, J. et al. 1996. The biological, social and clinical
bases of drug addiction: commentary and debate. Psy-
chopharmacology 125: 285–345.
19. Robbins, T.W. & B.J. Everitt. 1999. Drug addiction: bad
habits add up. Nature 398: 567–570.
20. Everitt, B.J., A. Dickinson & T.W. Robbins. 2001. The neu-
ropsychological basis of addictive behavior. Brain Res. Rev.
36: 129–138.
21. Everitt, B.J. & T.W. Robbins. 2005. Neural systems of re-
inforcement for drug addiction: from actions to habits to
compulsion. Nat. Neurosci. 8: 1481–1489.
22. Balleine, B.W. & S.B. Ostlund. 2007. Still at the choice-
point: action selection and initiation in instrumental conditioning.
Ann. N.Y. Acad. Sci. 1104: 147–171.
23. de Wit, S. & A. Dickinson. 2009. Associative theories of
goal-directed behavior: a case for animal–human translational
models. Psychol. Res. 73: 463–476.
24. Killcross, S. & P. Blundell. 2002. Associative representations
of emotionally significant outcomes. In Emotional Cogni-
tion: From Brain to Behavior. S.C.M.M. Oaksford, Ed.: 35–
73. John Benjamins Publishing Company. Amsterdam, the
Netherlands.
25. Konorski, J. 1967. Integrative Activity of the Brain. University
of Chicago Press. Chicago.
26. Balleine, B.W. & S. Killcross. 2006. Parallel incentive processing:
an integrated view of amygdala function. Trends
Neurosci. 29: 272–279.
27. Vlaev, I. et al. 2011. Does the brain calculate value? Trends
Cogn. Sci. 15: 546–554.
28. Hursh, S.R. & A. Silberberg. 2008. Economic demand and
essential value. Psychol. Rev. 115: 186–198.
29. Lingawi, N.W. & B.W. Balleine. 2012. Amygdala central
nucleus interacts with dorsolateral striatum to regulate the
acquisition of habits. J. Neurosci. 32: 1073–1081.
30. Hommel, B. 2013. Ideomotor action control: on the perceptual
grounding of voluntary actions and agents. In Action
Science: Foundations of an Emerging Discipline. W. Prinz,
M. Beisert & A. Herwig, Eds.: 113–136. MIT Press.
Cambridge, MA.
31. Corbit, L.H., P.H. Janak & B.W. Balleine. 2007. General and
outcome-specific forms of Pavlovian-instrumental transfer:
the effect of shifts in motivational state and inactivation
of the ventral tegmental area. Eur. J. Neurosci. 26: 3141–
3149.
32. Dickinson, A. 1985. Actions and habits—the development
of behavioral autonomy. Philos. Trans. R. Soc. Lond. Ser. B
Biol. Sci. 308: 67–78.
33. Olmstead, M.C. et al. 2001. Cocaine seeking by rats is a
goal-directed action. Behav. Neurosci. 115: 394–402.
34. Hutcheson, D.M. et al. 2001. The role of withdrawal in
heroin addiction: enhances reward or promotes avoidance?
Nat. Neurosci. 4: 943–947.
35. Hogarth, L. 2012. Goal-directed and transfer-cue-elicited
drug-seeking are dissociated by pharmacotherapy: Evi-
dence for independent additive controllers. J. Exp. Psychol.
Anim. Behav. Process. 38: 266–278.
36. Hogarth, L. & H.W. Chase. 2011. Parallel goal-directed and
habitual control of human drug seeking: implications for
dependence vulnerability. J. Exp. Psychol. Anim. Behav. Pro-
cess. 37: 261–276.
37. Moeller, S.J. et al. 2009. Enhanced choice for viewing co-
caine pictures in cocaine addiction. Biol. Psychiatry. 66:
169–176.
38. Fergusson, D.M. et al. 2003. Early reactions to cannabis
predict later dependence. Arch. Gen. Psychiatry 60: 1033–1039.
39. de Wit, H., E.H. Uhlenhuth & C.E. Johanson. 1986. Indi-
vidual differences in the reinforcing and subjective effects
of amphetamine and diazepam. Drug Alcohol Depend. 16:
341–360.
40. Scherrer, J.F. et al. 2009. Subjective effects to cannabis are
associated with use, abuse and dependence after adjust-
ing for genetic and environmental influences. Drug Alcohol
Depend. 105: 76–82.
41. Stoops, W.W. et al. 2007. The reinforcing, subject-rated,
performance, and cardiovascular effects of d-amphetamine:
influence of sensation-seeking status. Addict. Behav. 32:
1177–1188.
42. Pomerleau, O. 1995. Individual differences in sensitivity
to nicotine: implications for genetic research on nicotine
dependence. Behav. Genet. 25: 161–177.
43. Miles, F.J., B.J. Everitt & A. Dickinson. 2003. Oral cocaine
seeking by rats: action or habit? Behav. Neurosci. 117: 927–
938.
44. Balleine, B.W. et al. 1995. Motivational control of hetero-
geneous instrumental chains. J. Exp. Psychol. Anim. Behav.
Process. 21: 203–217.
45. Dezfouli, A. & B.W. Balleine. 2012. Habits, action sequences
and reinforcement learning. Eur. J. Neurosci. 35: 1036–1051.
46. Daw, N.D., Y. Niv & P. Dayan. 2005. Uncertainty-based
competition between prefrontal and dorsolateral striatal
systems for behavioral control. Nat. Neurosci. 8: 1704–1711.
47. Dickinson, A. et al. 1995. Motivational control after extended
instrumental training. Anim. Learn. Behav. 23: 197–206.
48. Killcross, S. & E. Coutureau. 2003. Coordination of actions
and habits in the medial prefrontal cortex of rats. Cereb.
Cortex. 13: 400–408.
49. Kosaki, Y. & A. Dickinson. 2010. Choice and contingency
in the development of behavioral autonomy during instru-
mental conditioning. J. Exp. Psychol. Anim. Behav. Process.
36: 334–342.
50. Nordquist, R.E. et al. 2007. Augmented reinforcer value
and accelerated habit formation after repeated amphetamine
treatment. Eur. Neuropsychopharmacol. 17: 532–540.
51. Corbit, L.H., H. Nie & P.H. Janak. 2012. Habitual alcohol seeking:
time course and the contribution of subregions of the dorsal
striatum. Biol. Psychiatry 72: 389–395.
52. Zapata, A., V.L. Minney & T.S. Shippenberg. 2010. Shift
from goal-directed to habitual cocaine seeking after pro-
longed experience in rats. J. Neurosci. 30: 15457–15463.
53. Hogarth, L. et al. 2012. Acute alcohol impairs human goal-
directed action. Biol. Psychol. 90: 154–160.
54. Nelson, A. & S. Killcross. 2006. Amphetamine exposure
enhances habit formation. J. Neurosci. 26: 3805–3812.
55. Schoenbaum, G. & B. Setlow. 2005. Cocaine makes actions
insensitive to outcomes but not extinction: implications for
altered orbitofrontal-amygdalar function. Cereb. Cortex. 15:
1162–1169.
56. Colwill, R.M. & R.A. Rescorla. 1988. Associations between
the discriminative stimulus and the reinforcer in instru-
mental learning. J. Exp. Psychol. Anim. Behav. Process. 14:
155–164.
57. Hogarth, L. et al. 2007. The role of drug expectancy in the
control of human drug seeking. J. Exp. Psychol. Anim. Behav.
Process. 33: 484–496.
58. Holmes, N.M., A.R. Marchand & E. Coutureau. 2010.
Pavlovian to instrumental transfer: a neurobehavioral per-
spective. Neurosci. Biobehav. Rev. 34: 1277–1295.
59. Corbit, L.H. & P.H. Janak. 2007. Ethanol-associated cues
produce general Pavlovian-instrumental transfer. Alcohol.
Clin. Exp. Res. 31: 766–774.
60. Krank, M.D. 2003. Pavlovian conditioning with ethanol:
sign-tracking (autoshaping), conditioned incentive, and
ethanol self-administration. Alcohol. Clin. Exp. Res. 27:
1592–1598.
61. Glasner, S.V., J.B. Overmier & B.W. Balleine. 2005. The
role of Pavlovian cues in alcohol seeking in dependent and
nondependent rats. J. Stud. Alcohol. 66: 53–61.
62. Balleine, B.W., B.K. Leung & S.B. Ostlund. 2011. The or-
bitofrontal cortex, predicted value, and choice. Ann. N. Y.
Acad. Sci. 1239: 43–50.
63. Delamater, A.R. 1995. Outcome-selective effects of inter-
trial reinforcement in Pavlovian appetitive conditioning
with rats. Anim. Learn. Behav. 23: 31–39.
64. Gámez, A.M. & J.M. Rosas. 2005. Transfer of stimulus control
across instrumental responses is attenuated by extinction
in human instrumental conditioning. Int. J. Psychol.
Psychol. Ther. 5: 207–222.
65. Trick, L., L. Hogarth & T. Duka. 2011. Prediction and
uncertainty in human Pavlovian to instrumental transfer.
J. Exp. Psychol. Learn. Mem. Cogn. 37: 757–765.
66. Rescorla, R.A. 1994. Transfer of instrumental control mediated
by a devalued outcome. Anim. Learn. Behav. 22: 27–33.
67. Holland, P.C. 2004. Relations between Pavlovian-
instrumental transfer and reinforcer devaluation. J. Exp.
Psychol. Anim. Behav. Process. 30: 258–258.
68. Colwill, R.M. & R.A. Rescorla. 1990. Effects of reinforcer
devaluation on discriminative control of instrumental be-
havior. J. Exp. Psychol. Anim. Behav. Process. 16: 40–47.
69. Hogarth, L., A. Dickinson & T. Duka. 2010. The associative
basis of cue elicited drug taking in humans. Psychopharma-
cology 208: 337–351.
70. Ferguson, S.G. & S. Shiffman. 2009. The relevance and
treatment of cue-induced cravings in tobacco dependence.
J. Subst. Abuse Treat. 36: 235–243.
71. Hitsman, B. et al. Dissociable effect of acute varenicline
on tonic versus cue-provoked craving in non-treatment
motivated heavy smokers. Drug Alcohol Depend. In press.
72. Dickinson, A. & G.R. Dawson. 1987. Pavlovian processes
in the motivational control of instrumental performance.
Q. J. Exp. Psychol. B 39: 201–213.
73. Mitchell, J.B. & J. Stewart. 1990. Facilitation of sexual be-
haviors in the male rat in the presence of stimuli previously
paired with systemic injections of morphine. Pharmacol.
Biochem. Behav. 35: 367–372.
74. Shiflett, M. 2012. The effects of amphetamine exposure on
outcome-selective Pavlovian-instrumental transfer in rats.
Psychopharmacology 223: 361–370.
75. Wyvell, C.L. & K.C. Berridge. 2000. Intra-accumbens am-
phetamine increases the conditioned incentive salience of
sucrose reward: enhancement of reward “wanting” without
enhanced “liking” or response reinforcement. J. Neurosci.
20: 8122–8130.
76. Wyvell, C.L. & K.C. Berridge. 2001. Incentive sensitiza-
tion by previous amphetamine exposure: increased cue-
triggered “wanting” for sucrose reward. J. Neurosci. 21:
7831–7840.
77. Schwabe, L., A. Dickinson & O.T. Wolf. 2011. Stress, habits
and drug addiction: a psychoneuroendocrinological per-
spective. Exp. Clin. Psychopharmacol. 19: 53–63.
78. Hogarth, L. 2011. The role of impulsivity in the aetiology
of drug dependence: reward sensitivity versus automaticity.
Psychopharmacology 215: 567–580.
79. Hogarth, L., H.W. Chase & K. Baess. 2012. Impaired goal-
directed behavioral control in human impulsivity. Q. J. Exp.
Psychol. 65: 305–316.
80. de Wit, S. et al. 2006. Dorsomedial prefrontal cortex resolves
response conflict in rats. J. Neurosci. 26: 5224–5229.
81. Ostlund, S.B., N.T. Maidment & B.W. Balleine. 2010.
Alcohol-paired contextual cues produce an immediate and
selective loss of goal-directed action in rats. Front. Integr.
Neurosci. 4: 19.
82. Gillan, C.M. et al. 2011. Disruption in the balance between
goal-directed behavior and habit learning in obsessive-
compulsive disorder. Am. J. Psychiatry 168: 718–726.
83. Valentin, V., A. Dickinson & J.P. O’Doherty. 2007. Deter-
mining the neural substrates of goal-directed learning in
the human brain. J. Neurosci. 27: 4019–4026.
84. Klossek, U.M., J. Russell & A. Dickinson. 2008. The control
of instrumental action following outcome devaluation in
young children aged between 1 and 4 years. J. Exp. Psychol.
Gen. 137: 39–51.
85. Haddon, J.E. et al. 2010. Impaired conditional task perfor-
mance in a high schizotypy population: relation to cognitive
deficits. Q. J. Exp. Psychol. 64: 1–9.
86. Barch, D.M. & A. Ceaser. 2012. Cognition in schizophrenia:
core psychological and neural mechanisms. Trends Cogn.
Sci. 16: 27–34.
87. Balleine, B.W. & A. Dickinson. 1998. Goal-directed in-
strumental action: contingency and incentive learning
and their cortical substrates. Neuropharmacology 37: 407–419.
88. Corbit, L.H. & B.W. Balleine. 2003. The role of prelimbic
cortex in instrumental conditioning. Behav. Brain Res. 146:
145–157.
89. Ostlund, S.B. & B.W. Balleine. 2005. Lesions of medial pre-
frontal cortex disrupt the acquisition but not the expression
of goal-directed learning. J. Neurosci. 25: 7763–7770.
90. Corbit, L.H., J.L. Muir & B.W. Balleine. 2003. Lesions of
mediodorsal thalamus and anterior thalamic nuclei pro-
duce dissociable effects on instrumental conditioning in
rats. Eur. J. Neurosci. 18: 1286–1294.
91. Ostlund, S.B. & B.W. Balleine. 2008. Differential involve-
ment of the basolateral amygdala and mediodorsal thala-
mus in instrumental action selection. J. Neurosci. 28: 4398–
4405.
92. Yin, H.H., B.J. Knowlton & B.W. Balleine. 2005. Blockade
of NMDA receptors in the dorsomedial striatum prevents
action-outcome learning in instrumental conditioning. Eur.
J. Neurosci. 22: 505–512.
93. Corbit, L.H. & P.H. Janak. 2010. Posterior dorsomedial
striatum is critical for both selective instrumental and
Pavlovian reward learning. Eur. J. Neurosci. 31: 1312–1321.
94. Yin, H.H. et al. 2005. The role of the dorsomedial striatum
in instrumental conditioning. Eur. J. Neurosci. 22: 513–523.
95. Balleine, B.W., A.S. Killcross & A. Dickinson. 2003. The
effect of lesions of the basolateral amygdala on instrumental
conditioning. J. Neurosci. 23: 666–675.
96. Corbit, L.H. & B.W. Balleine. 2005. Double dissociation of
basolateral and central amygdala lesions on the general and
outcome-specific forms of pavlovian-instrumental transfer.
J. Neurosci. 25: 962–970.
97. Johnson, A.W., M. Gallagher & P.C. Holland. 2009. The ba-
solateral amygdala is critical to the expression of Pavlovian
and instrumental outcome-specific reinforcer devaluation
effects. J. Neurosci. 29: 696–704.
98. Yin, H.H., B.J. Knowlton & B.W. Balleine. 2004. Lesions
of dorsolateral striatum preserve outcome expectancy but
disrupt habit formation in instrumental learning. Eur. J.
Neurosci. 19: 181–189.
99. Yin, H.H., B.J. Knowlton & B.W. Balleine. 2006. Inactiva-
tion of dorsolateral striatum enhances sensitivity to changes
in the action-outcome contingency in instrumental condi-
tioning. Behav. Brain Res. 166: 189–196.
100. Ostlund, S.B. & B.W. Balleine. 2007. Orbitofrontal cortex
mediates outcome encoding in pavlovian but not instru-
mental conditioning. J. Neurosci. 27: 4819–4825.
101. Corbit, L.H., J.L. Muir & B.W. Balleine. 2001. The role of the
nucleus accumbens in instrumental conditioning: evidence
of a functional dissociation between accumbens core and
shell. J. Neurosci. 21: 3251–3260.
102. Corbit, L. & B. Balleine. 2011. The general and outcome-
specific forms of Pavlovian-Instrumental transfer are dif-
ferentially mediated by the nucleus accumbens core and
shell. J. Neurosci. 31: 11786–11794.
103. Blundell, P., G. Hall & S. Killcross. 2001. Lesions of the
basolateral amygdala disrupt selective aspects of reinforcer
representation in rats. J. Neurosci. 21: 9018–9026.
104. Holland, P.C. & M. Gallagher. 2003. Double dissociation of
the effects of lesions of basolateral and central amygdala on
conditioned stimulus-potentiated feeding and Pavlovian-
instrumental transfer. Eur. J. Neurosci. 17: 1680–1694.
105. Shiflett, M.W. & B.W. Balleine. 2010. At the limbic–motor
interface: disconnection of basolateral amygdala from nu-
cleus accumbens core and shell reveals dissociable compo-
nents of incentive motivation. Eur. J. Neurosci. 32: 1735–
1743.
106. Corbit, L.H. & P.H. Janak. 2007. Inactivation of the lateral
but not medial dorsal striatum eliminates the excitatory
impact of Pavlovian stimuli on instrumental responding.
J. Neurosci. 27: 13977–13981.
107. Murschall, A. & W. Hauber. 2006. Inactivation of the ventral
tegmental area abolished the general excitatory influence of
Pavlovian cues on instrumental performance. Learn. Mem.
13: 123–126.
108. Hall, J. et al. 2001. Involvement of the central nucleus of
the amygdala and nucleus accumbens core in mediating Pavlovian
influences on instrumental behavior. Eur. J. Neurosci. 13:
1984–1992.
109. Chase, H.W. et al. 2008. The role of the orbitofrontal cortex
in human discrimination learning. Neuropsychologia 46:
1326–1337.
110. Wilson, S.J., M.A. Sayette & J.A. Fiez. 2004. Prefrontal re-
sponses to drug cues: a neurocognitive analysis. Nat. Neu-
rosci. 7: 211–214.
111. Farré, M. & J. Camí. 1991. Pharmacokinetic considerations
in abuse liability evaluation. Addiction 86: 1601–1606.
112. Wickens, J.R. et al. 2007. Dopaminergic mechanisms in
actions and habits. J. Neurosci. 27: 8181–8183.
113. Jedynak, J.P. et al. 2007. Methamphetamine-induced struc-
tural plasticity in the dorsal striatum. Eur. J. Neurosci. 25:
847–853.
114. Cunha-Oliveira, T., A.C. Rego & C.R. Oliveira. 2008. Cellular
and molecular mechanisms involved in the neurotoxicity
of opioid and psychostimulant drugs. Brain Res. Rev.
58: 192–208.
115. Balleine, B.W. & J.P. O’Doherty. 2010. Human and rodent
homologies in action control: corticostriatal determinants
of goal-directed and habitual action. Neuropsychopharma-
cology 35: 48–69.
116. Jones, B.T. & R.M. Young. 2011. Changing alcohol ex-
pectancies and self-efficacy expectations. In Handbook of
Motivational Counseling: Goal-Based Approaches to Assess-
ment and Intervention with Addiction and Other Problems.
W.M. Cox & E. Klinger, Eds.: 489–504. John Wiley & Sons,
Ltd.
117. Bouton, M.E. 2002. Context, ambiguity, and unlearning:
sources of relapse after behavioral extinction. Biol. Psychiatry
52: 976–986.
118. Bickel, W.K. et al. 2011. Remember the future: working
memory training decreases delay discounting among stim-
ulant addicts. Biol. Psychiatry 69: 260–265.
119. Quick, S.L. et al. 2011. Loss of alternative non-drug rein-
forcement induces relapse of cocaine-seeking in rats: role
of dopamine D1 receptors. Neuropsychopharmacology 36:
1015–1020.
Ann. N.Y. Acad. Sci. 1282 (2013) 12–24 © 2012 New York Academy of Sciences.