Associative learning mechanisms underpinning the transition from recreational
drug use to addiction
Lee Hogarth (a), Bernard W Balleine (b), Laura H Corbit (c),
Simon Killcross (a)
a. School of Psychology, University of New South Wales,
Sydney, NSW, 2052, Australia.
b. Brain & Mind Research Institute, University of Sydney, 100 Mallet St,
Camperdown, Sydney, NSW, 2050, Australia
c. Laura H. Corbit, School of Psychology, Brennan MacCallum Building,
University of Sydney, Sydney, NSW, 2006, Australia.
Author for correspondence:
Lee Hogarth
School of Psychology
University of New South Wales
Sydney
NSW 2052
Australia.
Phone: +61 (0)2 93853038
Fax: +61 (0) 9385 3641
Email: l.hogarth@unsw.edu.au
Short title: Abnormal learning underpinning dependence
Acknowledgements: This work was supported by MRC grant #G0701456 to LH,
NHMRC grant #633268 to BB&SK, and NHMRC grant #568872 to SK&BB
Abstract
Learning theory proposes that drug-seeking is a synthesis of multiple controllers.
Whereas goal-directed drug-seeking is determined by the anticipated incentive value of
the drug, habitual drug-seeking is elicited by stimuli which have formed a direct
association with the response. Moreover, drug-paired stimuli can transfer control over
separately trained drug-seeking responses by retrieving an expectation of the drug’s
identity (specific transfer) or incentive value (general transfer). This review covers
outcome devaluation and transfer of stimulus-control procedures in humans and animals
which isolate the differential governance of drug-seeking by these four controllers
following various degrees of contingent and noncontingent drug exposure. The neural
mechanisms underpinning these four controllers are also reviewed. These studies suggest
that although initial drug use is goal-directed, chronic drug exposure confers a
progressive loss of control over action selection by specific outcome representations
(impaired outcome-devaluation and specific transfer), and a concomitant increase in
control over action selection by antecedent stimuli (enhanced habit and general transfer).
The prefrontal cortex and mediodorsal thalamus may play a role in this drug-induced
transition to behavioural autonomy.
Key words: addiction; learning theory; goal; cue-reactivity; habit.
Introduction
A recurrent theme in addiction theory is that drug-seeking has multiple determinants.
Wikler1 argued that the euphoric effects of the drug maintained initial drug use whereas
addiction itself stemmed from the emergence of a withdrawal syndrome. Tolerance2 and
opponent-process theories3 elaborated this notion of a shift from positive to negative
reinforcement. Subsequently, however, the importance of negative reinforcement was
questioned by the observation that drug self-administration engages dopamine, the brain
substrate of reward,4 and by the lawful relationship between the frequency of drug-seeking and the magnitude of drug reward.5 But by denying the importance of negative
reinforcement (but see 6), positive reinforcement theorists were hard pressed to explain the
transition between recreational drug use and addiction. Tiffany7 answered this question
from a cognitive viewpoint, arguing that drug-seeking may be mediated by desire, or
elicited automatically by drug cues, and that the latter controller predominates in addiction.
Robinson and Berridge8 made a similar argument from a behavioural neuroscience
perspective, stating that drug-seeking may be driven by hedonic anticipation of the drug
(liking), or autonomous cue-locked conditioned behaviour (wanting), thus accounting for
addicts’ paradoxical continuation of drug use despite their declared desire to quit.
Contemporary addiction theories have elaborated these themes. The behavioural
economists have garnered evidence that human drug dependence is a choice recruited by
the reinforcement value of the drug,9 but is also accompanied by an inability to utilise
knowledge of abstract future consequences in decision making.10 Similarly, animal
learning theorists have substantiated evidence that drug self-administration is a function
of the reinforcement value of the drug11, 12 but also undergoes a transition to automatic
control by drug-paired stimuli.13 Finally, cognitive neuroscientists have shown that drug
liking is associated with drug-induced dopamine activation14 and that clinically diagnosed
addiction is accompanied by hypofrontality and executive dysfunction.15 The common
theme in all of these frameworks, therefore, is that initial drug use is mediated by the
drug acting as a positive reinforcer, whereas the transition to clinical dependence is
linked to a loss of intentional regulation and concomitant emergence of automatic control
over drug-seeking.
Learning theory and addiction
The current review aims to detail this transitional theory of addiction by inspecting
human and animal learning research which has tested the differential governance of
behaviour at various stages of drug exposure. The ideas developed here were first
introduced by Norman White who drew a link between the role of the striatum in
memory and addictive behaviour16, 17. The formal associative learning account was then
outlined by Anthony Dickinson in symposium proceedings, drawing on empirical work with
natural rewards18. These ideas were then translated to behavioural neuroscience research
with addictive drugs in collaboration with Trevor Robbins and Barry Everitt.19-21
Simultaneously, behavioural neuroscience research continued with natural rewards which
clarified the associative mechanisms outlined here22-24, and which are depicted
schematically in Figure 1. According to this perspective, experience of the drug outcome
is encoded separately in terms of its specific sensory correlates or perceptual identity (Oi)
and its consummatory, post-ingestive or incentive value (Ov), and these two
representations of the drug can differentially enter into associations.25, 26 As a
consequence, the agent (person or animal) acquires four forms of associative knowledge.
(1) Goal-directed learning. The agent acquires knowledge of the instrumental
contingency between the drug-seeking response and the drug’s identity and value (R-Oiv).
Moreover, the representation of the drug’s value is updated by internal states, such as
deprivation or satiety, which predict the experienced value of the drug. Consequently,
retrieval of the representation of the drug and its current value (Oiv) determines the
propensity to select the associated drug-seeking responses from amongst competing
outcome choices based on a comparison of their relative values.27 Thus, a higher value
drug produces a greater proportion of intentional choice of that outcome from amongst
alternative rewards.28
(2) Habit learning. The agent forms an association between external stimuli (S) and the
drug-seeking response (R) in proportion to the contingent co-occurrence of these two
events prior to drug reinforcement and the reinforcement value of that outcome (Ov).29
This S-R/reinforcement process enables the drug stimulus, when reencountered, to elicit
the drug-seeking response directly without retrieving any representation of the drug
outcome. Such habitual drug-seeking accords with the clinical characterisation of
addiction as reflecting a loss of intentional regulation of behaviour.
(3) Specific transfer. External stimuli also acquire an association with the drug outcome
in accordance with the predictive contingency between these events, enabling stimuli to
retrieve a representation of the drug’s identity and/or value. Retrieval of the outcome’s
identity (S-Oi) can, in turn, elicit separately trained instrumental responses that are
associated with that same outcome via a bidirectional O-R, or ideomotor, connection (S-Oi-R).30
(4) General transfer. By contrast, retrieval of the outcome’s affective value (S-Ov) elicits
a motivational state akin to the drug itself, which exerts a general excitatory effect on
prevailing responses controlled by the other associations ((S-Ov)-R).31
The claim made in the current paper is that these various forms of behavioural control
interact to determine the propensity to engage in drug-seeking at any given moment. More
specifically, we claim that continuing drug exposure impairs retrieval or utilisation of the
representation of specific outcome identities (Oi), thus shifting control of action away from
knowledge of specific outcomes (R-Oiv and S-Oi-R) and towards the more general control of
actions by antecedent stimuli (S-R and (S-Ov)-R). We now turn to empirical evidence for this
psychological account of addiction.
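
The four forms of associative knowledge, and the claim that drug exposure impairs use of the outcome's identity (Oi), can be summarised schematically. The following is only an illustrative sketch: the names `Controller`, `CONTROLLERS` and `spared_controllers` are hypothetical, and the all-or-none treatment of the impairment is a simplifying assumption rather than part of the account itself, which describes a gradual shift in the balance between controllers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Controller:
    name: str             # label used in the text
    association: str      # associative structure, in the text's notation
    needs_identity: bool  # does it rely on the outcome's identity (Oi)?


# The four controllers listed above, in the order (1) to (4).
CONTROLLERS = (
    Controller("goal-directed learning", "R-Oiv", True),
    Controller("habit learning", "S-R", False),
    Controller("specific transfer", "S-Oi-R", True),
    Controller("general transfer", "(S-Ov)-R", False),
)


def spared_controllers(identity_impaired: bool) -> list[str]:
    """Names of controllers that do not depend on an impaired Oi representation.

    Simplifying assumption: impairment is treated as all-or-none.
    """
    return [
        c.name
        for c in CONTROLLERS
        if not (identity_impaired and c.needs_identity)
    ]
```

Under this simplification, `spared_controllers(True)` leaves only habit learning and general transfer, the two controllers that the account expects to dominate after extended drug exposure.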
[Insert Figure 1 here]
1. Goal-directed drug-seeking
The outcome devaluation procedure provides the principal method for identifying goal-directed control.32 A version of this procedure is presented in Table 1. In this procedure,
rats learn that two different lever press responses (R) produce different rewarding
outcomes (O). For example, one lever may produce drug reward such as alcohol or
cocaine (O1) whereas the other lever produces an alternative natural reward such as
sucrose (O2). The drug is then devalued by pairing it with lithium chloride-induced
gastrointestinal sickness, by specific satiety, or by a related manipulation, such that the value of
the drug is diminished. The critical test then comes when the animal is again given the
opportunity to press the two levers in an extinction test where the responses no longer
produce their respective rewards. The question at stake is whether the animal will reduce
responding for the drug outcome (R1<R2). Because the outcomes are not presented in the
extinction test, any such devaluation effect cannot be attributed to S-R/reinforcement
(habit) learning, that is, by experience of the drug outcome modulating the capacity of
contextual cues to elicit the drug-seeking response. Furthermore, because the procedure
contains no stimuli that differentially signal the two outcomes, a devaluation effect
cannot be attributed to a change in capacity of such cues to elicit responding for their
associated outcomes (S-Oiv-R). Instead, any reduction in drug choice in the extinction test
must be mediated by animals’ integration of knowledge of the R-Oiv contingencies
acquired during instrumental training, with knowledge of the current low value of the drug
outcome (Ov) acquired during the devaluation treatment, which together determine the
propensity to select that response. In other words, a devaluation effect in the extinction
test demonstrates that drug-seeking is goal-directed in that it is determined by the
anticipated reward value of the drug.
[Insert Table 1 here]
Two studies illustrate the outcome-devaluation procedure in demonstrating goal-directed
control of drug-seeking. In the study by Olmstead et al.33, rats were trained on a seeking-taking chain in which they had to press a seeking lever to gain access to a taking lever,
which in turn delivered intravenous cocaine. To test whether the seeking response was
goal-directed, the taking lever was extinguished by terminating cocaine delivery. The
seeking lever was not present during this extinction training. The fact that this extinction
training led to an immediate reduction in rats’ performance of the seeking response in
extinction indicated that this response was mediated by knowledge of its consequences,
i.e. the low current value of the taking lever.
Hutcheson et al.34 employed a similar design. Training on a seeking-taking chain for
heroin was followed by a revaluation treatment in which self-administration via the
taking response was experienced in a withdrawal state to establish the high value of
heroin in this state. Rats were then again given access to the seeking lever in extinction,
and the finding that withdrawal produced an increase in performance of the seeking
response indicated that it was goal-directed in that it was mediated by knowledge of the
current high value of the heroin outcome.
The outcome-devaluation procedure has also been modified for humans.35, 36 In the
concurrent training stage of these experiments, mainly student smokers learned two key
press responses, where R1 produced tobacco points and R2 produced chocolate points.
Tobacco was then devalued by smoking to satiety, by evaluation of smoking health
warnings (e.g. ‘smoking causes cancer’),36 or by administration of nicotine nasal spray.35
The finding that tobacco choice in the extinction test was sensitive to these devaluation
treatments (R1<R2) indicated that it was goal-directed in being mediated by knowledge
of the current value of the drug outcome.
A key observation replicated in these human experiments was that individual variation in
level of tobacco dependence was associated with a preferential selection of the tobacco
over the chocolate response. Similar preferences have been established in animals11 and
human cocaine users37 and confirm the economic theorists’ main contention that drug
dependence reflects individual differences in the reinforcement value of the drug9. The
outcome-devaluation procedure qualifies this notion by distinguishing the contribution of
goal-directed (R-Oiv) and habitual (S-R) drug-seeking to this drug preference. We know
that choice of the drug-seeking response was goal-directed, as it was sensitive to
devaluation in the extinction test. Any residual contribution of S-R learning to this drug
preference would be marked by variation in sensitivity to devaluation treatment in the
extinction test. As there was no systematic variation across levels of nicotine dependence
in sensitivity to devaluation, it may be concluded that preferential tobacco choice was
mediated entirely by valuation of the drug as a goal, and not by differential S-R
formation. The conclusion, therefore, is that drug-seeking within these parameters is
goal-directed, and that level of dependence, at least at this early stage of drug exposure,
reflects the valuation of the drug as a specific goal (see 38-42).
2. Habitual drug-seeking
As noted, the outcome devaluation procedure can evaluate the habitual status of
instrumental performance32 (see Table 1). Whereas sensitivity of drug-seeking to
devaluation in the extinction test (R1<R2) signifies goal-directed control, insensitivity to
devaluation in the extinction test (R1=R2) demonstrates that retrieval of the current value
of the drug plays no role in drug-seeking. Instead, drug-seeking is deemed to have
become habitual, being elicited by contextual stimuli which have acquired a direct S-R
association with drug-seeking during instrumental training, without retrieving a
representation of the current value of the drug.
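
The decision rule just described (R1<R2 indicating goal-directed control, R1=R2 indicating habitual control) can be written out as a short sketch. This is an illustration under stated assumptions, not the analysis used in the cited studies: the function names and the `tolerance` parameter are hypothetical, responding is summarised as simple response counts from the extinction test, and real studies evaluate the devaluation effect statistically against a non-devalued control rather than with a fixed threshold.

```python
def devaluation_index(r1_devalued: float, r2_valued: float) -> float:
    """Share of extinction-test responding directed at the devalued response.

    Returns R1 / (R1 + R2): 0.5 corresponds to R1 = R2, values below 0.5
    correspond to R1 < R2.
    """
    total = r1_devalued + r2_valued
    if total <= 0:
        raise ValueError("no responses recorded in the extinction test")
    return r1_devalued / total


def classify_control(r1_devalued: float, r2_valued: float,
                     tolerance: float = 0.05) -> str:
    """Label the pattern of responding using a hypothetical tolerance band."""
    index = devaluation_index(r1_devalued, r2_valued)
    if index < 0.5 - tolerance:
        return "goal-directed"   # R1 < R2: sensitive to devaluation
    if abs(index - 0.5) <= tolerance:
        return "habitual"        # R1 = R2: insensitive to devaluation
    return "unclassified"        # R1 > R2: not covered by the rule
```

For example, 10 responses on the devalued lever against 30 on the valued lever gives an index of 0.25 and the label "goal-directed", whereas 20 against 20 gives 0.5 and the label "habitual".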
Two studies illustrate the use of the outcome-devaluation procedure to demonstrate the
habitual status of drug-seeking. In the first study, Dickinson et al.13 trained rats to acquire
two instrumental responses, one for alcohol and one for food pellets, before one of these
outcomes was devalued by pairing it with lithium chloride-induced sickness. When the
rats were again given the opportunity to respond for these outcomes in extinction, it was
found that performance of the food-seeking response was reduced by the devaluation
treatment, indicating that food-seeking was goal-directed. By contrast, performance of
the alcohol-seeking response was insensitive to devaluation, suggesting that alcohol-seeking had become an S-R habit. A second study used a similar design to confirm that
cocaine-seeking was likewise more prone to habitual control than natural reward-seeking.43
A question arises as to why habitual drug-seeking was established by these two
procedures13, 43, whereas goal-directed drug-seeking was found in the earlier designs.33-36
In explaining these divergent results, one might appeal to a number of variables that have
been demonstrated to modulate the balance between goal-directed and habitual control,
including position of the response within an instrumental sequence or chain,44-46 amount
of training,47, 48 number of available responses49 and/or reinforcement value of the
outcome.50 The important point made by this literature is that goal-directed and habitual
actions exist in a dynamic balance that can be biased in one direction or the other by
conditions of training or testing that favour acquisition/expression of the R-O versus S-R
association. Our basic argument is that within this complex system, drugs exert a constant
pressure in favour of the S-R association by impairing retrieval or utilisation of the
specific identity of outcomes.
Corbit et al.51 have recently mapped the progressive transition to habitual control of drug-seeking with extended training. In this study, rats acquired a self-administration response
for alcohol, before alcohol was devalued by ad libitum consumption (satiety). Alcohol-seeking was then tested in extinction to evaluate goal-directed control of this behaviour.
The important result was that following two weeks of self-administration training, the
response remained sensitive to devaluation, but by eight weeks of training, the response
was insensitive to devaluation, suggesting a transition from goal-directed to habitual
control had occurred with training (cocaine-seeking shows a similar transition to habit
with extended training52). An important additional finding of this study was that
noncontingent administration of alcohol was sufficient to accelerate habitual control over
natural reward-seeking responses. Thus, not only do drug self-administration responses
become habitual, but noncontingent drug exposure renders contemporaneously acquired
naturally rewarded instrumental actions habitual.
In humans, a comparable effect of noncontingent alcohol exposure on habitual control
has recently been demonstrated.53 Participants were administered 0.4 g/kg of alcohol
or placebo before instrumental training with R1 and R2 for chocolate and water
respectively. Chocolate was then devalued by ad libitum consumption before choice
between R1 and R2 was tested in extinction. The finding that alcohol attenuated goal-directed control over chocolate choice in the extinction test supports the translational
relevance of animal models, and suggests that accelerated habit learning can be
demonstrated with acute drug dosing.
A key study by Nelson and Killcross54 has revealed that non-contingent drug exposure
enhanced habit formation via an effect at instrumental training rather than at the
extinction test. They pre-exposed rats to amphetamine for 7 days before a 7-day injection-free period. Instrumental training for sucrose was then undertaken before this outcome
was devalued by specific satiety or lithium chloride-induced sickness. The results from
the extinction test indicated that both devaluation treatments failed to modify sucrose-seeking in the amphetamine-exposed rats, suggesting this response had become habitual,
whereas placebo rats showed goal-directed control (see also50, 55). Importantly, chronic
amphetamine only accelerated habit formation if administered prior to instrumental
training, but not if administered after training. Consistent with this, all the
aforementioned studies which have shown effects of contingent13, 43, 51 and
noncontingent51 drug exposure on habit learning have undertaken drug administration
contemporaneously with instrumental training. The implication, therefore, is that during
instrumental training the ability of the outcome representation to enter into new learning
may be impaired by drug exposure, favouring acquisition of the S-R over the R-Oiv
contingencies, but once R-Oiv learning is acquired drug-free, deployment of this
knowledge at test is not impaired by drug exposure.
In reconciling the aforementioned studies, one can propose a transitional model of
addiction wherein initial drug-seeking is goal-directed,33-36 but following extended
training comes under habitual S-R control,13, 43 and contemporaneously acquired natural
reward-seeking also comes under habitual control.51, 53, 54 Ultimately, the agent’s
behavioural repertoire comes to be dominated by S-R habits.
3. Specific transfer of stimulus control over drug-seeking
The Pavlovian to instrumental transfer procedure is the principal method for
demonstrating control over responding by stimuli retrieving a representation of the
specific identity of the outcome (Oi) (e.g.56; see Table 2). In this design, rats are given
Pavlovian training in which one stimulus (S1) signals drug availability (O1), whereas a
second stimulus (S2) signals the availability of an alternative reward, for example sucrose
(O2). Separate instrumental training is then undertaken wherein rats learn that one lever
produces the drug (O1) whereas the other lever produces sucrose (O2). Finally, in the
Pavlovian to instrumental transfer test, each stimulus is presented for the first time while
the two instrumental responses are available in extinction. The question at stake is
whether each stimulus will enhance performance of the response with which it shares the
same outcome (i.e. S1:R1>R2, S2:R1<R2). Such an outcome-specific transfer effect
demonstrates that each stimulus retrieved a representation of its associated outcome,
which in turn retrieved the response that was associated with that outcome (S-Oi-R). The
effect cannot be attributed to the formation of an S-R association because the stimuli and
the responses were never contingently reinforced during training, and furthermore,
because the transfer test was conducted in extinction, so no S-R association can form
across that period either.
[Insert Table 2 here]
There is currently only one demonstration of outcome-specific transfer of stimulus
control over drug-seeking per se57 (although there are many demonstrations in natural
reward learning58). In this study, mainly student smokers first learned that two arbitrary
stimuli (S1 and S2) predicted tobacco points or money, respectively, before learning that
two responses (R1 and R2) earned tobacco points and money respectively. In the transfer
test, the two stimuli were found to selectively enhance performance of the response that
had earned the same outcome. Thus, each stimulus must have retrieved a representation
of its associated outcome (points) which in turn elicited the response that had produced
that outcome (S-Oi-R).
4. General transfer of stimulus control over drug-seeking
By contrast, in a related animal procedure, Corbit and Janak59 paired S1 and S2 with
ethanol or sucrose, respectively, and then trained R1 and R2 with these same outcomes,
respectively. The results showed that the ethanol stimulus enhanced the rate of both R1
and R2 equally above a no-stimulus baseline, indicating that this stimulus exerted a
general excitatory effect on instrumental reward-seeking by retrieving the value (S-Ov)
rather than identity (S-Oi) of the outcome. By contrast, the sucrose stimulus produced a
specific transfer effect, selectively enhancing instrumental responding for sucrose over
ethanol, indicating that it had retrieved the outcome’s identity (S-Oi). These data are
consistent with the view that drug-associated cues favour general facilitatory effects on
appetitive instrumental responses compared to natural reward-paired cues (see also60, 61).
The divergent results of these human and animal transfer studies may be resolved by
appealing to Konorski’s view25 that outcomes are encoded separately in terms of their
perceptual identity (sensory correlates) and consummatory or incentive value.26 On this
view, the tobacco points outcome utilised by Hogarth et al.57 was largely perceptual and
minimally consummatory, and so the stimulus paired with this outcome favoured a
specific transfer effect which relied on the retrieval of this outcome’s perceptual identity
(S-Oi-R). By contrast, the consummatory ethanol outcome employed by Corbit and
Janak59 possessed a substantial pharmacological/consummatory effect, and so the
stimulus paired with this event favoured a general motivational enhancement based upon
retrieval of the outcome value ((S-Ov)-R).
Other studies substantiate this characterisation of the specific and general forms of
stimulus control.62 First, the magnitude of the specific transfer effect is determined by the
reliability of the S-O contingency in training,63-65 but is insensitive to outcome
devaluation.31, 66-68 Importantly, specific transfer effects by drug cues on drug-seeking in
humans are similarly insensitive to devaluation achieved by drug satiety, health
warnings36, 69 and pharmacotherapy.35 Moreover, the finding that drug cue effects on
subjective craving70, 71 and drug-taking69 are similarly autonomous of devaluation by
satiety and pharmacotherapy supports the validity of specific (S-Oi-R) transfer effects in
addiction. By contrast, general transfer effects are modulated by devaluation of the
outcome,31, 72 and cross over to other reinforcers of the same hedonic category.73 Thus,
general transfer effects are deemed to be mediated by the stimulus retrieving a
representation of the current value (Ov) but not identity (Oi) of the outcome, and as a
consequence, the effect is sensitive to changes in motivational state but is not response
selective ((S-Ov)-R).
Not only does contingent drug exposure cause drug cues to favour general over specific
transfer59, but noncontingent drug exposure may also cause natural reward cues to
undergo this same transition. In a recent study, Shiflett et al.74 found that noncontingent
exposure to chronic amphetamine administered following Pavlovian and instrumental
training, that is, prior to the transfer test, abolished the specific transfer effect and
enhanced the general transfer effect. Specifically, rats received Pavlovian training in
which S1 predicted chocolate and S2 predicted grain. Instrumental training was then
undertaken in which two responses, R1 and R2, earned these same outcomes,
respectively. Then, half of the rats were given 7 days of amphetamine administration and
the remainder placebo (akin to 54). Finally, in the transfer test the two stimuli were tested
for a specific transfer effect in which each stimulus selectively enhanced responding for
the same outcome, or a general transfer effect in which each stimulus enhanced
responding for both outcomes equally above a pre-stimulus baseline. The remarkable
finding was that amphetamine exposure prior to test abolished the specific transfer effect
and enhanced the general transfer effect. A similar enhancement of the general transfer
effect produced by natural reward cues on reward-seeking has been found following
acute75 and chronic76 amphetamine administered prior to the testing phase, although in
these latter studies no attempt was made to assess specific transfer.
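
The distinction between the two transfer effects assessed in this kind of test can be made concrete with a small scoring sketch. It is offered only as an illustration: the function name, the data layout and the two indices are hypothetical conventions chosen here, not the measures reported in the cited studies. The specific index is the advantage of the response sharing the stimulus's outcome over the other response; the general index is the elevation of the other response above the pre-stimulus baseline, which a purely outcome-specific effect should leave at zero.

```python
def transfer_indices(rates: dict, baseline: dict) -> tuple[float, float]:
    """Return (specific, general) transfer indices.

    rates[s][r]  -- response rate on r ("R1"/"R2") during stimulus s ("S1"/"S2")
    baseline[r]  -- response rate on r in the pre-stimulus period
    S1 and R1 share outcome O1; S2 and R2 share outcome O2.
    """
    # Responding on the response trained with the SAME outcome as the stimulus.
    same = (rates["S1"]["R1"] + rates["S2"]["R2"]) / 2
    # Responding on the response trained with the OTHER outcome.
    different = (rates["S1"]["R2"] + rates["S2"]["R1"]) / 2
    mean_baseline = (baseline["R1"] + baseline["R2"]) / 2

    specific = same - different          # selective enhancement
    general = different - mean_baseline  # non-selective enhancement
    return specific, general


# Hypothetical response rates (responses per minute), for illustration only.
baseline = {"R1": 4, "R2": 4}
selective = {"S1": {"R1": 10, "R2": 4}, "S2": {"R1": 4, "R2": 10}}
nonselective = {"S1": {"R1": 8, "R2": 8}, "S2": {"R1": 8, "R2": 8}}
```

With these made-up numbers, the selective pattern yields a specific index of 6 and a general index of 0, whereas the nonselective pattern yields a specific index of 0 and a general index of 4.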
Overall, these studies favour a transitional model wherein early in training, drug cues
retrieve the drug’s identity and thus produce specific transfer effects.35, 36, 57 Extended
drug exposure, however, causes stimuli to lose contact with the drug’s identity and instead
make contact with the drug’s value, thus causing a transition from specific to general
transfer.59-61 Moreover, contemporaneously acquired natural reward cues also shift
contact from their outcome’s identity to its value, causing a comparable transition from
specific to general transfer.61, 74-76
Synthesis of psychological studies
The transition to behavioural autonomy depicted across the studies reported here is
consistent with a singular impairment in the capacity to retrieve or utilise the specific
identity of outcomes as a consequence of drug exposure. This impairment can explain
why drug-seeking is initially goal-directed (R-Oiv) and under specific stimulus control (S-Oi-R), but then becomes habitual (S-R) and under general stimulus control ((S-Ov)-R).
Whereas the former two controllers require a representation of the specific identity of the
drug, the latter two controllers do not. Moreover, the finding of the same transition in
natural reward-seeking responses acquired contemporaneously with drug exposure
suggests that the impairment in capacity to represent the specific outcomes applies to the
entire class of appetitive rewards (it remains to be seen whether aversive outcomes are
similarly affected). Finally, this account suggests that stress,77 trait impulsivity,78, 79
conflict,80, 81 hypofrontality82-84 and schizophrenia85, 86 may be linked with drug
dependence and relapse because they exacerbate this impairment in capacity to
represent/utilise specific outcome identities.
The claim that a single impairment underpins both the loss of goal-directed control and
the loss of specific transfer is challenged by a dissociation between these two effects.
Specifically, goal-directed control is abolished by chronic amphetamine administered
prior to training, but not administered prior to test, suggesting that chronic amphetamine
impairs acquisition of response-outcome knowledge during instrumental training but does
not directly impair the retrieval/utilisation of outcome identity required for goal-directed
control at test.54 By contrast, chronic amphetamine can abolish specific transfer when
administered prior to test74 suggesting that chronic amphetamine can directly abolish
retrieval/utilisation of outcome identity required for the specific transfer effect.
Identifying a common learning mechanism that operates during both instrumental
training and the transfer test to produce the observed transition to behavioural autonomy
is arguably crucial for isolating the core psychological pathology in addiction.
Neural basis of action control
The following section reviews animal studies which have examined the neural basis of
the four controllers underpinning natural reward and drug-seeking. The purpose of this
section is to identify substrates upon which chronic drug exposure might act to produce
the transition to autonomy depicted above, i.e. reduce goal-directed learning and specific
transfer, and/or enhance habit learning and general transfer.
1. Neural basis of goal-directed action
Lesions of the prelimbic (PL) region of the prefrontal cortex have been shown to produce
precisely the same deficit in goal-directed control as chronic amphetamine.54 That is,
lesions of the PL abolish goal-directed control of natural reward-seeking if they occur
prior to instrumental training48, 87, 88 but not if they occur prior to test.89 Comparable
effects have been found following lesions of the mediodorsal thalamus, which also
abolish acquisition90 but not expression91 of goal-directed action. As the mediodorsal
thalamus provides the major thalamic input to the PL it is believed that these two regions
form a functional circuit. The correspondence between the effects of PL lesions, mediodorsal
thalamic lesions and chronic amphetamine exposure on acquisition of goal-directed control
implicates these two brain regions in mediating the effect of drug exposure on the transition
to behavioural autonomy.
By contrast, the dorsomedial striatum (DMS) has been shown to be essential for both
acquisition92, 93 and expression94 of goal-directed learning. Importantly, post-training
DMS inactivation has been shown to abolish goal-directed control of alcohol-seeking,
suggesting common control mechanisms underpinning both natural and drug reward
goal-directed learning51. Additionally, lesions of the basolateral amygdala (BLA) abolish
sensitivity to outcome-devaluation whether given before95, 96 or after instrumental
training.91, 97 Thus the DMS and BLA, in failing to mimic the selective effect of
amphetamine on loss of goal-directed learning at acquisition, may not play a direct role in
drug sensitization-induced transition to behavioural autonomy.
2. Neural basis of habitual action
Habitual action, by contrast, is mediated by the dorsolateral striatum (DLS) and
infralimbic cortex. As noted earlier, overtraining instrumental contingencies favours a
transition from R-Oiv to S-R control, that is, progressive loss of sensitivity to devaluation
in the extinction test.47 However, rats with lesions to the DLS either pre- or post-training
fail to develop habitual control and remain sensitive to devaluation irrespective of
training, indicating that the DLS is required for the acquisition and expression of habit
learning.51, 98, 99 Importantly, post-training inactivation of the DLS has also been shown to
abolish expression of habitual cocaine-seeking52 and alcohol-seeking51 following
extended training, rendering these behaviours once again goal-directed, and confirming
the common control mechanisms underpinning both natural and drug reward habit
learning. Additionally, pre-training functional disconnection of the DLS and the amygdala
central nucleus (CN) has also recently been shown to abolish habitual control of action and
restore goal-directed control.29 Finally, lesions of the infralimbic cortex made prior to
instrumental training abolish the transition to habit following overtraining.48 Thus,
chronic drug exposure might act on any of these regions to promote the dominance of S-R
habits over behaviour.
3. Neural basis of specific transfer of stimulus control
The ability of stimuli to transfer selective control over separately trained instrumental
responding for the same outcome is abolished by pre-training62 and post-training lesions
of the orbitofrontal cortex (OFC),100 by pre-training lesions and post-training inactivation
of the nucleus accumbens (NAC) shell,101, 102 by pre-training96, 103, 104 and post-training91
lesions of the BLA, and by pre-training functional disconnection between the NAC shell
and the BLA.105 In addition, post-training inactivation of the DMS106 and post-training
lesions of the mediodorsal thalamus91 also eliminate the selective transfer effect. Thus, in
order to impair specific transfer, chronic drug exposure may act on any of these
structures.
4. Neural basis of general transfer of stimulus control
The ability of conditioned stimuli to produce a general excitatory effect on separately
trained instrumental responses is abolished by post-training inactivation of the DLS,106
post-training inactivation of the ventral tegmental area (VTA),31, 107 pre- or post-training
inactivation of the NAC core,102, 108 and pre-training lesions of the amygdala CN.96 Thus,
chronic drug exposure may influence any of these regions to enhance general transfer
effects.
Synthesis of behavioural neuroscience studies
PL and mediodorsal thalamic lesions show a striking correspondence with chronic drug
exposure in producing behavioural autonomy. Specifically, these lesions abolish
acquisition but not expression of goal-directed control,48, 87-91 which matches exactly the
effect of chronic amphetamine54 (see also51 for a related effect with alcohol). However,
lesions of the PL do not modify the specific transfer effect,88 whereas post-training lesions
of the mediodorsal thalamus do,91 matching the impact of chronic amphetamine.74 Thus,
lesions of the mediodorsal thalamus produce precisely the same pattern of effects as chronic
drug exposure. It is also noteworthy that lesions of the OFC abolish specific transfer62, 100
but not outcome-devaluation,100 indicating that damage to this region alone could not produce
the exact pattern seen after chronic drug exposure. Thus, although the effect of chronic drug
exposure on the transition to behavioural autonomy could be produced by a combination of
PL and OFC damage (a view strengthened by addicts’ hypofunction in these regions109, 110),
damage to the mediodorsal thalamus alone could impair both forms of behavioural
control, and so has the advantage of parsimony.
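The parsimony argument in this synthesis can be made concrete with a minimal sketch. It is an illustration only, not an analysis from the reviewed studies. It assumes each manipulation can be reduced to a set of two behavioural deficits, loss of goal-directed control at acquisition and loss of specific transfer, and it ignores differences in lesion timing. The names DEFICITS, CHRONIC_DRUG, matching_lesion_sets and most_parsimonious are hypothetical labels introduced here.

```python
from itertools import combinations

# Assumed, simplified encoding of the deficits described in the synthesis.
# "gd_acquisition": goal-directed control abolished when damage precedes training.
# "specific_transfer": outcome-specific transfer abolished.
DEFICITS = {
    "PL": {"gd_acquisition"},
    "MD_thalamus": {"gd_acquisition", "specific_transfer"},
    "OFC": {"specific_transfer"},
}

# Pattern attributed in the text to chronic amphetamine exposure.
CHRONIC_DRUG = {"gd_acquisition", "specific_transfer"}


def matching_lesion_sets(deficits, target):
    """Return every combination of regions whose pooled deficits equal target."""
    regions = sorted(deficits)
    matches = []
    for size in range(1, len(regions) + 1):
        for combo in combinations(regions, size):
            pooled = set().union(*(deficits[region] for region in combo))
            if pooled == set(target):
                matches.append(combo)
    return matches


def most_parsimonious(matches):
    """Pick the matching combination that involves the fewest regions."""
    return min(matches, key=len)


if __name__ == "__main__":
    matches = matching_lesion_sets(DEFICITS, CHRONIC_DRUG)
    print("Combinations reproducing the drug pattern:", matches)
    print("Fewest regions required:", most_parsimonious(matches))
```

Under these assumptions, combined PL and OFC damage and mediodorsal thalamic damage alone both reproduce the drug pattern, and the single-region account is selected only because it involves fewer structures, which is the sense of parsimony used above.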
Conclusion
To conclude, we propose that initial drug-seeking is goal-directed and tracks the
anticipated value of the drug (R-Oiv) and is responsive to specific transfer (S-Oi-R)
effects by drug cues. Chronic drug exposure, however, impairs the capacity to retrieve or
utilise the specific identity of outcomes, and so produces a transition of behavioural
control from goal-directed learning (R-Oiv) and specific transfer (S-Oi-R) to habit (S-R)
and general transfer ((S-Ov)-R). This transition occurs in relation to both drug outcomes
and natural reward outcomes, resulting in a narrowing of the addict’s behavioural
repertoire to general cue excitation of dominant S-R drug habits, with restricted capacity
for intentional selection of alternative actions. This associative framework captures the
cardinal diagnostic characteristics of heightened drug reinforcement, loss of willed
regulation of drug-seeking and restricted engagement with alternative activities. Future
research needs to clarify precisely how this transition to autonomy is accelerated by drugs
of abuse compared to natural rewards, whether by differences in reward value50,
kinetics111, neuroadaptations112, 113 or neurotoxicity114, and precisely how this alters the
balance between corticostriatal circuits underpinning the four controllers.48, 115
There are several implications concerning treatment strategy. Consistent with Tiffany’s7
insight, we have argued that goal-directed action and habit exist in a dynamic balance,
which may be competitive46 or hierarchical45, but switching between the two modes
apparently can occur within the span of a single response sequence and/or longitudinally
with training. If addiction does reflect a progressive weakening of the role of outcome
retrieval/utilisation in the execution of action sequences, allowing drug habits to
dominate, then treatments such as expectancy challenge116 and extinction training,117
which arguably work by changing the specific representation of the drug, may not provide
the optimal strategy. Instead, treatments that enhance capacity to engage representations
of the future, such as working memory training,118 combined with provision of alternative
reward contingencies119 may be more efficacious in redirecting addicts from their
established habits. Moreover, the finding that capacity for goal-directed control can be
reinstated by manipulations of brain function48, 51, 52 and by uncertainty, which has
definable neural substrates,46 suggests that neuropharmacology should complement such
learning approaches to install new intentional action choices.
References
1.
Wikler, A. 1984. Conditioning factors in opiate addiction and relapse. J. Subst.
Abuse Treat. 1: 279-285.
2.
Siegel, S. 1989. Pharmacological conditioning and drug effects. In Psychoactive
Drugs. Tolerance and Sensitisation. Goudie, A. & M. Emmett-Oglesby, Eds.: 115-180.
Humana Press. Clifton, New Jersey.
3.
Solomon, R. L. & J. D. Corbit. 1974. An opponent-process theory of motivation:
I. Temporal dynamics of affect. Psychol Rev. 81: 119-145.
4.
Stewart, J., H. de Wit & R. Eikelboom. 1984. Role of conditioned and
unconditioned drug effects in self-administration of opiates and stimulants. Psychol Rev.
63: 251-268.
5.
Bickel, W. K., et al. 1991. Behavioral economics of drug self-administration: II.
A unit-price analysis of cigarette smoking. J. Exp. Anal. Behav. 55: 145-154.
6.
Koob, G. F. & M. Le Moal. 2001. Drug addiction, dysregulation of reward, and
allostasis. Neuropsychopharmacology. 24: 97-129.
7.
Tiffany, S. T. 1990. A cognitive model of drug urges and drug-use behaviour:
Role of automatic and nonautomatic processes. Psychol Rev. 97: 147-168.
8.
Robinson, T. E. & K. C. Berridge. 1993. The neural basis of drug craving: an
incentive-sensitization theory of drug addiction. Brain Res. Rev. 18: 247-291.
9.
Heyman, G. M. 2009. Addiction: A disorder of choice. Harvard University Press.
Cambridge Massachusetts.
10.
MacKillop, J., et al. 2011. Delayed reward discounting and addictive behavior: a
meta-analysis. Psychopharmacology. 216: 305-321.
11.
Ahmed, S. H. 2010. Validation crisis in animal models of drug addiction: Beyond
non-disordered drug use toward drug addiction. Neurosci. Biobehav. Rev. 35: 172-184.
12.
Ahmed, S. H. & G. F. Koob. 1998. Transition from moderate to excessive drug
intake: Change in hedonic set point. Science. 282: 298-300.
13.
Dickinson, A., N. Wood & J. W. Smith. 2002. Alcohol seeking by rats: Action or
habit? Q J Exp Psychol B. 55: 331-348.
14.
Drevets, W. C., et al. 2001. Amphetamine-induced dopamine release in human
ventral striatum correlates with euphoria. Biol Psychiatry. 49: 81-96.
15.
Volkow, N. D., J. S. Fowler & G. J. Wang. 2004. The addicted human brain
viewed in the light of imaging studies: brain circuits and treatment strategies.
Neuropharmacology. 47: 3-13.
16.
White, N. M. 1989. A functional hypothesis concerning the striatal matrix and
patches: Mediation of S-R memory and reward. Life Sci. 45: 1943-1957.
17.
White, N. M. 1996. Addictive drugs as reinforcers: multiple partial actions on
memory systems. Addiction. 91: 921-950.
18.
Altman, J., et al. 1996. The biological, social and clinical bases of drug addiction:
commentary and debate. Psychopharmacology. 125: 285-345.
19.
Robbins, T. W. & B. J. Everitt. 1999. Drug addiction: bad habits add up. Nature.
398: 567-570.
20.
Everitt, B. J., A. Dickinson & T. W. Robbins. 2001. The neuropsychological basis
of addictive behaviour. Brain Res. Rev. 36: 129-138.
21.
Everitt, B. J. & T. W. Robbins. 2005. Neural systems of reinforcement for drug
addiction: from actions to habits to compulsion. Nat. Neurosci. 8: 1481-1489.
22.
Balleine, B. W. & S. B. Ostlund. 2007. Still at the choice-point - Action selection
and initiation in instrumental conditioning. In Reward and Decision Making in
Corticobasal Ganglia Networks, Vol. 1104: 147-171.
23.
de Wit, S. & A. Dickinson. 2009. Associative theories of goal-directed behaviour:
a case for animal–human translational models. Psychol Res. 73: 463-476.
24.
Killcross, S. & P. Blundell. 2002. Associative representations of emotionally
significant outcomes. In Emotional cognition: From brain to behaviour. Oaksford, S. C.
M. M., Ed.: 35-73. John Benjamins Publishing Company. Amsterdam, Netherlands.
25.
Konorski, J. 1967. Integrative activity of the brain. University of Chicago Press.
Chicago.
26.
Balleine, B. W. & S. Killcross. 2006. Parallel incentive processing: an integrated
view of amygdala function. Trends Neurosci. 29: 272-279.
27.
Vlaev, I., et al. 2011. Does the brain calculate value? Trends in Cognitive
Sciences. 15: 546-554.
28.
Hursh, S. R. & A. Silberberg. 2008. Economic demand and essential value.
Psychol Rev. 115: 186-198.
29.
Lingawi, N. W. & B. W. Balleine. 2012. Amygdala central nucleus interacts with
dorsolateral striatum to regulate the acquisition of habits. J Neurosci. 32: 1073-1081.
30.
Hommel, B. in press. Ideomotor action control: On the perceptual grounding of
voluntary actions and agents. In Tutorials in action science. Prinz, W., M. Beisert & A.
Herwig, Eds. MIT Press. Cambridge, MA.
31.
Corbit, L. H., P. H. Janak & B. W. Balleine. 2007. General and outcome-specific
forms of Pavlovian-instrumental transfer: the effect of shifts in motivational state and
inactivation of the ventral tegmental area. Eur. J. Neurosci. 26: 3141-3149.
32.
Dickinson, A. 1985. Actions and Habits - the Development of Behavioral
Autonomy. Philosophical Transactions of the Royal Society of London Series B-Biological Sciences. 308: 67-78.
33.
Olmstead, M. C., et al. 2001. Cocaine seeking by rats is a goal-directed action.
Behav. Neurosci. 115: 394-402.
34.
Hutcheson, D. M., et al. 2001. The role of withdrawal in heroin addiction:
enhances reward or promotes avoidance? Nat. Neurosci. 4: 943-947.
35.
Hogarth, L. 2012. Goal-directed and transfer-cue-elicited drug-seeking are
dissociated by pharmacotherapy: Evidence for independent additive controllers. J. Exp.
Psychol.: Anim. Behav. Processes. (in press).
36.
Hogarth, L. & H. W. Chase. 2011. Parallel goal-directed and habitual control of
human drug-seeking: Implications for dependence vulnerability. J. Exp. Psychol.: Anim.
Behav. Processes. 37: 261-276.
37.
Moeller, S. J., et al. 2009. Enhanced choice for viewing cocaine pictures in
cocaine addiction. Biol Psychiatry. 66: 169-176.
38.
Fergusson, D. M., et al. 2003. Early reactions to cannabis predict later
dependence. Archives of General Psychiatry. 60: 1033-1039.
39.
de Wit, H., E. H. Uhlenhuth & C. E. Johanson. 1986. Individual differences in the
reinforcing and subjective effects of amphetamine and diazepam. Drug Alcohol Depend.
16: 341-360.
40.
Scherrer, J. F., et al. 2009. Subjective effects to cannabis are associated with use,
abuse and dependence after adjusting for genetic and environmental influences. Drug
Alcohol Depend. 105: 76-82.
41.
Stoops, W. W., et al. 2007. The reinforcing, subject-rated, performance, and
cardiovascular effects of d-amphetamine: Influence of sensation-seeking status. Addict.
Behav. 32: 1177-1188.
42.
Pomerleau, O. 1995. Individual differences in sensitivity to nicotine: Implications
for genetic research on nicotine dependence. Behav. Genet. 25: 161-177.
43.
Miles, F. J., B. J. Everitt & A. Dickinson. 2003. Oral cocaine seeking by rats:
Action or habit? Behav. Neurosci. 117: 927-938.
44.
Balleine, B. W., et al. 1995. Motivational control of heterogeneous instrumental
chains. J Exp Psychol Anim Behav Process. 21: 203-217.
45.
Dezfouli, A. & B. W. Balleine. 2012. Habits, action sequences and reinforcement
learning. Eur. J. Neurosci. 35: 1036-1051.
46.
Daw, N. D., Y. Niv & P. Dayan. 2005. Uncertainty-based competition between
prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8: 1704-1711.
47.
Dickinson, A., et al. 1995. Motivational control after extended instrumental
training. Anim. Learn. Behav. 23: 197-206.
48.
Killcross, S. & E. Coutureau. 2003. Coordination of actions and habits in the
medial prefrontal cortex of rats. Cereb. Cortex. 13: 400-408.
49.
Kosaki, Y. & A. Dickinson. 2010. Choice and contingency in the development of
behavioral autonomy during instrumental conditioning. J. Exp. Psychol.: Anim. Behav.
Processes. 36: 334-342.
50.
Nordquist, R. E., et al. 2007. Augmented reinforcer value and accelerated habit
formation after repeated amphetamine treatment. Eur. Neuropsychopharmacol. 17: 532-540.
51.
Corbit, L. H., H. Nie & P. H. Janak. in press. Habitual alcohol seeking: time
course and the contribution of subregions of the dorsal striatum. Biol Psychiatry.
52.
Zapata, A., V. L. Minney & T. S. Shippenberg. 2010. Shift from goal-directed to
habitual cocaine seeking after prolonged experience in rats. J. Neurosci. 30: 15457-15463.
53.
Hogarth, L., et al. 2012. Acute alcohol impairs human goal-directed action.
Biological Psychology. 90: 154-160.
54.
Nelson, A. & S. Killcross. 2006. Amphetamine exposure enhances habit
formation. J. Neurosci. 26: 3805-3812.
55.
Schoenbaum, G. & B. Setlow. 2005. Cocaine makes actions insensitive to
outcomes but not extinction: Implications for altered orbitofrontal-amygdalar function.
Cereb. Cortex. 15: 1162-1169.
56.
Colwill, R. M. & R. A. Rescorla. 1988. Associations between the discriminative
stimulus and the reinforcer in instrumental learning. J Exp Psychol Anim Behav Process.
14: 155-164.
57.
Hogarth, L., et al. 2007. The role of drug expectancy in the control of human drug
seeking. J Exp Psychol Anim Behav Process. 33: 484-496.
58.
Holmes, N. M., A. R. Marchand & E. Coutureau. 2010. Pavlovian to instrumental
transfer: A neurobehavioural perspective. Neurosci. Biobehav. Rev. 34: 1277-1295.
59.
Corbit, L. H. & P. H. Janak. 2007. Ethanol-associated cues produce general
Pavlovian-instrumental transfer. Alcoholism-Clinical and Experimental Research. 31:
766-774.
60.
Krank, M. D. 2003. Pavlovian Conditioning With Ethanol: Sign-Tracking
(Autoshaping), Conditioned Incentive, and Ethanol Self-Administration. Alcoholism:
Clinical and Experimental Research. 27: 1592-1598.
61.
Glasner, S. V., J. B. Overmier & B. W. Balleine. 2005. The role of Pavlovian cues
in alcohol seeking in dependent and nondependent rats. J. Stud. Alcohol. 66: 53-61.
62.
Balleine, B. W., B. K. Leung & S. B. Ostlund. 2011. The orbitofrontal cortex,
predicted value, and choice. Ann. N. Y. Acad. Sci. 1239: 43-50.
63.
Delamater, A. R. 1995. Outcome-selective effects of intertrial reinforcement in
Pavlovian appetitive conditioning with rats. Anim. Learn. Behav. 23: 31-39.
64.
Gámez, A. M. & J. M. Rosas. 2005. Transfer of stimulus control across
instrumental responses is attenuated by extinction in human instrumental conditioning.
International Journal of Psychology & Psychological Therapy. 5: 207-222.
65.
Trick, L., L. Hogarth & T. Duka. 2011. Prediction and uncertainty in human
Pavlovian to instrumental transfer. Journal of Experimental Psychology: Learning,
Memory, and Cognition. 37: 757-765.
66.
Rescorla, R. A. 1994. Transfer of instrumental control mediated by a devalued
outcome. Anim. Learn. Behav. 22: 27-33.
67.
Holland, P. C. 2004. Relations between Pavlovian-instrumental transfer and
reinforcer devaluation. J Exp Psychol Anim Behav Process. 30: 258-258.
68.
Colwill, R. M. & R. A. Rescorla. 1990. Effects of reinforcer devaluation on
discriminative control of instrumental behavior. J Exp Psychol Anim Behav Process. 16:
40-47.
69.
Hogarth, L., A. Dickinson & T. Duka. 2010. The associative basis of cue elicited
drug taking in humans. Psychopharmacology. 208: 337-351.
70.
Ferguson, S. G. & S. Shiffman. 2009. The relevance and treatment of cue-induced
cravings in tobacco dependence. J. Subst. Abuse Treat. 36: 235-243.
71.
Hitsman, B., et al. in press. Dissociable effect of acute varenicline on tonic versus
cue-provoked craving in non-treatment motivated heavy smokers. Drug Alcohol Depend.
72.
Dickinson, A. & G. R. Dawson. 1987. Pavlovian processes in the motivational
control of instrumental performance. Q J Exp Psychol B. 39: 201-213.
73.
Mitchell, J. B. & J. Stewart. 1990. Facilitation of sexual behaviors in the male rat
in the presence of stimuli previously paired with systemic injections of morphine.
Pharmacol. Biochem. Behav. 35: 367-372.
74.
Shiflett, M. in press. The effects of amphetamine exposure on outcome-selective
Pavlovian-instrumental transfer in rats. Psychopharmacology. 1-10.
75.
Wyvell, C. L. & K. C. Berridge. 2000. Intra-accumbens amphetamine increases
the conditioned incentive salience of sucrose reward: Enhancement of reward "wanting"
without enhanced "liking" or response reinforcement. J. Neurosci. 20: 8122-8130.
76.
Wyvell, C. L. & K. C. Berridge. 2001. Incentive sensitization by previous
amphetamine exposure: Increased cue-triggered "wanting" for sucrose reward. J.
Neurosci. 21: 7831-7840.
77.
Schwabe, L., A. Dickinson & O. T. Wolf. 2011. Stress, habits and drug addiction:
A psychoneuroendocrinological perspective. Exp Clin Psychopharmacol. 19: 53-63.
78.
Hogarth, L. 2011. The role of impulsivity in the aetiology of drug dependence:
reward sensitivity versus automaticity. Psychopharmacology. 215: 567-580.
79.
Hogarth, L., H. W. Chase & K. Baess. 2012. Impaired goal-directed behavioural
control in human impulsivity. Q J Exp Psychol. 65: 305-316.
80.
de Wit, S., et al. 2006. Dorsomedial prefrontal cortex resolves response conflict in
rats. J. Neurosci. 26: 5224-5229.
81.
Ostlund, S. B., N. T. Maidment & B. W. Balleine. 2010. Alcohol-paired
contextual cues produce an immediate and selective loss of goal-directed action in rats.
Frontiers in Integrative Neuroscience. 4.
82.
Gillan, C. M., et al. 2011. Disruption in the balance between goal-directed
behavior and habit learning in obsessive-compulsive disorder. Am. J. Psychiatry. 168:
718-726.
83.
Valentin, V., A. Dickinson & J. P. O’Doherty. 2007. Determining the neural
substrates of goal-directed learning in the human brain. J Neurosci. 27: 4019-4026.
84.
Klossek, U. M. H., J. Russell & A. Dickinson. 2008. The control of instrumental
action following outcome devaluation in young children aged between 1 and 4 years. J
Exp Psychol Gen. 137: 39-51.
85.
Haddon, J. E., et al. 2010. Impaired conditional task performance in a high
schizotypy population: Relation to cognitive deficits. The Quarterly Journal of
Experimental Psychology. 64: 1-9.
86.
Barch, D. M. & A. Ceaser. 2012. Cognition in schizophrenia: core psychological
and neural mechanisms. Trends in Cognitive Sciences. 16: 27-34.
87.
Balleine, B. W. & A. Dickinson. 1998. Goal-directed instrumental action:
contingency and incentive learning and their cortical substrates. Neuropharmacology. 37:
407-419.
88.
Corbit, L. H. & B. W. Balleine. 2003. The role of prelimbic cortex in instrumental
conditioning. Behav. Brain Res. 146: 145-157.
89.
Ostlund, S. B. & B. W. Balleine. 2005. Lesions of medial prefrontal cortex disrupt
the acquisition but not the expression of goal-directed learning. J. Neurosci. 25: 7763-7770.
90.
Corbit, L. H., J. L. Muir & B. W. Balleine. 2003. Lesions of mediodorsal
thalamus and anterior thalamic nuclei produce dissociable effects on instrumental
conditioning in rats. Eur. J. Neurosci. 18: 1286-1294.
91.
Ostlund, S. B. & B. W. Balleine. 2008. Differential involvement of the basolateral
amygdala and mediodorsal thalamus in instrumental action selection. J Neurosci. 28:
4398-4405.
92.
Yin, H. H., B. J. Knowlton & B. W. Balleine. 2005. Blockade of NMDA
receptors in the dorsomedial striatum prevents action-outcome learning in instrumental
conditioning. Eur. J. Neurosci. 22: 505-512.
93.
Corbit, L. H. & P. H. Janak. 2010. Posterior dorsomedial striatum is critical for
both selective instrumental and Pavlovian reward learning. Eur. J. Neurosci. 31: 1312-1321.
94.
Yin, H. H., et al. 2005. The role of the dorsomedial striatum in instrumental
conditioning. Eur. J. Neurosci. 22: 513-523.
95.
Balleine, B. W., A. S. Killcross & A. Dickinson. 2003. The effect of lesions of the
basolateral amygdala on instrumental conditioning. J. Neurosci. 23: 666-675.
96.
Corbit, L. H. & B. W. Balleine. 2005. Double dissociation of basolateral and
central amygdala lesions on the general and outcome-specific forms of pavlovian-instrumental transfer. J. Neurosci. 25: 962-970.
97.
Johnson, A. W., M. Gallagher & P. C. Holland. 2009. The basolateral amygdala is
critical to the expression of Pavlovian and instrumental outcome-specific reinforcer
devaluation effects. J Neurosci. 29: 696-704.
98.
Yin, H. H., B. J. Knowlton & B. W. Balleine. 2004. Lesions of dorsolateral
striatum preserve outcome expectancy but disrupt habit formation in instrumental
learning. Eur. J. Neurosci. 19: 181-189.
99.
Yin, H. H., B. J. Knowlton & B. W. Balleine. 2006. Inactivation of dorsolateral
striatum enhances sensitivity to changes in the action-outcome contingency in
instrumental conditioning. Behav. Brain Res. 166: 189-196.
100. Ostlund, S. B. & B. W. Balleine. 2007. Orbitofrontal cortex mediates outcome
encoding in pavlovian but not instrumental conditioning. J. Neurosci. 27: 4819-4825.
101. Corbit, L. H., J. L. Muir & B. W. Balleine. 2001. The role of the nucleus
accumbens in instrumental conditioning: Evidence of a functional dissociation between
accumbens core and shell. J. Neurosci. 21: 3251-3260.
102. Corbit, L. & B. Balleine. 2011. The general and outcome-specific forms of
Pavlovian-Instrumental transfer are differentially mediated by the nucleus accumbens
core and shell. J Neurosci. 31: 11786-11794.
103. Blundell, P., G. Hall & S. Killcross. 2001. Lesions of the basolateral amygdala
disrupt selective aspects of reinforcer representation in rats. J. Neurosci. 21: 9018-9026.
104. Holland, P. C. & M. Gallagher. 2003. Double dissociation of the effects of lesions
of basolateral and central amygdala on conditioned stimulus-potentiated feeding and
Pavlovian-instrumental transfer. Eur. J. Neurosci. 17: 1680-1694.
105. Shiflett, M. W. & B. W. Balleine. 2010. At the limbic–motor interface:
disconnection of basolateral amygdala from nucleus accumbens core and shell reveals
dissociable components of incentive motivation. Eur. J. Neurosci. 32: 1735-1743.
106. Corbit, L. H. & P. H. Janak. 2007. Inactivation of the lateral but not medial dorsal
striatum eliminates the excitatory impact of Pavlovian stimuli on instrumental
responding. J Neurosci. 27: 13977-13981.
107. Murschall, A. & W. Hauber. 2006. Inactivation of the ventral tegmental area
abolished the general excitatory influence of Pavlovian cues on instrumental
performance. Learn. Memory. 13: 123-126.
108. Hall, J., et al. 2001. Involvement of the central nucleus of the amygdala and nucleus
accumbens core in mediating Pavlovian influences on instrumental behaviour. Eur. J.
Neurosci. 13: 1984-1992.
109. Chase, H. W., et al. 2008. The role of the orbitofrontal cortex in human
discrimination learning. Neuropsychologia. 46: 1326-1337.
110. Wilson, S. J., M. A. Sayette & J. A. Fiez. 2004. Prefrontal responses to drug cues:
a neurocognitive analysis. Nat. Neurosci. 7: 211-214.
111. Farré, M. & J. Camí. 1991. Pharmacokinetic considerations in abuse liability
evaluation. Addiction. 86: 1601-1606.
112. Wickens, J. R., et al. 2007. Dopaminergic mechanisms in actions and habits. J.
Neurosci. 27: 8181-8183.
113. Jedynak, J. P., et al. 2007. Methamphetamine-induced structural plasticity in the
dorsal striatum. Eur. J. Neurosci. 25: 847-853.
114. Cunha-Oliveira, T., A. C. Rego & C. R. Oliveira. 2008. Cellular and molecular
mechanisms involved in the neurotoxicity of opioid and psychostimulant drugs. Brain
Res. Rev. 58: 192-208.
115. Balleine, B. W. & J. P. O'Doherty. 2010. Human and rodent homologies in action
control: Corticostriatal determinants of goal-directed and habitual action.
Neuropsychopharmacology. 35: 48-69.
116. Jones, B. T. & R. M. Young. 2011. Changing alcohol expectancies and self-efficacy
expectations. In Handbook of motivational counseling: Goal-based approaches
to assessment and intervention with addiction and other problems. Cox, W. M. & E.
Klinger, Eds.: 489-504. John Wiley & Sons, Ltd.
117. Bouton, M. E. 2002. Context, ambiguity, and unlearning: sources of relapse after
behavioral extinction. Biol Psychiatry. 52: 976-986.
118. Bickel, W. K., et al. 2011. Remember the future: Working memory training
decreases delay discounting among stimulant addicts. Biol Psychiatry. 69: 260-265.
119. Quick, S. L., et al. 2011. Loss of alternative non-drug reinforcement induces
relapse of cocaine-seeking in rats: Role of dopamine D1 receptors.
Neuropsychopharmacology. 36: 1015-1020.
Figure Legends
Figure 1: Experience of the drug outcome is separately encoded in terms of its perceptual
identity (Oi) and incentive value (Ov), and establishes learning about (1) the instrumental
contingency between drug-seeking response and the drug (R-Oiv); (2) the habitual
contingency between drug stimuli and the drug-seeking response (S-R); and (3) the
Pavlovian contingency between drug stimuli and the drug (S-Oiv). It is argued that
chronic drug exposure generates a progressive impairment in capacity to retrieve or
utilise the specific identity of outcomes (Oi), which causes a transition in behavioural
control from the R-Oiv and S-Oi associations to the S-R and S-Ov associations. That is,
addiction reflects a loss of control over behaviour by knowledge of the consequences
indexed by outcome-devaluation and specific transfer, in favour of control by antecedent
stimuli indexed by devaluation insensitivity and general transfer.