004 Vorl06diff cont01a STDP

Spike-timing-dependent plasticity (STDP) and its relation to differential Hebbian learning
Overview of different methods
[Diagram: map of learning methods, marked "You are here!"]
Fields: Machine Learning; Classical Conditioning; Synaptic Plasticity
Goals: Anticipatory Control of Actions and Prediction of Values; Correlation of Signals
REINFORCEMENT LEARNING (example based), EVALUATIVE FEEDBACK (Rewards): Dynamic Prog. (Bellman Eq.); δ-Rule (supervised L.) = Rescorla/Wagner; Eligibility Traces; TD(λ), often λ = 0; TD(1); TD(0); Monte Carlo Control; SARSA; Q-Learning; Actor/Critic; Neur. TD-Models (technical & Basal Gangl.); Neuronal Reward Systems (Basal Ganglia)
UN-SUPERVISED LEARNING (correlation based), NON-EVALUATIVE FEEDBACK (Correlations): Hebb-Rule = LTP (LTD = anti); Differential Hebb-Rule ("slow") = Neur. TD-formalism; Differential Hebb-Rule ("fast"); ISO-Learning ("Critic"); ISO-Model of STDP; ISO-Control; Correlation based Control (non-evaluative); STDP-Models (biophysical & network); Biophys. of Syn. Plasticity; STDP
Transmitters shown: Dopamine; Glutamate
Differential Hebb Learning Rule
$\frac{d}{dt} w_i(t) = \mu\, u_i(t)\, V'(t)$
y(t)
Simpler Notation
x = Input
u = Traced Input
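[Diagram: early input x_i ("Bell") and late input x_0 ("Food") give the traced inputs u_i and u_0, which are weighted by w and summed (Σ) to produce the output V]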
Defining the Trace
In general there are many ways to do this, but usually one chooses a trace that looks biologically realistic and also allows for some analytical calculations.
$h(t) = \begin{cases} h_k(t) & t \ge 0 \\ 0 & t < 0 \end{cases}$
EPSP-like functions:
α-function: $h_k(t) = t\,e^{-at}$
Damped sine wave: $h_k(t) = \frac{1}{b}\sin(bt)\,e^{-at}$. Shows an oscillation.
Double exp.: $h_k(t) = \frac{1}{\eta}\left(e^{-at} - e^{-bt}\right)$. This one is the easiest to handle analytically and is therefore often used.
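The three EPSP-like kernels named on this slide can be written down directly. The following is a minimal sketch, not taken from the slides: the parameter values for a, b and the normalization eta are arbitrary assumptions, and the helper `trace` (a hypothetical name) simply convolves a sampled input x with a kernel to obtain the traced input u.

```python
import math

# All parameter values below are illustrative assumptions.

def alpha_kernel(t, a=0.1):
    """alpha-function: t * exp(-a t) for t >= 0, zero otherwise."""
    return t * math.exp(-a * t) if t >= 0 else 0.0

def damped_sine_kernel(t, a=0.1, b=0.3):
    """Damped sine wave: (1/b) * sin(b t) * exp(-a t) for t >= 0."""
    return math.sin(b * t) * math.exp(-a * t) / b if t >= 0 else 0.0

def double_exp_kernel(t, a=0.05, b=0.5, eta=1.0):
    """Double exponential: (exp(-a t) - exp(-b t)) / eta for t >= 0."""
    return (math.exp(-a * t) - math.exp(-b * t)) / eta if t >= 0 else 0.0

def trace(x, kernel, dt=1.0):
    """Discrete causal convolution u = x * h of samples x with a kernel."""
    return [
        sum(x[j] * kernel((i - j) * dt) for j in range(i + 1)) * dt
        for i in range(len(x))
    ]

# A single input pulse at the first sample reproduces the kernel shape.
u = trace([1.0, 0.0, 0.0, 0.0], alpha_kernel)
```

With a < b the double exponential stays non-negative, while the damped sine changes sign, which is the oscillation mentioned on the slide.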
Differential Hebbian Learning
$\frac{d}{dt} w_i(t) = \mu\, u_i(t)\, v'(t)$ (filtered input times derivative of the output)
Output: $v(t) = \sum_i w_i(t)\, u_i(t)$
Produces an asymmetric weight-change curve, weight change versus T (if the filters h produce unimodal "humps")
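To see where the asymmetry comes from, the rule can be integrated numerically for one pair of input pulses. This is a sketch under stated assumptions: a double-exponential filter h, a fixed weight of 1 on the late input, a plastic weight starting at 0 on the early input, and a learning rate mu chosen small so that the plastic weight barely feeds back into v. The function name `weight_change` and all parameter values are hypothetical.

```python
import math

def h(t, a=0.05, b=0.5):
    """Assumed unimodal filter: difference of two exponentials."""
    return math.exp(-a * t) - math.exp(-b * t) if t >= 0 else 0.0

def weight_change(T, mu=0.001, dt=0.1, t_max=400.0, t_in=100.0):
    """Final w1 after one pairing: plastic input x1 pulses at t_in,
    the fixed-weight input x0 pulses at t_in + T."""
    w0, w1 = 1.0, 0.0
    v_prev = 0.0
    for k in range(int(t_max / dt)):
        t = k * dt
        u1 = h(t - t_in)            # filtered plastic input
        u0 = h(t - (t_in + T))      # filtered reference input
        v = w0 * u0 + w1 * u1       # output v = sum_i w_i u_i
        dv = (v - v_prev) / dt      # backward-difference estimate of v'
        w1 += mu * u1 * dv * dt     # Euler step of dw1/dt = mu * u1 * v'
        v_prev = v
    return w1

curve = {T: weight_change(float(T)) for T in (-10, 0, 10)}
```

In this sketch the weight grows when the plastic input precedes the reference input (T > 0) and shrinks when the order is reversed, with a value close to zero at T = 0.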
Spike-timing-dependent plasticity (STDP): Some vague shape similarity
Pre precedes Post: Long-term Potentiation
Pre follows Post: Long-term Depression
T = tPost - tPre
[Figure: weight-change curve (Bi & Poo, 2001), synaptic change in % versus T in ms]
Hebbian learning
"When an axon of cell A excites cell B and repeatedly or persistently takes part in firing it, some growth processes or metabolic change takes place in one or both cells so that A's efficiency ... is increased."
Donald Hebb (1949)
[Diagram: cell A projecting to cell B; activity of A and B over time t]
Conventional LTP
[Figure: synaptic change in % versus the relative timing of Pre and Post (tPre, tPost)]
Symmetrical weight-change curve: the temporal order of input and output does not play any role.
The biophysical equivalent of Hebb's postulate
[Diagram: plastic synapse (NMDA/AMPA) receiving the presynaptic signal (Glu); postsynaptic source of depolarization. Pre-post correlation, but why is this needed?]
Plasticity is mainly mediated by so-called N-methyl-D-aspartate (NMDA) channels. These channels respond to glutamate as their transmitter and they are voltage dependent.
[Figure: NMDA channel in the membrane, drawn twice, with labels "out" and "in"]
Biophysical Model: Structure
[Diagram: input x at the NMDA synapse, output v]
Source of depolarization:
1) Any other drive (AMPA or NMDA)
2) Back-propagating spike
3) Dendritic Spike
Hence NMDA synapses (channels) require a (Hebbian) correlation between pre- and postsynaptic activity!
Local Events at the Synapse
Current sources "under" the synapse:
• Synaptic current (I_synaptic)
• Currents from all parts of the dendritic tree (I_Dendritic)
• Influence of a back-propagating or dendritic spike (I_BP)
[Diagram: input x1, trace u1, local summation Σ_Local, global summation Σ_Global, output v]
Membrane potential:
$C \frac{d}{dt} V(t) = \sum_i (w_i + \Delta w_i)\, g_i(t)\,(E_i - V) + \frac{V_{rest} - V(t)}{R} + I_{dep}$
Here $w_i$ is the weight and $g_i(t)$ the synaptic input.
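The membrane equation on this slide can be integrated with a simple forward-Euler scheme. The sketch below assumes a single synapse and made-up parameter values in a consistent unit system (mV, ms, nF, µS, MΩ, nA); `simulate_membrane` and `example_g` are hypothetical names, not part of the original model code.

```python
import math

def example_g(t):
    """Assumed synaptic conductance in uS: double exponential after t = 0."""
    return 0.01 * (math.exp(-t / 5.0) - math.exp(-t / 1.0)) if t >= 0 else 0.0

def simulate_membrane(g_of_t, w=1.0, dw=0.0, E=0.0, V_rest=-70.0,
                      R=100.0, C=0.1, I_dep=lambda t: 0.0,
                      dt=0.01, t_max=100.0):
    """Euler integration of
    C dV/dt = (w + dw) g(t) (E - V) + (V_rest - V) / R + I_dep(t)
    for one synapse. Returns the list of V values in mV."""
    V = V_rest
    values = []
    for k in range(int(t_max / dt)):
        t = k * dt
        dVdt = ((w + dw) * g_of_t(t) * (E - V)
                + (V_rest - V) / R
                + I_dep(t)) / C
        V += dt * dVdt
        values.append(V)
    return values

silent = simulate_membrane(lambda t: 0.0)   # no input: V stays at rest
driven = simulate_membrane(example_g)       # excitatory input: transient depolarization
```

With the assumed values the membrane time constant is R·C = 10 ms, so the time step of 0.01 ms is comfortably small for the Euler scheme.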
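On „Eligibility Traces“
[Figure: a pre-synaptic spike produces the NMDA conductance g_NMDA [nS] (axis 0 to 0.1) over t [ms] (0 to 80); a BP- or D-spike is the depolarization source, shown filtered as V*h. Next to it, the ISO-learning circuit: inputs x1 and x0, filters h, weights w1 and w0, summation Σ, output v, its derivative v', and the multiplication X]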
Model structure
• Dendritic compartment
• Plastic synapse with NMDA channels
Source of Ca2+ influx and coincidence detector
• Source of depolarization:
1. Back-propagating spike
2. Local dendritic spike
[Diagram: plastic synapse (NMDA/AMPA) with conductance g_NMDA/AMPA; source of depolarization: BP spike or dendritic spike]
$\frac{dV}{dt} \sim \sum_i g_i(t)\,(E_i - V) + I_{dep}$
Plasticity Rule (Differential Hebb)
Instantaneous weight change:
$\frac{d}{dt} w(t) = \mu\, c_N(t)\, F'(t)$
Presynaptic influence $c_N$: glutamate effect on NMDA channels
Postsynaptic influence $F$: source of depolarization
Pre-synaptic influence
Normalized NMDA conductance:
$c_N = \frac{e^{-t/\tau_1} - e^{-t/\tau_2}}{1 + \kappa\,[Mg^{2+}]\,e^{-\gamma V}}$
[Figure: g_NMDA [nS] (axis 0 to 0.1) versus t [ms] (0 to 80)]
NMDA channels are instrumental for LTP and LTD induction (Malenka and Nicoll, 1999; Dudek and Bear, 1992).
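The normalized NMDA conductance combines a double-exponential time course with a voltage-dependent magnesium block. A minimal sketch follows; the slide gives no numbers, so the time constants, the magnesium factor and the voltage sensitivity below are assumed, illustrative values, and the symbol names `kappa` and `gamma` are labels chosen here for the two constants in the denominator.

```python
import math

def c_N(t, V, tau1=40.0, tau2=0.33, kappa=0.33, mg=1.0, gamma=0.06):
    """Normalized NMDA conductance at time t (ms) after the pre-synaptic
    spike, for membrane potential V (mV). All defaults are assumptions."""
    if t < 0:
        return 0.0                                   # nothing before the spike
    time_course = math.exp(-t / tau1) - math.exp(-t / tau2)
    mg_block = 1.0 + kappa * mg * math.exp(-gamma * V)
    return time_course / mg_block

at_rest = c_N(10.0, -70.0)        # channel largely blocked by Mg2+
depolarized = c_N(10.0, 0.0)      # block relieved by depolarization
```

The comparison of the two values shows the coincidence-detector property: the same glutamate signal opens the channel far more when the membrane is depolarized.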
Depolarizing potentials in the dendritic tree
[Figures: membrane potential V [mV] (-60 to 20) versus t [ms] (0 to 20) for dendritic spikes (Larkum et al., 2001; Golding et al., 2002; Häusser and Mel, 2003) and for backpropagating spikes (Stuart et al., 1997)]
Postsyn. Influence
Filtered membrane potential = source of depolarization:
$F(t) = V(t) * h(t)$, where h is a low-pass filter and * denotes convolution.
Filter h is adjusted to account for the steep rise and long tail of the observed calcium transients induced by back-propagating spikes and dendritic spikes (Markram et al., 1995; Wessel et al., 1999).
The time course of the Ca2+ concentration is important in defining the direction and degree of synaptic modifications (Yang et al., 1999; Bi, 2002).
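Putting the pre- and postsynaptic influences together gives a weight change for each spike timing T = tPost - tPre. The sketch below is a toy version under several assumptions that are not from the slides: the pre-synaptic spike is at t = 0, the depolarization is an 80 mV pulse with instantaneous rise and 2 ms decay, the low-pass filter h is a first-order filter with a 10 ms time constant, and the NMDA parameters are illustrative values. All function names are hypothetical.

```python
import math

V_REST = -70.0  # mV, assumed resting potential

def c_N(t, V, tau1=40.0, tau2=0.33, kappa_mg=0.33, gamma=0.06):
    """Normalized NMDA conductance; pre-synaptic spike at t = 0."""
    if t < 0:
        return 0.0
    gate = math.exp(-t / tau1) - math.exp(-t / tau2)
    return gate / (1.0 + kappa_mg * math.exp(-gamma * V))

def toy_spike(t, T, amp=80.0, tau_d=2.0):
    """Toy depolarization: jump of amp mV at t = T, exponential decay."""
    return V_REST + (amp * math.exp(-(t - T) / tau_d) if t >= T else 0.0)

def delta_w(T, mu=1.0, tau_f=10.0, dt=0.05, t_start=-100.0, t_end=300.0):
    """Integrate dw/dt = mu * c_N(t) * F'(t) for one pre/post pairing."""
    F = 0.0     # low-pass filtered depolarization
    dw = 0.0
    for k in range(int((t_end - t_start) / dt)):
        t = t_start + k * dt
        V = toy_spike(t, T)
        dF = dt * ((V - V_REST) - F) / tau_f   # Euler step of the filter, equals F' * dt
        dw += mu * c_N(t, V) * dF              # accumulate c_N * F' * dt
        F += dF
    return dw

ltp_side = delta_w(10.0)    # pre precedes post
ltd_side = delta_w(-20.0)   # pre follows post
```

In this toy setting the sign of the result follows the sign of F' while the NMDA conductance is active: rising F during an open channel gives potentiation, falling F gives depression.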
Some Signals F
[Figures: example depolarization signals, V [mV] (-60 to 20) versus t [ms], on time scales of 20 ms, 80 ms and 150 ms]
Weight Change Curves
Source of Depolarization: Back-Propagating Spikes
[Figures, two examples: back-propagating spike, V [mV] versus t [ms] (0 to 20), with NMDAr activation; resulting weight change curve, weight change (axis -0.03 to 0.01) versus T [ms] (-40 to 40)]
T = tPost - tPre
Weight Change Curves
Source of Depolarization: Dendritic Spike
[Figures, two examples: dendritic spike, V [mV] versus t [ms] (0 to 150), with NMDAr activation; resulting weight change curve, weight change (axis -0.01 to 0.005) versus T [ms] (-100 to 100)]
T = tPost - tPre
Local Learning Rules
The same learning rule:
Hebbian learning for distal synapses
[Figure: V [mV] versus t [ms] (0 to 80) and weight change curve versus T [ms] (-80 to 40)]
Differential Hebbian learning for proximal synapses
[Figure: V [mV] versus t [ms] (0 to 80) and weight change curve versus T [ms] (-80 to 40)]
Correlations beyond ±30 ms are mostly random.
Saudargiene et al., Neural Comp. 2004
Biologically inspired Artificial Neural Network algorithm
which implements local learning rules:
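Circuit Diagram Representation
[Diagram labels: inputs x0, x1, ..., xn; filters h0, h1, ..., hn and h11, ..., hnn; filtered inputs u0, u1, ..., un; weights w0, w1, ..., wn; local signals v1, ..., vn with derivatives v'1, ..., v'n; multiplications X; summation Σ; output v; T]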
Site-specific learning using
the same learning rule
An example Application: developing velocity sensitivity
[Figures: weight versus group number (0 to 900); weight versus weight number for STDP and for LTP; discriminant (0 to 80) versus 1/vel. (2 to 12), marking where the cell fires and where it does not fire]
After learning the cell becomes sensitive to stimulus velocity.
Figure 1. Dendritic Excitability Creates a Switchable, Spatial Gradient of
Plasticity in L5 Pyramids
(Left) Short bursts of somatic spikes elicit bAPs that fail to backpropagate fully to
distal dendrites. The result, as shown in Sjöström and Häusser (2006), is LTP of
synchronously active proximal synapses, but LTD or no plasticity at distal
synapses. (Middle) Cooperative activation of additional synapses depolarizes the
dendrite and boosts bAP propagation into distal dendrites. This cooperativity
serves as a switch to enable distal LTP (Sjöström and Häusser, 2006). (Right)
Plasticity also varies with firing mode of these neurons: when bAPs are coupled
with strong distal input, bAP-activated calcium spikes (BACs) are evoked in the
apical tuft, which enables robust LTP (Kampa et al., 2006).
On Ca2+
Differential threshold hypothesis
(Artola and Singer, 1993; Lisman 1989)
LTD: low intrinsic [Ca2+] threshold
LTP: higher intrinsic [Ca2+] threshold
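Read as a decision rule, the differential threshold hypothesis maps a calcium level to a direction of plasticity. The sketch below uses arbitrary units and two hypothetical threshold values; neither the numbers nor the function name come from the slides or the cited papers.

```python
def plasticity_direction(ca_level, theta_ltd=0.3, theta_ltp=0.6):
    """Direction of plasticity for a given [Ca2+] level (arbitrary units).
    theta_ltd < theta_ltp are assumed, illustrative thresholds."""
    if ca_level >= theta_ltp:
        return "LTP"     # above the higher threshold
    if ca_level >= theta_ltd:
        return "LTD"     # between the two thresholds
    return "none"        # below both thresholds

outcomes = [plasticity_direction(c) for c in (0.1, 0.4, 0.9)]
```

A rule of this form depends only on the amount of calcium, which is why a small late influx would be expected to produce LTD as well.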
Problems:
STDP
• post before pre: Mg2+ is already removed; NMDAR opens (a slow process!); little Ca2+ influx
• pre before post: NMDAR is already open; Mg2+ is then removed; much Ca2+ influx
• pre long before post: NMDAR starts to close; Mg2+ is then removed; little Ca2+ influx. But no late LTD window found??
Ca-Gradient
STDP
• post before pre: Mg2+ is already removed; NMDAR opens (a slow process!); SLOW Ca2+ influx
• pre before post: NMDAR is already open; Mg2+ is then removed; FAST Ca2+ influx
• pre long before post: NMDAR starts to close; Mg2+ is then removed; FAST (but little) Ca2+ influx; no late LTD window found
Some more physiological complications !
Modeling Ca2+ pathways
I_NMDA + back-propagating spike → Ca2+ concentration and gradient → Calmodulin
• Ca2+/CaM Kinase II → Phosphorylation → AMPA receptors → Synapse gets stronger = LTP
• Calcineurin → Dephosphorylation → Synapse gets weaker = LTD
We are modeling Ca2+ rise and fall with just one differential Hebb rule. More detailed models look at LTP and LTD as two different processes.
Temporally local Learning
Self-Influencing Plasticity
[Figure: Hebbian learning, equivalent circuit diagram and weight-change results. Conditions shown: BP-spike only; DS-spike only; BP spike before, at the same time as, or after the DS-spike (displacement BP vs DS spike; BP before DS = acausal, DS before BP = causal). Outcome labels: LTD dominates; LTD weakly dominates; LTP weakly dominates; LTP dominates; T zero-crossing. Further labels: local DS-spike only; DS- and BP-spike (somatic firing threshold passed); weak Hebbian learning; pronounced STDP; Cluster 1; Cluster 2]
Why might this make sense ??
Single phase learning will lead to weight growth regardless
Calcium protocols
From the viewpoint of the cell these two peaks are almost of equal quality (height and rise phase). Hence chances for LTP and LTD are equal.
Solution?
• long-lasting low-frequency stimulation
• short burst-like high-frequency stimulation