Spike-timing-dependent plasticity (STDP) and its relation to differential Hebbian learning Overview over different methods M a c h in e L e a rn in g C la s s ic a l C o n d itio n in g A n tic ip a to r y C o n tr o l o f A c tio n s a n d P r e d ic tio n o f V a lu e s S y n a p tic P la s tic ity C o r r e la tio n o f S ig n a ls R E IN F O R C E M E N T L E A R N IN G U N -S U P E R V IS E D L E A R N IN G e x a m p le b a s e d c o r r e la tio n b a s e d D y n a m ic P ro g . (B e llm a n E q .) d -R u le H e b b -R u le s u p e r v is e d L . = R e s c o rla / Wagner LT P ( LT D = a n ti) = E lig ib ilit y T ra c e s T D (l ) o fte n l = 0 You are here ! T D (1 ) T D (0 ) D iffe re n tia l H e b b -R u le (”s lo w ”) = N e u r.T D - fo r m a lis m M o n te C a rlo C o n tro l S T D P -M o d e ls A c to r /C r itic IS O - L e a r n in g ( “ C r itic ” ) IS O - M o d e l of STDP SARSA B io p h y s . o f S y n . P la s tic ity C o r r e la tio n b a s e d C o n tr o l ( n o n - e v a lu a t iv e ) IS O -C ontrol STD P b io p h y s ic a l & n e tw o r k N e u r.T D - M o d e ls te c h n ic a l & B a s a l G a n g l. Q -L e a rn in g D iffe re n tia l H e b b -R u le (”fa s t”) D o p a m in e G lu ta m a te N e u ro n a l R e w a rd S y s te m s (B a s a l G a n g lia ) N O N -E VA L U AT IV E F E E D B A C K (C o rre la tio n s ) E VA L U AT IV E F E E D B A C K (R e w a rd s ) Differential Hebb Learning Rule d wi (t ) ui (t ) V’(t) y(t ) dt Simpler Notation x = Input u = Traced Input Early: “Bell” x Xi X0 Late: “Food” ui u0 w S V Defining the Trace In general there are many ways to do this, but usually one chooses a trace that looks biologically realistic and allows for some analytical calculations, too. n h( t ) = h k( t ) t õ 0 0 t< 0 EPSP-like functions: a-function: hk( t ) = teà at Dampened Sine wave: hk( t ) = Double exp.: h( t ) k = Shows an oscillation. 1 b sin( bt ) 1 à at î (e eà at а eà bt ) This one is most easy to handle analytically and, thus, often used. Differential Hebbian Learning d wi (t ) ui (t ) v' (t ) dt w Filtered Derivative of Input the Output T Output v(t ) wi (t ) ui (t ) Produces asymmetric weight change curve (if the filters h produce unimodal „humps“) Spike-timing-dependent plasticity (STDP): Some vague shape similarity Pre Post Synaptic change % tPre Pre tPre tPost Post Pre precedes Post: Long-term Potentiation tPost Pre follows Post: Long-term Depression T=tPost - tPre Weight-change curve (Bi&Poo, 2001) ms Hebbian learning When an axon of cell A excites cell B and repeatedly or persistently takes part in firing it, some growth processes or metabolic change takes place in one or both cells so that A‘s efficiency ... is increased. Donald Hebb (1949) A B t A B Conventional LTP Synaptic change % Pre Pre tPre Post Post tPre tPost tPost Symmetrical Weight-change curve The temporal order of input and output does not play any role The biophysical equivalent of Hebb’s postulate Plastic Synapse Presynaptic Signal (Glu) NMDA/AMPA Pre-Post Correlation, but why is this needed? Postsynaptic: Source of Depolarization Plasticity is mainly mediated by so called N-methyl-D-Aspartate (NMDA) channels. These channels respond to Glutamate as their transmitter and they are voltage depended: out in out in Biophysical Model: Structure x NMDA synapse v Source of depolarization: 1) Any other drive (AMPA or NMDA) 2) Back-propagating spike 3) Dendritic Spike Hence NMDA-synapses (channels) do require a (hebbian) correlation between pre and post-synaptic activity! Local Events at the Synapse x1 Current sources “under” the synapse: • Synaptic current SLocal • Currents from all parts of the dendritic tree u1 v • Influence of a Back-propagating or dendritic spike Isynaptic IBP IDendritic S Global Membrane potential: Vrest V (t ) d C V (t ) (wi wi ) gi (t )( Ei V ) I dep dt R i Synaptic input Weight Pre-syn. Spike On „Eligibility Traces“ gNMDA [nS] 0.1 0 Depolarization source gNMDA * 80 t [ms] 40 0.4 0.35 w S ISO-Learning 0.3 BP- or D-Spike X 0.25 0.2 0.15 V*h 0.1 0.05 0 0 2 4 6 8 10 x1 h x0 h v’ w1 S v w0 Model structure • Dendritic compartment • Plastic synapse with NMDA channels Source of Ca2+ influx and coincidence detector • Source of depolarization: 1. Back-propagating spike 2. Local dendritic spike Plastic Synapse NMDA/AMPA g NMDA/AMPA dV ~ g i (t )( Ei V ) I dep dt i Source of Depolarization BP spike Dendritic spike NMDA synapse Plastic synapse NMDA/AMPA g NMDA/AMPA dV ~ g i (t )( Ei V ) I dep dt i Plasticity Rule (Differential Hebb) Source of depolarization Instantenous weight change: d w(t ) cN (t ) F ' (t ) dt Presynaptic influence Glutamate effect on NMDA channels Postsynaptic influence NMDA synapse Plastic synapse d w(t ) cN (t ) F ' (t ) dt NMDA/AMPA g NMDA/AMPA dV ~ g i (t )( Ei V ) I dep dt i Source of depolarization Pre-synaptic influence Normalized NMDA conductance: gNMDA [nS] e t / 1 e t / 2 cN 1 [ Mg 2 ]eV 0.1 0 40 80 t [ms] NMDA channels are instrumental for LTP and LTD induction (Malenka and Nicoll, 1999; Dudek and Bear ,1992) Depolarizing potentials in the dendritic tree 20 V [mV] 0 -20 -40 -60 0 10 V [mV] 20 20 t [ms] 0 Dendritic spikes (Larkum et al., 2001 -20 Golding et al, 2002 -40 -60 0 20 10 20 t [ms] 20 t [ms] Häusser and Mel, 2003) V [mV] 0 -20 -40 -60 20 0 10 V [mV] 0 -20 (Stuart et al., 1997) -40 -60 0 Backpropagating spikes 10 20 t [ms] NMDA synapse Plastic synapse NMDA/AMPA d w(t ) cN (t ) F ' (t ) dt g NMDA/AMPA dV ~ g i (t )( Ei V ) I dep dt i Source of depolarization Postsyn. Influence Filtered Membrane potential = source of depolarization F( t ) = V( t ) г h( t ) Low-pass filter Filter h is adjusted to account for steep rise and long tail of the observed Calcium transients induced by back-propagating spikes and dendritic spikes (Markram et al., 1995; Wessel et al, 1999) The time course of the [Ca2+] concentration is important in defining the direction and degree of synaptic modifications. (Yang et al., 1999; Bi, 2002) Some Signals F 20 V [mV] 0 0V -20 [mV] -40 -20 0 -60 -40 0 10 20 t [ms] -60 V [mV] -20 -40 50 0 100 150 t [ms] 20 -60 V [mV] 0 0 20 40 60 80 t [ms] -20 0V -40 [mV] -60 -20 0 10 20 -40 -60 0 100 50 150 20 t [ms] t [ms] V [mV] 0 0 V [mV] -20 -20 -40 -40 20 V [mV] -60 0 0 -20 -40 -60 0 10 20 t [ms] -60 10 20 t [ms] 0 20 40 60 80 t [ms] Weight Change Curves Source of Depolarization: Back-Propagating Spikes Back-propagating spike NMDAr activation 0.01 20 w V [mV] 0 -0.01 -20 Back-propagating spike Weight change curve -40 -60 0 10 20 t [ms] -0.03 T -40 T=tPost – tPre 20 0.01 V [mV] -20 0 20 40 T [ms] -20 0 20 40 T [ms] w 0 -20 -0.01 -40 -60 0 10 20 t [ms] -0.03 -40 Weight Change Curves Source of Depolarization: Dendritic Spike Dendritic spike Weight change curve 0V 0.005 [mV] NMDAr activation -20 0 -40 -0.005 -60 Dendritic spike 50 0 100 150 t [ms] -0.01 -100 -50 T 0V T=tPost – tPre w 0.005 [mV] 0 50 100 T [ms] 0 50 100 T [ms] w -20 0 -40 -0.005 -60 0 50 100 150 t [ms] -0.01 -100 -50 Local Learning Rules The same learning rule: Hebbian learning for distal synapses -0.01 0 V [mV] -20 Correlations beyond -40 +-30 ms are mostly -60 random 0 20 40 60 80 w 0 -0.01 t [ms] -80 -40 0 40 T [ms] Differential Hebbian learning for proximal synapses -0.01 0 V [mV] -20 w 0 -40 -60 0 -0.01 20 40 60 80 t [ms] -80 -40 0 40 T [ms] Saudargiene et al Neural Comp. 2004 Biologically inspired Artificial Neural Network algorithm which implements local learning rules: Circuit Diagram Representation wn vn u xn hn un . . . x1 x0 v’n X h1 h0 u0 T v1 X u1 hnn v’1 h11 w1 w1 wn w0 S v T Site-specific learning using the same learning rule An example Application: developing velocity sensitivity Weight Weight 0.8 0.58 0.57 0.7 0.56 0.6 0.5 0.55 0.54 0.4 0.53 0.3 0.52 0.2 0 300 600 900 0.51 0.5 0 300 Group number 600 900 Group number Weight 0.8 0.6 0.4 0.2 0 STDP LTP 0.8 0.6 0.4 0.2 0 Weight number After learning the cell becomes sensitive to stimulus velocity Discriminant 80 40 0 2 3 4 5 6 7 8 9 10 11 12 80 40 0 2 3 4 5 6 7 8 9 10 11 12 Cell fires Cell does not fire 1/vel. Figure 1. Dendritic Excitability Creates a Switchable, Spatial Gradient of Plasticity in L5 Pyramids (Left) Short bursts of somatic spikes elicit bAPs that fail to backpropagate fully to distal dendrites. The result, as shown in Sjöström and Häusser (2006), is LTP of synchronously active proximal synapses, but LTD or no plasticity at distal synapses. (Middle) Cooperative activation of additional synapses depolarizes the dendrite and boosts bAP propagation into distal dendrites. This cooperativity serves as a switch to enable distal LTP (Sjöström and Häusser, 2006). (Right) Plasticity also varies with firing mode of these neurons: when bAPs are coupled with strong distal input, bAP-activated calcium spikes (BACs) are evoked in the apical tuft, which enables robust LTP (Kampa et al., 2006). On Ca2+ Differential threshold hypothesis (Artola and Singer, 1993; Lisman 1989) LTD: low intrinsic [Ca2+] threshold 2+ LTP: higher intrinsic [Ca ] threshold Problems: STDP post before pre: pre before post: Mg2+ is already removed NMDAR is already open NMDAR opens (a slow process!) Mg2+ is then removed little Ca2+ influx much Ca2+ influx pre long before post: NMDAR starts to close Mg2+ is then removed little Ca2+ influx But no late LTD window found ?? Ca-Gradient STDP post before pre: pre before post: Mg2+ is already removed NMDAR is already open NMDAR opens (a slow process!) Mg2+ is then removed SLOW Ca2+ influx FAST Ca2+ influx pre long before post: NMDAR starts to close Mg2+ is then removed FAST (but little) Ca2+ influx no late LTD window found Some more physiological complications ! Modeling Ca2+ pathways I NMDA + Back-propagating spike Ca2+ concentration and gradient Calmodulin Ca2+/CaM Kinase II Phosporylation AMPA receptors Synapse gets stronger=LTP Calcineurin We are modeling Ca2+ rise and fall with just one diff. hebb rule. More detailed models look at LTP and LTD as two different processes. Dephosporylation Synapse gets weaker-LTD Temporally local Learning Self-Influencing Plasticity Hebbian Learning Equivalent Circuit Diagram BP-Spike Only LTD dominates T zero-crossing DS-Spike Only BP before DS = + Acausal DS before BP = Causal LTP dominates LTP weakly dominates BP Spike: Before LTD dominates LTD weakly dominates Displacement BP vs DS spike At sameTime After the DS-spike LTP dominates } DS- and BP-Spike Somatic Firing Threshold passed } } Cluster 2 Cluster 1 Local DS-Spike only weak hebbian learning pronounced STDP Why might this make sense ?? Single phase learning will lead to weight growth regardless Calcium protocols From the viewpoint of the cell these two peaks are almost of equal quality (height and rise phase). Hence chances for LTP and LTD are equal. Solution? long-lasting low frequency stimulation short burst-like high frequency stimulation