Lecture 6 - TeachLine

Decision making in basketball
• 2-point shot: easier, fewer points
• 3-point shot: more difficult, more points

                 Kobe Bryant (LA Lakers)    Chris Bosh (Toronto Raptors)
PPG (2006-7)     31.6                       26.3
3P attempts      398 (23%)                  35 (3%)
2P attempts      1,359 (77%)                1,059 (97%)
3P success       34%                        34%
2P success       50%                        50%

The matching law: NBA best 100 players (2006-2007)
[Figure: $N_3/(N_2+N_3)$ plotted against $I_3/(I_2+I_3)$, both axes from 0 to 1, with Bryant and Bosh marked.]
$N_{2,3}$ = # of 2-, 3-point shots
$I_{2,3}$ = # of points earned from 2-, 3-point shots

The reward schedule
$R(t) = r(A(t), A(t-1), A(t-2), \ldots)$

The matching law
[Figure: $N_1/(N_1+N_2)$ plotted against $I_1/(I_1+I_2)$. Herrnstein, JEAB, 1961]
[Figure: Sugrue, Corrado & Newsome, Science, 2004]
[Figure: Gallistel et al., unpublished]

The matching law
$N_j$ = # of attempts at alternative j (investment in j)
$I_j$ = # of points earned from alternative j (income from j)
$\frac{N_1}{N_1+N_2} = \frac{I_1}{I_1+I_2} \;\Leftrightarrow\; \frac{I_1}{N_1} = \frac{I_2}{N_2}$ (equal returns)
$E[R \mid A=1] = E[R \mid A=2]$

The matching law is very general. It is found in many animal species as well as in humans, under very different experimental conditions.

MATCHING ≠ MAXIMIZING

Example: addiction model
[Figure: $E[R \mid A=\text{drugs}]$, $E[R \mid A=\text{work}]$ and $E[R]$ as functions of freq[drugs] = 1 − freq[work], axis from 0 to 1. The matching point and the maximizing point are marked at different values of freq[drugs]. After Herrnstein and Prelec, J Econ Perspect, 1991]

Question: What is the neural basis of the matching law?
It is generally believed that learning is due to changes in the efficacy of synapses.
[Figure: scale bar 0.4 μm. Kennedy, Science, 2000]
Question: What microscopic plasticity rules underlie adaptation to matching behavior?
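Before turning to the neural hypothesis, the equal-returns form of the matching law can be checked against the basketball figures from the opening slide. The sketch below is only an illustration: it estimates income as attempts × success rate × points per shot, which is an assumption (the slide gives rounded percentages, and free throws are ignored), and the function name `shot_fractions` is hypothetical.

```python
def shot_fractions(n3, n2, p3, p2):
    """Return (investment fraction, income fraction) for 3-point shots.

    n3, n2: number of 3-point and 2-point attempts
    p3, p2: success rates for 3-point and 2-point shots
    Income is estimated as attempts * success rate * points (an assumption).
    """
    income3 = 3 * p3 * n3          # estimated points earned from 3-point shots
    income2 = 2 * p2 * n2          # estimated points earned from 2-point shots
    investment = n3 / (n2 + n3)    # N3 / (N2 + N3)
    income = income3 / (income2 + income3)  # I3 / (I2 + I3)
    return investment, income


# Figures quoted on the slide (2006-7 season).
players = {
    "Bryant": (398, 1359, 0.34, 0.50),
    "Bosh": (35, 1059, 0.34, 0.50),
}

for name, args in players.items():
    inv, inc = shot_fractions(*args)
    print(f"{name}: N3/(N2+N3) = {inv:.3f}, I3/(I2+I3) = {inc:.3f}")
```

For both players the two fractions nearly coincide (about 0.23 for Bryant and about 0.03 for Bosh), which is what matching predicts when the returns per attempt are nearly equal: 3 × 0.34 = 1.02 points per 3-point attempt against 2 × 0.50 = 1.00 per 2-point attempt.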
Hypothesis: the matching law results from synaptic plasticity that is driven by the covariance of reward and neural activity

Covariance is a measure of dependence
• two random variables X, Y: $\delta X \equiv X - E[X]$, $\delta Y \equiv Y - E[Y]$
• covariance: $\mathrm{Cov}[X,Y] = E[\delta X \, \delta Y] = E[X \, \delta Y] = E[\delta X \, Y]$
• correlation coefficient: $r = \frac{\mathrm{Cov}[X,Y]}{\sqrt{\mathrm{Var}[X]\,\mathrm{Var}[Y]}}$

Covariance
[Figure: three panels illustrating $\mathrm{Cov}[X,Y] > 0$, $\mathrm{Cov}[X,Y] = 0$ and $\mathrm{Cov}[X,Y] < 0$.]

Synaptic plasticity
• Local signals affect synaptic efficacies. Popular theory: Hebbian plasticity, $\Delta W \propto S_{pre} S_{post}$
• Global signals affect synaptic efficacies. Popular theory: dopamine gates Hebbian plasticity (Wickens), $\Delta W \propto D \, S_{pre} S_{post}$
• Popular theory: dopamine codes the mismatch between reward and expected reward (Schultz), $D \propto R - E[R] \equiv \delta R$
[Figure: Schultz, Dayan & Montague, Science, 1997]

Synaptic plasticity
$\Delta W \propto D \, S_{pre} S_{post}$, $D \propto \delta R$
$\Delta W = \eta \, \delta R \, S_{pre} S_{post}$
Average trajectory approximation: $E[\Delta W] = \eta \, E[\delta R \, S_{pre} S_{post}] = \eta \, \mathrm{Cov}[R, S_{pre} S_{post}]$

Covariance-based plasticity rules
$\Delta W = \eta \, (R - E[R]) \, N$
$\Delta W = \eta \, R \, (N - E[N])$
$N = S_{pre}$, $N = S_{post}$, or $N = S_{pre} S_{post}$
Average trajectory approximation: $E[\Delta W] = \eta \, \mathrm{Cov}[R, N]$

Hypothesis: covariance-based synaptic plasticity ⇒ the matching law
Outline: stationary state of covariance-based plasticity ⇒ $\mathrm{Cov}[R, N] = 0$ ⇒ the matching law

Assumptions
[Diagram: neurons $N_1, \ldots, N_5$ and hidden variables determine the action A, which determines the reward R.]
1. $E[N \mid A=i] \neq E[N \mid A \neq i]$
2. The dependence of the reward R on neural activity N is through the action A.

Theorem
Suppose that Assumptions 1 and 2 are satisfied. Then
$\mathrm{Cov}[R, N] = 0 \;\Rightarrow\; E[R \mid A=1] = E[R \mid A=2]$ (the matching law)

Intuition
$\mathrm{Cov}[R, N] = 0 \;\Leftrightarrow\; E[R \mid A=i] = E[R]$
[Diagram: neuron N → action A → reward R]
• In general R depends on A.
• If, as a result of the policy used by the subject, R becomes independent of A, then R also becomes independent of N.

Example plasticity rules
$\Delta W = \eta \, R \, (S - E[S])$
$\Delta W = \eta \, R \, S$
$\Delta W_{ij} = \eta \, (R - E[R]) \, S_i^{pre} M_j^{post}$

Summary
Hypothesis: Covariance-based synaptic plasticity underlies the matching law
Theorem: $\mathrm{Cov}[R, N] = 0$ ⇒ the matching law
Loewenstein & Seung, PNAS, 2006
Loewenstein, PLoS Comp Biol, 2008
Disclaimer: There are learning rules that converge to Cov[R,N]=0 that are not driven by covariance
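The claim that a covariance-based rule settles at Cov[R,N] = 0, and hence at matching, can be explored with a small simulation. The sketch below is not the model from the cited papers; it is a minimal illustration under loudly stated assumptions: a hypothetical two-target "baited" reward schedule (an empty target is re-armed with probability `q1` or `q2` on each trial and stays armed until chosen, so reward depends on past actions), a single weight `w` that sets the choice probability through a logistic function, and N taken to be the indicator of choosing target 1. The update is the rule ΔW = η (R − E[R]) (N − E[N]), with E[R] replaced by a running average.

```python
import math
import random


def simulate(q1=0.3, q2=0.1, eta=0.05, n_trials=200_000, seed=0):
    """Covariance-rule learner on a hypothetical baited two-target schedule.

    Returns (choice fraction, income fraction) for target 1,
    measured over the second half of the trials.
    """
    rng = random.Random(seed)
    q = (q1, q2)
    baited = [False, False]   # whether each target currently holds a reward
    w = 0.0                   # single "synaptic" weight (assumption)
    r_bar = 0.0               # running estimate of E[R]
    choices = [0, 0]
    income = [0.0, 0.0]

    for t in range(n_trials):
        # Re-arm empty targets; an armed target stays armed until chosen.
        for i in range(2):
            if not baited[i] and rng.random() < q[i]:
                baited[i] = True

        p1 = 1.0 / (1.0 + math.exp(-w))      # probability of choosing target 1
        a = 0 if rng.random() < p1 else 1    # action (index 0 is target 1)
        r = 1.0 if baited[a] else 0.0        # reward
        baited[a] = False                    # chosen target is emptied

        n = 1.0 if a == 0 else 0.0           # "neural activity": action indicator
        w += eta * (r - r_bar) * (n - p1)    # covariance-based update
        r_bar += 0.01 * (r - r_bar)          # slow running average of reward

        if t >= n_trials // 2:               # record after the transient
            choices[a] += 1
            income[a] += r

    choice_fraction = choices[0] / sum(choices)
    income_fraction = income[0] / sum(income)
    return choice_fraction, income_fraction


choice_fraction, income_fraction = simulate()
print(f"N1/(N1+N2) = {choice_fraction:.3f}, I1/(I1+I2) = {income_fraction:.3f}")
```

If the rule behaves as the theorem suggests, the two printed fractions should be close to each other. For these hypothetical baiting probabilities, a calculation that assumes independent choices with a fixed probability puts the equal-returns point at a choice fraction of 27/34 ≈ 0.79 for target 1. The learning rate, the averaging constant and the number of trials are arbitrary choices, and other values may give noisier results.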
