Part 7: Neural Networks & Learning
Lecture 27
11/28/07

Spatial Effects
• The previous simulation assumes that each agent is equally likely to interact with every other agent
• So strategy interactions are proportional to the strategies' fractions in the population
• More realistically, interactions with "neighbors" are more likely
– "Neighbor" can be defined in many ways
• Neighbors are more likely to use the same strategy

Spatial Simulation
• Toroidal grid
• Each agent interacts only with its eight neighbors
• An agent adopts the strategy of its most successful neighbor
• Ties favor the current strategy

NetLogo Simulation of Spatial IPD
• Run SIPD.nlogo

[Figures: typical simulation snapshots at t = 1, 5, 10 (with zoom), 20, and 50 (with zoom). Colors: ALL-C, TFT, RAND, PAV, ALL-D]

[Figure: SIPD without noise]

Conclusions: Spatial IPD
• Small clusters of cooperators can exist in a hostile environment
• Parasitic agents can exist only in limited numbers
• Stability of cooperation depends on the expectation of future interaction
• Adaptive cooperation/defection beats unilateral cooperation or defection
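The spatial update rule above (toroidal grid, eight neighbors, adopt the most successful neighbor's strategy, ties favor the current strategy) can be sketched as a synchronous grid update. This is an illustrative sketch only: the strategy names and the per-agent scores are assumed inputs, and the IPD payoff computation that would produce the scores is omitted.

```python
def wrap(i, n):
    """Toroidal index: the grid wraps around at the edges."""
    return i % n

def update(strategies, scores):
    """One synchronous step: each agent adopts the strategy of its most
    successful neighbor among the eight surrounding cells.
    Ties favor the agent's current strategy."""
    n, m = len(strategies), len(strategies[0])
    new = [row[:] for row in strategies]
    for i in range(n):
        for j in range(m):
            best_score = scores[i][j]          # start with self: ties keep current strategy
            best_strat = strategies[i][j]
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    if di == 0 and dj == 0:
                        continue
                    ni, nj = wrap(i + di, n), wrap(j + dj, m)
                    if scores[ni][nj] > best_score:   # strictly better only
                        best_score = scores[ni][nj]
                        best_strat = strategies[ni][nj]
            new[i][j] = best_strat
    return new
```

For example, a single high-scoring defector on a small torus spreads to all neighboring cells in one step, which is the mechanism behind the growing clusters seen in the simulation snapshots.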
Part VII. Neural Networks and Learning

Supervised Learning
• Produce desired outputs for training inputs
• Generalize reasonably & appropriately to other inputs
• Good example: pattern recognition
• Method: feedforward multilayer networks

Feedforward Network
[Figure: input layer, hidden layers, and output layer, with connections feeding forward]

Typical Artificial Neuron
[Figure: inputs, connection weights, threshold, net input (local field), activation function, output]

Equations
Net input: $h_i = \left( \sum_{j=1}^{n} w_{ij} s_j \right) - \theta$, or in vector form $\mathbf{h} = \mathbf{W}\mathbf{s} - \boldsymbol{\theta}$
Neuron output: $s'_i = \sigma(h_i)$, or in vector form $\mathbf{s}' = \sigma(\mathbf{h})$

Treating Threshold as Weight
$h = \left( \sum_{j=1}^{n} w_j x_j \right) - \theta = -\theta + \sum_{j=1}^{n} w_j x_j$
Let $x_0 = -1$ and $w_0 = \theta$. Then
$h = w_0 x_0 + \sum_{j=1}^{n} w_j x_j = \sum_{j=0}^{n} w_j x_j = \tilde{\mathbf{w}} \cdot \tilde{\mathbf{x}}$

Credit Assignment Problem
• How do we adjust the weights of the hidden layers?
[Figure: multilayer network with input layer, hidden layers, output layer, and desired output]

NetLogo Demonstration of Back-Propagation Learning
• Run Artificial Neural Net.nlogo

Adaptive System
[Figure: system S with evaluation function F (fitness, figure of merit), control parameters P_1 … P_k … P_m, and control algorithm C]

Gradient
$\partial F / \partial P_k$ measures how $F$ is altered by variation of $P_k$
$\nabla F = \left( \frac{\partial F}{\partial P_1}, \ldots, \frac{\partial F}{\partial P_k}, \ldots, \frac{\partial F}{\partial P_m} \right)^{\mathrm{T}}$
$\nabla F$ points in the direction of maximum increase in $F$
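The neuron equations and the threshold-as-weight trick above can be sketched in a few lines. The logistic activation, the particular weights, and the input values below are illustrative assumptions; the point is that folding the threshold in as $w_0 = \theta$ on a constant input $x_0 = -1$ gives exactly the same net input.

```python
import math

def sigma(h):
    # logistic function, one common choice of activation function
    return 1.0 / (1.0 + math.exp(-h))

def neuron(w, x, theta):
    # net input h = (sum_j w_j x_j) - theta, output s' = sigma(h)
    h = sum(wj * xj for wj, xj in zip(w, x)) - theta
    return sigma(h)

def neuron_bias_as_weight(w, x, theta):
    # prepend x_0 = -1 and w_0 = theta: same net input, no explicit threshold
    w_tilde = [theta] + list(w)
    x_tilde = [-1.0] + list(x)
    h = sum(wj * xj for wj, xj in zip(w_tilde, x_tilde))
    return sigma(h)
```

Both functions return identical outputs for any weights and threshold, which is why learning rules can treat the threshold as just another weight to be adjusted.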
Gradient Ascent on Fitness Surface
[Figure: fitness surface with ∇F and a continuous gradient-ascent trajectory]

Gradient Ascent by Discrete Steps
[Figure: fitness surface with discrete steps taken along ∇F]

Gradient Ascent is Local But Not Shortest
[Figure: gradient-ascent path compared with the direct path to the peak]

Gradient Ascent Process
$\dot{\mathbf{P}} = \eta \, \nabla F(\mathbf{P})$
Change in fitness:
$\dot{F} = \frac{dF}{dt} = \sum_{k=1}^{m} \frac{\partial F}{\partial P_k} \frac{d P_k}{d t} = \sum_{k=1}^{m} (\nabla F)_k \dot{P}_k$
$\dot{F} = \nabla F \cdot \dot{\mathbf{P}}$
$\dot{F} = \nabla F \cdot \eta \nabla F = \eta \, \| \nabla F \|^2 \ge 0$
Therefore gradient ascent increases fitness (until it reaches a point of zero gradient).

General Ascent in Fitness
Note that any adaptive process $\mathbf{P}(t)$ will increase fitness provided:
$0 < \dot{F} = \nabla F \cdot \dot{\mathbf{P}} = \| \nabla F \| \, \| \dot{\mathbf{P}} \| \cos \varphi$
where $\varphi$ is the angle between $\nabla F$ and $\dot{\mathbf{P}}$.
Hence we need $\cos \varphi > 0$, i.e. $\varphi < 90°$.

General Ascent on Fitness Surface
[Figure: fitness surface with ascent directions lying within 90° of ∇F]

Fitness as Minimum Error
Suppose for $Q$ different inputs we have target outputs $\mathbf{t}^1, \ldots, \mathbf{t}^Q$.
Suppose for parameters $\mathbf{P}$ the corresponding actual outputs are $\mathbf{y}^1, \ldots, \mathbf{y}^Q$.
Suppose $D(\mathbf{t}, \mathbf{y}) \in [0, \infty)$ measures the difference between target & actual outputs.
Let $E^q = D(\mathbf{t}^q, \mathbf{y}^q)$ be the error on the $q$th sample.
Let $F(\mathbf{P}) = -\sum_{q=1}^{Q} E^q(\mathbf{P}) = -\sum_{q=1}^{Q} D[\mathbf{t}^q, \mathbf{y}^q(\mathbf{P})]$.

Gradient of Fitness
$\nabla F = \nabla \left( -\sum_q E^q \right) = -\sum_q \nabla E^q$
$\frac{\partial E^q}{\partial P_k} = \frac{\partial}{\partial P_k} D(\mathbf{t}^q, \mathbf{y}^q) = \sum_j \frac{\partial D(\mathbf{t}^q, \mathbf{y}^q)}{\partial y_j^q} \frac{\partial y_j^q}{\partial P_k} = \frac{d D(\mathbf{t}^q, \mathbf{y}^q)}{d \mathbf{y}^q} \cdot \frac{\partial \mathbf{y}^q}{\partial P_k} = \nabla_{\mathbf{y}^q} D(\mathbf{t}^q, \mathbf{y}^q) \cdot \frac{\partial \mathbf{y}^q}{\partial P_k}$

Jacobian Matrix
Define the Jacobian matrix $J^q = \begin{pmatrix} \frac{\partial y_1^q}{\partial P_1} & \cdots & \frac{\partial y_1^q}{\partial P_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_n^q}{\partial P_1} & \cdots & \frac{\partial y_n^q}{\partial P_m} \end{pmatrix}$
Note $J^q \in \mathbb{R}^{n \times m}$ and $\nabla D(\mathbf{t}^q, \mathbf{y}^q) \in \mathbb{R}^{n \times 1}$.
Since $(\nabla E^q)_k = \frac{\partial E^q}{\partial P_k} = \sum_j \frac{\partial D(\mathbf{t}^q, \mathbf{y}^q)}{\partial y_j^q} \frac{\partial y_j^q}{\partial P_k}$,
$\nabla E^q = (J^q)^{\mathrm{T}} \, \nabla D(\mathbf{t}^q, \mathbf{y}^q)$

Derivative of Squared Euclidean Distance
Suppose $D(\mathbf{t}, \mathbf{y}) = \| \mathbf{t} - \mathbf{y} \|^2 = \sum_i (t_i - y_i)^2$
$\frac{\partial D(\mathbf{t}, \mathbf{y})}{\partial y_j} = \frac{\partial}{\partial y_j} \sum_i (t_i - y_i)^2 = \sum_i \frac{d (t_i - y_i)^2}{d y_j} = -2 (t_j - y_j)$
$\frac{d D(\mathbf{t}, \mathbf{y})}{d \mathbf{y}} = 2 (\mathbf{y} - \mathbf{t})$
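Gradient ascent by discrete steps, as described above, can be sketched on a toy problem. The fitness function here is an assumed example, $F(P) = -(P_0 - 1)^2 - (P_1 + 2)^2$, chosen because its gradient is available in closed form; the step size $\eta$ is likewise illustrative.

```python
def fitness(p):
    # assumed toy fitness surface with a single peak at (1, -2)
    return -(p[0] - 1.0) ** 2 - (p[1] + 2.0) ** 2

def grad_fitness(p):
    # closed-form gradient of the toy fitness
    return [-2.0 * (p[0] - 1.0), -2.0 * (p[1] + 2.0)]

def ascend(p, eta=0.1, steps=100):
    """Discrete gradient ascent: P <- P + eta * grad F(P)."""
    history = [fitness(p)]
    for _ in range(steps):
        g = grad_fitness(p)
        p = [pk + eta * gk for pk, gk in zip(p, g)]
        history.append(fitness(p))
    return p, history

p_final, hist = ascend([5.0, 5.0])
# for a sufficiently small eta, fitness never decreases along the trajectory,
# mirroring the continuous-time result F-dot = eta * ||grad F||^2 >= 0
```

For too large a step size the discrete version can overshoot and oscillate, which is one way the discrete steps differ from the continuous process $\dot{\mathbf{P}} = \eta \nabla F$.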
Gradient of Error on qth Input
$\frac{\partial E^q}{\partial P_k} = \frac{d D(\mathbf{t}^q, \mathbf{y}^q)}{d \mathbf{y}^q} \cdot \frac{\partial \mathbf{y}^q}{\partial P_k} = 2 (\mathbf{y}^q - \mathbf{t}^q) \cdot \frac{\partial \mathbf{y}^q}{\partial P_k} = 2 \sum_j (y_j^q - t_j^q) \frac{\partial y_j^q}{\partial P_k}$
$\nabla E^q = 2 (J^q)^{\mathrm{T}} (\mathbf{y}^q - \mathbf{t}^q)$

Recap
$\dot{\mathbf{P}} = \eta \sum_q (J^q)^{\mathrm{T}} (\mathbf{t}^q - \mathbf{y}^q)$
To know how to decrease the differences between actual & desired outputs, we need to know the elements of the Jacobian, $\partial y_j^q / \partial P_k$, which says how the $j$th output varies with the $k$th parameter (given the $q$th input).
The Jacobian depends on the specific form of the system, in this case a feedforward neural network.
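The error-gradient formula $\nabla E^q = 2 (J^q)^{\mathrm{T}} (\mathbf{y}^q - \mathbf{t}^q)$ can be checked numerically on a tiny system whose Jacobian is known exactly. The model below is an assumed 2-input, 2-output linear map $\mathbf{y} = W \mathbf{x}$ (parameters = the four entries of $W$), not the feedforward network itself; it simply makes the Jacobian easy to write down.

```python
def model(params, x):
    # y_i = sum_j W[i][j] * x[j], with the 2x2 matrix W read row-major from params
    return [params[0] * x[0] + params[1] * x[1],
            params[2] * x[0] + params[3] * x[1]]

def jacobian(params, x):
    # J[i][k] = d y_i / d params_k; for the linear model this depends only on x
    return [[x[0], x[1], 0.0, 0.0],
            [0.0, 0.0, x[0], x[1]]]

def grad_error(params, x, t):
    # grad E = 2 * J^T (y - t); component k = 2 * sum_i J[i][k] * (y_i - t_i)
    y = model(params, x)
    J = jacobian(params, x)
    return [2.0 * sum(J[i][k] * (y[i] - t[i]) for i in range(2))
            for k in range(4)]
```

Comparing `grad_error` against a finite-difference estimate of $\partial E / \partial P_k$ confirms the formula; for a neural network the Jacobian entries are what the back-propagation algorithm computes.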