Pattern Recognition using Hebbian Learning and Floating-Gates

Certain pattern recognition problems have been shown to be easily solved by artificial neural networks, and many neural network chips have been made and sold. Most of these are not terribly biologically realistic.

[Figure: a feedforward network with input layer neurons, a weight matrix, hidden layer neurons, a second weight matrix, and output layer neurons.]

A 2-dimensional example…

[Figure: a single unit with inputs x and y (each ranging over -10 to 10) and weights w1 = 0.3, w2 = 0.7. One panel shows the summation of weighted inputs over the X-Y plane; the other shows the resulting +1/-1 output decision space.]

[Figure: the same unit with weights w1 = 0.5, w2 = 0.11. One panel shows the weighted summation over the input space; the other shows the +1/-1 outputs over the input space.]

Putting the two together…

[Figure: the two units are combined by a second-stage unit (values of -0.53, 0.53, and -0.25 appear on the slide), so that the +1/-1 output responds only to a smaller region of this 2-D input space.]

So in general, we can apply this type of operation to an N-dimensional input, with the hidden units defining hyperplanes in this input space. The individual output units combine these hyperplanes to create specific subregions of this N-dimensional space. This is what pattern recognition is about.

100 x 77 pixels = 7700-dimensional input space. As you might expect, these two images live very far apart from each other in this very high-dimensional space.

[Figure: "Easy Task", showing the two example images, labeled unit1 and unit2.]

But if we had a set of 100 faces that we wanted to recognize, this might be harder. What happens if the faces are rotated, shifted, or scaled? How do I pick the weight matrices to solve these tasks? One way is to present inputs and adjust the weights if the output is not what we want:

w_{ik}^{new} = w_{ik}^{old} + \Delta w_{ik},
\qquad
\Delta w_{ik} = \begin{cases} 2\eta\, T_i^{\mu}\, \xi_k^{\mu} & \text{if } O_i^{\mu} \neq T_i^{\mu} \\ 0 & \text{otherwise} \end{cases}

where \xi_k^{\mu} is input k of example \mu, O_i^{\mu} is the output of unit i for example \mu, T_i^{\mu} = \pm 1 is the target output for unit i on example \mu, and \eta is the learning rate.

This is known as the perceptron learning rule. A training set of examples with target output values is defined and presented one by one, adjusting the weight matrix after each evaluation. The learning rule assigns large weights to the components of the inputs that allow discrimination between the classes of inputs, e.g., many faces and many helicopters.

Face vs. Helicopter Example

[Figure: responses of the two output units across the training examples. Red = unit 1 response to face, blue = unit 1 response to helicopter, green = unit 2 response to face, black = unit 2 response to helicopter.]

Associative Memory and Energy Functions

[Figure: a recurrent network with inputs S_i and a weight matrix w_{ij}.]

The Hopfield Recurrent Network

The concept of an energy function of a recurrent neural network was introduced by Hopfield (1982) to describe the state of a network. By studying the dynamics, it is possible to show that the activity in the network will always decrease in energy, evolving towards a local minimum:

H = -\frac{1}{2} \sum_{ij} w_{ij} S_i S_j

The network defines an "energy landscape" in which the state of the network settles. By starting closer to a minimum (a stored pattern) than to other points in the landscape, the network will settle towards that minimum and "recall" the stored pattern. This view of neural processing has its merits, provides insight into this type of computational structure, and has spawned new fields of its own, but it does not describe the current neurobiological state of knowledge very well.
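To make the energy picture concrete, here is a minimal numerical sketch of a Hopfield network in Python. It is illustrative only: the Hebbian outer-product storage rule, the network size, and all names are our assumptions for the sketch, not details given in the lecture.

```python
import numpy as np

# Minimal Hopfield-network sketch. States S_i are +/-1, the weight matrix is
# symmetric with a zero diagonal, and the energy is H = -1/2 * sum_ij w_ij S_i S_j.

def store_patterns(patterns):
    """Hebbian outer-product rule: w_ij = sum over patterns of xi_i * xi_j, diagonal zeroed."""
    n = patterns.shape[1]
    w = np.zeros((n, n))
    for xi in patterns:
        w += np.outer(xi, xi)
    np.fill_diagonal(w, 0.0)
    return w

def energy(w, s):
    """H = -1/2 * S^T W S for the current state S."""
    return -0.5 * s @ w @ s

def recall(w, s, steps=2000, seed=0):
    """Asynchronous updates: each flip either lowers the energy or leaves it unchanged."""
    rng = np.random.default_rng(seed)
    s = s.copy()
    for _ in range(steps):
        i = rng.integers(len(s))
        s[i] = 1 if w[i] @ s >= 0 else -1
    return s

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    stored = rng.choice([-1, 1], size=(3, 100))   # three random +/-1 patterns
    w = store_patterns(stored)

    noisy = stored[0].copy()
    noisy[:20] *= -1                              # corrupt 20 of the 100 bits
    recovered = recall(w, noisy)

    print("energy before:", energy(w, noisy), " after:", energy(w, recovered))
    print("overlap with stored pattern:", int(recovered @ stored[0]))
```

Because the weights are symmetric with a zero diagonal, each asynchronous update can only lower H or leave it unchanged, which is why a state started near a stored pattern settles into that minimum and recalls the pattern.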
In particular, neurons communicate with spikes, and the backpropagation learning rule is not a good match to what has been found. So what do we know about neurobiological learning?

Hebbian learning: if both cells are active, strengthen the synapse; if only the post-synaptic cell is active, weaken the synapse.

In fact, learning at some synapses seems to be even more specific: temporal ordering seems to play a role in determining the change in the synapse.

[Figure: \Delta w as a function of the time between pre-synaptic and post-synaptic spikes, with one spike ordering strengthening the synapse and the other weakening it (Abbott and Blum, 1996).]

Chip Idea:
1. Design a spiking neural network that can learn, using the spike-timing rule, to solve a particular temporal pattern recognition problem.
2. Design a floating-gate modification circuit that can implement the learning rule.
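For reference, here is a minimal software sketch of the kind of spike-timing-dependent update the chip would need to implement. The exponential window shape and all constants (A_PLUS, A_MINUS, the time constants, and the example spike times) are illustrative assumptions, not values taken from the lecture or from Abbott and Blum (1996).

```python
import numpy as np

# Sketch of a spike-timing-dependent plasticity (STDP) rule.
# The exponential window and every constant below are illustrative assumptions.

A_PLUS, A_MINUS = 0.01, 0.012      # strengthening / weakening amplitudes (assumed)
TAU_PLUS, TAU_MINUS = 20.0, 20.0   # time constants in ms (assumed)

def delta_w(t_pre, t_post):
    """Weight change for one pre/post spike pair.

    dt > 0: pre-synaptic spike precedes post-synaptic spike -> strengthen.
    dt < 0: post-synaptic spike precedes pre-synaptic spike -> weaken.
    """
    dt = t_post - t_pre
    if dt >= 0:
        return A_PLUS * np.exp(-dt / TAU_PLUS)
    return -A_MINUS * np.exp(dt / TAU_MINUS)

def apply_stdp(w, pre_spikes, post_spikes, w_min=0.0, w_max=1.0):
    """Accumulate the pairwise updates and clip the weight to its allowed range."""
    for t_pre in pre_spikes:
        for t_post in post_spikes:
            w += delta_w(t_pre, t_post)
    return float(np.clip(w, w_min, w_max))

if __name__ == "__main__":
    # Pre-synaptic spikes that consistently precede the post-synaptic ones -> net strengthening.
    w = apply_stdp(0.5, pre_spikes=[10.0, 30.0, 50.0], post_spikes=[15.0, 35.0, 55.0])
    print("updated weight:", w)
```

On the chip, the stored weight would correspond to charge on a floating gate, with the modification circuit adding or removing charge according to the same timing-dependent rule.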