Unsupervised Learning

So far, we have only looked at supervised learning, in which an external teacher improves network performance by comparing desired and actual outputs and modifying the synaptic weights accordingly. However, most of the learning that takes place in our brains is completely unsupervised. This type of learning aims at achieving the most efficient representation of the input space, regardless of any output space. Unsupervised learning can also be useful in artificial neural networks.

November 9, 2010   Neural Networks   Lecture 16: Counterpropagation

Applications of unsupervised learning include:
• Clustering
• Vector quantization
• Data compression
• Feature extraction

Unsupervised learning methods can also be combined with supervised ones to enable learning through input-output pairs, as in the BPN. One such hybrid approach is the counterpropagation network.

Unsupervised/Supervised Learning: The Counterpropagation Network

The counterpropagation network (CPN) is a fast-learning combination of unsupervised and supervised learning. Although this network uses linear neurons, it can learn nonlinear functions by means of a hidden layer of competitive units. Moreover, the network is able to learn a function and its inverse at the same time. However, to simplify things, we will only consider the feedforward mechanism of the CPN.
Distance/Similarity Functions

In the hidden layer, the neuron whose weight vector is most similar to the current input vector is the "winner." There are different ways of defining such maximal similarity, for example:

(1) Maximal cosine similarity (same as net input):
    s(w, x) = w \cdot x

(2) Minimal Euclidean distance:
    d(w, x) = \sum_i (w_i - x_i)^2
(no square root is necessary for determining the winner)

The Counterpropagation Network

A simple CPN with two input neurons, three hidden neurons, and two output neurons can be described as follows:

[Figure: a three-layer network with input layer (X1, X2), hidden layer (H1, H2, H3) with weights w^H_{ij}, and output layer (Y1, Y2) with weights w^O_{ij}.]

The CPN learning process (general form for n input units and m output units):

1. Randomly select a vector pair (x, y) from the training set.
2. If you use the cosine similarity function, normalize (shrink/expand to "length" 1) the input vector x by dividing every component of x by the magnitude ||x||, where
       ||x|| = \sqrt{\sum_{j=1}^{n} x_j^2}
3. Initialize the input neurons with the resulting vector and compute the activation of the hidden-layer units according to the chosen similarity measure.
4. In the hidden (competitive) layer, determine the unit W with the largest activation (the winner).
5. Adjust the connection weights between W and all n input-layer units according to the formula (with learning rate α):
       w^H_{Wn}(t+1) = w^H_{Wn}(t) + α (x_n - w^H_{Wn}(t))
6. Repeat steps 1 to 5 until all training patterns have been processed once.
7. Repeat step 6 until each input pattern is consistently associated with the same competitive unit.
8.
Select the first vector pair in the training set (the current pattern).
9. Repeat steps 2 to 4 (normalization, competition) for the current pattern.
10. Adjust the connection weights between the winning hidden-layer unit W and all m output-layer units according to the equation (with learning rate β):
       w^O_{mW}(t+1) = w^O_{mW}(t) + β (y_m - w^O_{mW}(t))
11. Repeat steps 9 and 10 for each vector pair in the training set.
12. Repeat steps 8 through 11 for several epochs.

Because in our example network the input is two-dimensional, each unit in the hidden layer has two weights (one for each input connection). Therefore, the input to the network as well as the weights of the hidden-layer units can be represented and visualized as two-dimensional vectors. For the current network, all weights in the hidden layer can be completely described by three 2D vectors.

Counterpropagation – Cosine Similarity

[Figure: a sample state of the hidden layer, showing the three weight vectors (w^H_{11}, w^H_{12}), (w^H_{21}, w^H_{22}), (w^H_{31}, w^H_{32}) together with a sample input vector (x_1, x_2).]

In this example, hidden-layer neuron H2 wins and, according to the learning rule, is moved closer towards the current input vector.

[Figure: the weight vector of H2 rotated towards (x_1, x_2), yielding the new vector (w^H_{21,new}, w^H_{22,new}).]

After doing this through many epochs and slowly reducing the adaptation step size α, each hidden-layer unit will win for a subset of inputs, and the angle of its weight vector will be in the center of gravity of the angles of these inputs.
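The training procedure above (steps 1 to 12) can be sketched in code. The following is a minimal NumPy sketch, not part of the lecture: the function names and default learning-rate values are illustrative assumptions, and a decreasing α/β schedule is left to the caller.

```python
import numpy as np

def winner_cosine(W, x):
    """Steps 3-4, cosine variant: winner = unit with maximal net input w . x
    (the input x is assumed to be normalized to length 1)."""
    return int(np.argmax(W @ x))

def winner_euclidean(W, x):
    """Steps 3-4, Euclidean variant: winner = unit with minimal squared
    distance; no square root is needed to find the minimum."""
    return int(np.argmin(((W - x) ** 2).sum(axis=1)))

def update_hidden(W_h, x, winner, alpha=0.1):
    """Step 5: move only the winner's weight vector towards the input x."""
    W_h[winner] += alpha * (x - W_h[winner])

def update_output(W_o, y, winner, beta=0.1):
    """Step 10: move the output weights attached to the winning hidden unit
    towards the desired output y. W_o has shape (m_outputs, n_hidden)."""
    W_o[:, winner] += beta * (y - W_o[:, winner])
```

Both update rules have the same form: the affected weights are pulled a fraction (α or β) of the way towards their target vector, which is why slowly reducing the step size lets them settle at a center of gravity.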
[Figure: all input vectors in the training set, with each hidden-layer weight vector (w^H_{11}, w^H_{12}), (w^H_{21}, w^H_{22}), (w^H_{31}, w^H_{32}) pointing at the center of gravity of the angles of its associated inputs.]

Counterpropagation – Euclidean Distance

[Figure sequence: example of competitive learning with three hidden neurons. The training samples form three clusters (+, x, o); step by step, and possibly with reduction of the learning rate, the three weight vectors 1, 2, and 3 move towards the centers of their respective clusters.]

The Counterpropagation Network

After the first phase of the training, each hidden-layer neuron is associated with a subset of input vectors. The training process minimized the average angle difference or Euclidean distance between the weight vectors and their associated input vectors.

In the second phase of the training, we adjust the weights in the network's output layer in such a way that, for any winning hidden-layer unit, the network's output is as close as possible to the desired output for the winning unit's associated input vectors. The idea is that when we later use the network to compute functions, the output of the winning hidden-layer unit is 1, and the outputs of all other hidden-layer units are 0.
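Because the winner outputs 1 and all other hidden units output 0, recall reduces to selecting one column of the output weight matrix. A possible sketch (Euclidean competition; the function name is illustrative, not from the lecture):

```python
import numpy as np

def cpn_output(W_h, W_o, x):
    """Feedforward recall: the winning hidden unit outputs 1, all others 0,
    so the network output is simply the winner's column of output weights."""
    winner = int(np.argmin(((W_h - x) ** 2).sum(axis=1)))  # Euclidean competition
    return W_o[:, winner]
```

Note that the recall step never mixes output vectors: the network's output is always exactly one of the stored output weight vectors, which is what makes the CPN a vector quantizer.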
Counterpropagation – Cosine Similarity

Because there are two output neurons, the weights in the output layer that receive input from the same hidden-layer unit can also be described by 2D vectors. These weight vectors are the only possible output vectors of our network.

[Figure: the three output weight vectors: (w^O_{11}, w^O_{21}) is the network output if H1 wins, (w^O_{12}, w^O_{22}) if H2 wins, and (w^O_{13}, w^O_{23}) if H3 wins.]

For each input vector, the output-layer weights that are connected to the winning hidden-layer unit are made more similar to the desired output vector:

[Figure: the weight vector (w^O_{11}, w^O_{21}) moved towards the desired output (y_1, y_2), yielding (w^O_{11,new}, w^O_{21,new}).]

The training proceeds with decreasing step size β, and after its termination, the weight vectors are in the centers of gravity of their associated output vectors:

[Figure: the output weight vectors associated with H1, H2, and H3, each at the center of gravity of its desired outputs.]

Counterpropagation – Euclidean Distance

At the end of the output-layer learning process, the outputs of the network are at the center of gravity of the desired outputs of the winner neuron.

[Figure: for each of the three clusters (+, x, o), the network output lies at the center of gravity of that cluster's desired outputs.]

The Counterpropagation Network

Notice:
• In the first training phase, if a hidden-layer unit does not win for a long period of time, its weights should be set to random values to give that unit a chance to win subsequently.
• It is useful to reduce the learning rates α and β during training.
• There is no need to normalize the training output vectors.
• After the training has finished, the network maps the training inputs onto output vectors that are close to the desired ones.
• The more hidden units, the better the mapping; however, the generalization ability may decrease.
• Thanks to the competitive neurons in the hidden layer, even linear neurons can realize nonlinear mappings.
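Putting the two training phases together, a compact end-to-end sketch might look as follows (Euclidean variant; the data, layer sizes, constant learning rates, and epoch count are illustrative assumptions, and the decreasing-rate schedule recommended above is omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)

def train_cpn(X, Y, n_hidden, alpha=0.5, beta=0.5, epochs=50):
    """Two-phase CPN training, Euclidean variant (a sketch).
    X: (samples, n) input vectors; Y: (samples, m) desired output vectors."""
    m = Y.shape[1]
    # Initialize hidden weights from randomly chosen training inputs so that
    # no unit starts far away from all data (mitigates "dead" units).
    W_h = X[rng.choice(len(X), size=n_hidden, replace=False)].copy()
    W_o = np.zeros((m, n_hidden))
    # Phase 1 (steps 1-7): competitive learning in the hidden layer.
    for _ in range(epochs):
        for x in X:
            w = int(np.argmin(((W_h - x) ** 2).sum(axis=1)))
            W_h[w] += alpha * (x - W_h[w])
    # Phase 2 (steps 8-12): supervised learning in the output layer.
    for _ in range(epochs):
        for x, y in zip(X, Y):
            w = int(np.argmin(((W_h - x) ** 2).sum(axis=1)))
            W_o[:, w] += beta * (y - W_o[:, w])
    return W_h, W_o

def recall(W_h, W_o, x):
    """The winner outputs 1, all other hidden units 0, so the network
    output is the winner's column of output weights."""
    w = int(np.argmin(((W_h - x) ** 2).sum(axis=1)))
    return W_o[:, w]
```

For example, with two well-separated input clusters whose desired outputs are 0 and 1, each cluster is captured by one hidden unit in phase 1, and phase 2 then drives that unit's output weight to the cluster's desired output.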