G.Anuradha
• Introduction
• Competitive Learning networks
• Kohonen self-organizing networks
• Learning vector quantization
• Hebbian learning
• When no external teacher or critic’s information is available, only input vectors are used for learning
• Learning without supervision is called unsupervised learning
• Learning evolves to extract features or regularities in presented patterns, without being told about the output classes
• Frequently employed in data clustering, feature extraction, similarity detection
• Unsupervised learning NNs learn to respond to different input patterns with different parts of the network
• They are trained to strengthen their firing in response to frequently occurring patterns
• Based on this concept we have competitive learning and Kohonen self-organizing feature maps
• Weights in this case are updated based on the input pattern alone.
• The no. of inputs is the input dimension, while the no. of outputs is equal to the number of clusters that the data are to be divided into.
• Cluster center position=weight vector connected to the corresponding output unit.
• The learning algorithm used in most of these nets is known as Kohonen learning
• The units update their weights by forming a new weight vector, which is a linear combination of the old weight vector and the new input vector.
• The weight update formula for the winning output (cluster) unit k is given by w_k(t+1) = w_k(t) + η(x(t) − w_k(t))
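A minimal NumPy sketch of this update rule (the function and variable names are illustrative, not from the slides):

```python
import numpy as np

def kohonen_update(w_k, x, eta):
    """One Kohonen learning step: move the winning weight vector
    w_k(t) a fraction eta of the way toward the input x(t)."""
    return w_k + eta * (x - w_k)

# Example: the cluster center drifts toward the input.
w = np.array([0.2, 0.8])
x = np.array([1.0, 0.0])
w = kohonen_update(w, x, eta=0.5)   # -> [0.6, 0.4]
```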
• There are 2 methods to determine the winner of the network during competition
– Euclidean distance method
– Dot product method
• Euclidean distance method:-
– the square of Euclidean distance between the input vector and the weight vector is computed
– The unit whose weight vector is at the smallest Euclidean distance from the input vector is chosen as the winner
• The dot product method:-
– The dot product between the input vector and weight vector is computed
– a_j = Σ (i = 1 to n) x_i w_ij, i.e., the activation of output unit j is the dot product x · w_j
– The output unit with the highest activation is selected for further processing (competitive)
– The weights of this unit alone are updated (winner-take-all)
– In this case the weight update is given by w_k(t+1) = (w_k(t) + η(x(t) − w_k(t))) / ||w_k(t) + η(x(t) − w_k(t))||, which keeps the winning weight vector normalized to unit length
• With normalized vectors, finding the minimum of ||x − w_i|| is nothing but finding the maximum among the m scalar products x · w_i, so both methods select the same winner
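A small sketch contrasting the two winner-selection methods; with unit-length inputs and weights they pick the same unit (names are illustrative, not from the slides):

```python
import numpy as np

def winner_euclidean(x, W):
    """Index of the weight vector closest to x (smallest squared
    Euclidean distance). W holds one weight vector per row."""
    return np.argmin(((W - x) ** 2).sum(axis=1))

def winner_dot(x, W):
    """Index of the weight vector with the largest activation
    a_j = x . w_j; equivalent to the Euclidean winner when x and
    the rows of W are normalized to unit length."""
    return np.argmax(W @ x)

x = np.array([0.6, 0.8])                 # unit-length input
W = np.array([[1.0, 0.0], [0.0, 1.0]])   # unit-length weights
assert winner_euclidean(x, W) == winner_dot(x, W)  # both pick index 1
```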
• Competitive learning network performs an on-line clustering process on the input patterns
• When process is complete the input data is divided into disjoint clusters
• With the Euclidean distance measure, the update formula is actually an online gradient descent step that minimizes the squared-distance objective function
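One way to see this (a standard derivation, not spelled out in the slides): take the per-sample objective E = ½||x − w_k||², where w_k is the winning center. Then ∂E/∂w_k = −(x − w_k), so the gradient descent step Δw_k = −η ∂E/∂w_k = η(x − w_k) is exactly the update rule given earlier.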
(Figure: the dots represent the input vectors and the crosses denote the weight vectors of the four output units; as learning continues, the four weight vectors rotate toward the centers of the four input clusters.)
• If weights are initialized to random values, a weight vector may lie far from every input vector and never get updated (a dead unit)
– Can be prevented by initializing the weights to samples from the input data itself, thereby ensuring that all weights get updated when all the input patterns are presented
– Or else the weights of winning as well as losing neurons can be updated, using a significantly smaller learning rate for the losers. This is called leaky learning (see the sketch after this list)
– Note: Decreasing η over time is generally desired. A large initial η explores the data space widely; progressively smaller values later refine the weights. This is similar to the cooling schedule in simulated annealing.
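A sketch of leaky learning as described above; the two learning rates are illustrative values, not from the slides:

```python
import numpy as np

def leaky_update(W, x, winner, eta_win=0.1, eta_leak=0.001):
    """Leaky learning: the winner moves with the full learning rate,
    every losing unit moves with a much smaller one, so no unit
    stays 'dead' forever."""
    W = W.copy()
    for j in range(len(W)):
        rate = eta_win if j == winner else eta_leak
        W[j] += rate * (x - W[j])
    return W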
• Lacks the capability to add new clusters when deemed necessary
• If η is constant, the cluster centers never stabilize
• If η decreases with time, it may become too small to update the cluster centers when new data arrive
• This is called the stability-plasticity dilemma (solved using adaptive resonance theory (ART))
• If the output units are arranged in the form of a vector or matrix, then the weights of the winner as well as its neighbouring losers can be updated (Kohonen feature maps)
• After learning, the input space is divided into a number of disjoint clusters. The cluster centers are known as templates or the code book
• For any input pattern presented, we can use the appropriate code book vector (vector quantization)
• This vector quantization is used in data compression in IP and communication systems.
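A hypothetical sketch of how a learned code book could be used this way, encoding vectors as code-book indices and decoding them back to cluster centers (names are illustrative):

```python
import numpy as np

def vq_encode(data, codebook):
    """Replace each vector by the index of its nearest code book
    entry; transmitting indices instead of vectors compresses data."""
    return np.array([np.argmin(((codebook - v) ** 2).sum(axis=1))
                     for v in data])

def vq_decode(indices, codebook):
    """Reconstruct each vector as its code book (cluster center) entry."""
    return codebook[indices]
```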
• Also known as Kohonen feature maps or topology-preserving maps
• The learning procedure of Kohonen feature maps is similar to that of competitive learning networks.
• Similarity (dissimilarity) measure is selected and the winning unit is considered to be the one with the largest (smallest) activation
• The weights of the winning neuron as well as the neighborhood around the winning units are adjusted.
• Neighborhood size decreases slowly with every iteration.
1. Select the winning output unit as the one with the largest similarity (or smallest dissimilarity) measure between x and all w_i. With the Euclidean distance measure, the winning unit c satisfies ||x − w_c|| = min_i ||x − w_i||, where the index c refers to the winning unit
2. Let NB_c denote the set of indices corresponding to a neighborhood around winner c. The weights of the winner and its neighboring units are updated by Δw_i = η(x − w_i), i ∈ NB_c
• A neighborhood function around the winning unit can be used instead of a fixed neighborhood set
• A Gaussian function can be used as the neighborhood function:
Ω_c(i) = exp(−||p_i − p_c||² / (2σ²))
where p_i and p_c are the positions of the output units i and c respectively, and σ reflects the scope of the neighborhood.
The update formula using the neighborhood function is given by Δw_i = η Ω_c(i)(x − w_i)
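A minimal sketch of one such update step, assuming W holds one weight vector per output unit and P holds the grid positions p_i (names are illustrative, not from the slides):

```python
import numpy as np

def som_step(W, P, x, eta, sigma):
    """One SOM update: find the winner c, then move every unit i
    toward x, weighted by the Gaussian neighborhood Omega_c(i)."""
    c = np.argmin(((W - x) ** 2).sum(axis=1))              # winning unit
    omega = np.exp(-((P - P[c]) ** 2).sum(axis=1) / (2 * sigma ** 2))
    return W + eta * omega[:, None] * (x - W)
```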
Flowchart of SOM training (summarized):
1. Start: initialize the weights, the learning rate, and the topological neighborhood parameters.
2. For each input vector x, compute D(j) = Σ_i (x_i − w_ij)² for j = 1 to m.
3. Compute the winning unit index J such that D(J) is minimum.
4. Update the weights of the winning unit and its neighborhood; continue with the next input.
5. After all inputs are presented, reduce the learning rate and reduce the radius of the topological neighborhood.
6. Test the stopping condition: if the learning rate η(t+1) has been reduced to a negligible value, stop; otherwise continue with the next epoch.
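A toy end-to-end trainer following the flowchart above; the grid size, decay schedules, and constants are illustrative assumptions, not values from the slides:

```python
import numpy as np

def train_som(X, grid_w=4, grid_h=4, epochs=20,
              eta0=0.5, sigma0=2.0, seed=0):
    """Toy SOM trainer: weights are initialized from data samples
    (avoiding dead units), and both the learning rate and the
    neighborhood radius shrink after every epoch."""
    rng = np.random.default_rng(seed)
    m = grid_w * grid_h
    W = X[rng.choice(len(X), m)]                 # init from the data itself
    P = np.array([(i, j) for i in range(grid_h)
                  for j in range(grid_w)], float)
    for t in range(epochs):
        eta = eta0 * (1 - t / epochs)            # decaying learning rate
        sigma = sigma0 * (1 - t / epochs) + 0.5  # shrinking radius
        for x in X[rng.permutation(len(X))]:
            c = np.argmin(((W - x) ** 2).sum(axis=1))
            omega = np.exp(-((P - P[c]) ** 2).sum(axis=1)
                           / (2 * sigma ** 2))
            W += eta * omega[:, None] * (x - W)
    return W
```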
LVQ
• It is an adaptive data classification method based on training data with desired class information
• It is actually a supervised training method but employs unsupervised data-clustering techniques to preprocess the data set and obtain cluster centers
• Resembles a competitive learning network except that each output unit is associated with a class.
• Step 1: Initialize the cluster centers by a clustering method
• Step 2: Label each cluster by the voting method
• Step 3: Randomly select a training input vector x and find k such that ||x − w_k|| is a minimum
• Step 4: If x and w_k belong to the same class, update w_k by Δw_k = η(x − w_k); otherwise update it by Δw_k = −η(x − w_k)
• The parameters used for the training process of an LVQ include the following
– x = training vector (x_1, x_2, …, x_n)
– T = category or class of the training vector x
– w_j = weight vector of the j-th output unit (w_1j, …, w_ij, …, w_nj)
– c_j = cluster or class or category associated with the j-th output unit
– The Euclidean distance of the j-th output unit is D(j) = Σ_i (x_i − w_ij)²
Flowchart of LVQ training (summarized):
1. Start: initialize the weight vectors and the learning rate η.
2. For each input vector x, calculate the winner J such that D(J) = min_j D(j), and input its target class T.
3. If T = c_J, update w_J(new) = w_J(old) + η[x − w_J(old)]; otherwise update w_J(new) = w_J(old) − η[x − w_J(old)].
4. Reduce the learning rate, e.g. η(t+1) = 0.5 η(t).
5. If η has been reduced to a negligible value, stop; otherwise continue with the next input.
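A sketch of LVQ1 training along these lines; the stopping threshold and the halving schedule mirror the flowchart, while the names and constants are illustrative assumptions:

```python
import numpy as np

def lvq_step(W, C, x, t, eta):
    """One LVQ step: find the nearest cluster center w_k, pull it
    toward x if its class label matches the target t, and push it
    away otherwise. W: centers (float), C: class label per center."""
    k = np.argmin(((W - x) ** 2).sum(axis=1))
    sign = 1.0 if C[k] == t else -1.0
    W[k] += sign * eta * (x - W[k])
    return W

def train_lvq(W, C, X, T, eta=0.1):
    """Training loop mirroring the flowchart: present every labeled
    sample, then halve eta, until eta becomes negligible."""
    while eta > 1e-4:
        for x, t in zip(X, T):
            W = lvq_step(W, C, x, t, eta)
        eta *= 0.5
    return W
```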