Unsupervised Learning


G.Anuradha

Contents

• Introduction

• Competitive Learning networks

• Kohonen self-organizing networks

• Learning vector quantization

• Hebbian learning

Introduction

• When no external teacher or critic information is available, only the input vectors are used for learning

• Learning without supervision is called unsupervised learning

• Learning evolves to extract features or regularities in presented patterns, without being told about the output classes

• Frequently employed in data clustering, feature extraction, similarity detection

Introduction

• Unsupervised learning NNs learn to respond to different input patterns with different parts of the network

• The network is trained to strengthen its firing response to frequently occurring input patterns

• Based on this concept we have competitive learning networks and Kohonen self-organizing feature maps

Competitive learning networks

• Weights in this case are updated based on the input pattern alone.

• The number of inputs is the input dimension, while the number of outputs is equal to the number of clusters into which the data are to be divided.

• Each cluster center's position is the weight vector connected to the corresponding output unit.

Learning algorithm

• The learning algorithm used in most of these nets is known as Kohonen learning

• The units update their weights by forming a new weight vector, which is a linear combination of the old weight vector and the new input vector.

• The weight update formula for the winning output cluster unit k is given by

w_k(t+1) = w_k(t) + η(x(t) − w_k(t))

Learning algorithm

• There are 2 methods to determine the winner of the network during competition

– Euclidean distance method

– Dot product method

• Euclidean distance method:-

– The square of the Euclidean distance between the input vector and each weight vector is computed

– The unit whose weight vector is at the smallest Euclidean distance from the input vector is chosen as the winner (see the sketch below)
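A minimal sketch of this winner selection combined with the Kohonen update from the previous slide, written with NumPy (the function name, array shapes and learning rate are illustrative assumptions, not from the slides):

```python
import numpy as np

def kohonen_step(weights, x, eta):
    """One competitive-learning step using the Euclidean distance method.
    weights: (num_clusters, input_dim) cluster centres, x: (input_dim,) input."""
    d = np.sum((weights - x) ** 2, axis=1)   # squared distance to every centre
    k = np.argmin(d)                         # winner = smallest distance
    weights[k] += eta * (x - weights[k])     # w_k(t+1) = w_k(t) + eta*(x - w_k(t))
    return k
```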

Learning algorithm

• The dot product method:-

– The dot product between the input vector and each weight vector is computed

– a_j = Σ_i x_i w_ij

– The output unit with the highest activation is selected for further processing (competitive)

– The weights of this unit are updated (winner-take-all)

– In this case the weight update, which keeps the weight vector at unit length, is given by

w_k(t+1) = (w_k(t) + η(x(t) − w_k(t))) / ||w_k(t) + η(x(t) − w_k(t))||
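A comparable sketch for the dot-product variant, assuming inputs and weights are kept at unit length as in the update formula above (names and shapes are again illustrative):

```python
import numpy as np

def dot_product_step(weights, x, eta):
    """Winner-take-all step using the activations a_j = sum_i x_i * w_ij.
    Only the winner's weights are updated, then renormalised to unit length."""
    a = weights @ x                               # activation of every output unit
    k = np.argmax(a)                              # winner = largest dot product
    w_new = weights[k] + eta * (x - weights[k])   # move toward the input
    weights[k] = w_new / np.linalg.norm(w_new)    # keep ||w_k|| = 1
    return k
```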

What happens in a dot product method?

• In the case of the dot product with unit-length vectors, finding the minimum of ||x − w_i|| is nothing but finding the maximum among the m scalar products x · w_i

• Competitive learning network performs an on-line clustering process on the input patterns

• When the process is complete, the input data are divided into disjoint clusters

• With Euclidean distance, the update formula is actually an online gradient descent that minimizes the objective function E = Σ ||x − w_k||², the sum of squared distances between each input and its winning cluster center

Competitive learning with unit-length vectors

The dots represent the input vectors and the crosses denote the weight vectors for the four output units

As the learning continues, the four weight vectors rotate toward the centers of the four input clusters

Limitations of competitive learning

• Weights are initialized to random values; a weight vector that starts far from every input vector may never win and therefore never gets updated

– This can be prevented by initializing the weights to samples from the input data itself, thereby ensuring that all weights get updated when all the input patterns are presented

– Alternatively, the weights of losing as well as winning neurons can be updated, using a significantly smaller learning rate for the losers. This is called leaky learning (see the sketch after this list)

– Note: Decreasing η over time is generally desired. A large initial value of η explores the data space widely; progressively smaller values later refine the weights. This is similar to the cooling schedule in simulated annealing.
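A hedged sketch of leaky learning: the same competitive update as before, but the losing units also move toward the input with a much smaller learning rate (the particular rates are illustrative assumptions):

```python
import numpy as np

def leaky_step(weights, x, eta_win=0.1, eta_lose=0.001):
    """Leaky learning: the winner moves with eta_win, every loser moves with
    the much smaller eta_lose, so no weight vector is left permanently stranded."""
    d = np.sum((weights - x) ** 2, axis=1)
    k = np.argmin(d)                          # winner by Euclidean distance
    for j in range(len(weights)):
        eta = eta_win if j == k else eta_lose
        weights[j] += eta * (x - weights[j])
    return k
```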

Limitations of competitive learning

• Lacks the capability to add new clusters when deemed necessary

• If η is constant, the clusters never stabilize

• If η decreases with time, it may become too small to update the cluster centers

• This is called the stability-plasticity dilemma (solved using adaptive resonance theory (ART))

• If the output units are arranged in the form of a vector or matrix, then the weights of the winner as well as its neighbouring losers can be updated (Kohonen feature maps)

• After learning, the input space is divided into a number of disjoint clusters. These cluster centers are known as templates or the code book

• For any input pattern presented, we can use the appropriate code book vector to represent it (vector quantization)

• Vector quantization is used for data compression in image processing and communication systems, as sketched below
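An illustrative sketch of how the learned code book compresses data: each input vector is transmitted as the index of its nearest code book vector and reconstructed from the code book at the receiver (the function names are assumptions):

```python
import numpy as np

def vq_encode(codebook, X):
    """Replace each row of X by the index of its nearest code book vector."""
    d = np.sum((X[:, None, :] - codebook[None, :, :]) ** 2, axis=2)
    return np.argmin(d, axis=1)        # one small integer per input vector

def vq_decode(codebook, indices):
    """Approximate reconstruction of the data from the transmitted indices."""
    return codebook[indices]
```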

Kohonen Self-Organizing Networks

• Also known as Kohonen feature maps or topology-preserving maps

• The learning procedure of Kohonen feature maps is similar to that of competitive learning networks.

• Similarity (dissimilarity) measure is selected and the winning unit is considered to be the one with the largest (smallest) activation

• The weights of the winning neuron, as well as those of the units in a neighborhood around the winner, are adjusted.

• Neighborhood size decreases slowly with every iteration.

Training of the Kohonen self-organizing network

1. Select the winning output unit as the one with the largest similarity measure (or smallest dissimilarity) between its weight vector w_i and the input x. The winning unit c satisfies

||x − w_c|| = min_i ||x − w_i||

where the index c refers to the winning unit (Euclidean distance)

2. Let NB_c denote the set of indices corresponding to a neighborhood around winner c. The weights of the winner and its neighboring units are updated by

Δw_i = η(x − w_i), i ∈ NB_c

• Instead of defining a fixed neighborhood set around the winning unit, a neighborhood function around the winning unit can be used.

• A Gaussian function can be used as neighborhood function

Ω_c(i) = exp(−||p_i − p_c||² / (2σ²))

where p_i and p_c are the positions of the output units i and c respectively, and σ reflects the scope of the neighborhood.

The update formula using the neighborhood function is given by

Δw_i = η Ω_c(i)(x − w_i)
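A minimal sketch of one SOM update with this Gaussian neighbourhood, for output units laid out on a 2-D grid (the grid layout and parameter values are illustrative assumptions):

```python
import numpy as np

def som_step(weights, positions, x, eta, sigma):
    """One Kohonen SOM step.
    weights:   (num_units, input_dim) weight vectors w_i
    positions: (num_units, 2) grid coordinates p_i of the output units"""
    c = np.argmin(np.sum((weights - x) ** 2, axis=1))        # winning unit
    # Gaussian neighbourhood Omega_c(i) = exp(-||p_i - p_c||^2 / (2 sigma^2))
    omega = np.exp(-np.sum((positions - positions[c]) ** 2, axis=1)
                   / (2.0 * sigma ** 2))
    # Delta w_i = eta * Omega_c(i) * (x - w_i), applied to every unit at once
    weights += eta * omega[:, None] * (x - weights)
    return c
```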

Flowchart for training a Kohonen self-organizing network:

1. Start: initialize the weights, the learning rate, and the topological neighborhood parameters.

2. For each input vector x, compute D(j) = Σ_i (x_i − w_ij)² for j = 1 to m.

3. Compute the winning unit index J such that D(J) is minimum.

4. Update the weights of the winning unit (and its neighborhood).

5. Reduce the learning rate and reduce the radius of the topological neighborhood.

6. Test whether the reduced learning rate η(t+1) is small enough; if so, stop, otherwise continue with the next input.
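A sketch of this training loop, reusing the som_step helper from the earlier sketch (the decay factors and epoch count are assumptions for illustration):

```python
def train_som(X, weights, positions, eta=0.5, sigma=2.0, epochs=20):
    """Present every input each epoch, update the winner and its neighbours,
    then shrink the learning rate and the neighbourhood radius."""
    for _ in range(epochs):
        for x in X:
            som_step(weights, positions, x, eta, sigma)
        eta *= 0.9      # reduce learning rate (decay factor assumed)
        sigma *= 0.9    # reduce neighbourhood radius (decay factor assumed)
    return weights
```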

Learning Vector Quantization

LVQ

• It is an adaptive data classification method based on training data with desired class information

• It is actually a supervised training method but employs unsupervised data-clustering techniques to preprocess the data set and obtain cluster centers

• Resembles a competitive learning network except that each output unit is associated with a class.

Network representation of LVQ

Possible data distributions and decision boundaries

LVQ learning algorithm

• Step 1: Initialize the cluster centers by a clustering method

• Step 2: Label each cluster by the voting method

• Step 3: Randomly select a training input vector x and find k such that ||x − w_k|| is a minimum

• Step 4: If x and w_k belong to the same class, update w_k by

Δw_k = η(x − w_k)

else

Δw_k = −η(x − w_k)
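A minimal sketch of steps 3 and 4, assuming the cluster centres and their class labels come from steps 1 and 2 (the array names are illustrative):

```python
import numpy as np

def lvq_step(weights, labels, x, t, eta):
    """One LVQ step: find the nearest cluster centre and move it toward x
    if its class label matches the target t, otherwise move it away from x."""
    k = np.argmin(np.sum((weights - x) ** 2, axis=1))    # nearest centre
    if labels[k] == t:
        weights[k] += eta * (x - weights[k])             # same class: attract
    else:
        weights[k] -= eta * (x - weights[k])             # different class: repel
    return k
```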

• The parameters used for the training process of an LVQ include the following

– x = training vector (x_1, x_2, …, x_n)

– T = category or class for the training vector x

– w_j = weight vector for the jth output unit (w_1j, …, w_ij, …, w_nj)

– c_j = cluster or class or category associated with the jth output unit

– The Euclidean distance of the jth output unit is D(j) = Σ_i (x_i − w_ij)²

Flowchart for LVQ training:

1. Start: initialize the weights and the learning rate.

2. For each input vector x, calculate the winner as the unit with minimum D(j), and input the target class T.

3. If T = c_j, update w_j(new) = w_j(old) + η[x − w_j(old)]; otherwise w_j(new) = w_j(old) − η[x − w_j(old)].

4. Reduce the learning rate: η(t+1) = 0.5 η(t).

5. If η has reduced to a negligible value, stop; otherwise continue with the next input.
