Competitive Nets

IE 585
Competitive Network – I
Hamming Net & Self-Organizing Map
Competitive Nets
Unsupervised
• MAXNET
• Hamming Net
• Mexican Hat Net
• Self-Organizing Map (SOM)
• Adaptive Resonance Theory (ART)
Supervised
• Learning Vector Quantization (LVQ)
• Counterpropagation
Clustering Net
• The number of input neurons equals the dimension of the input vectors
• Each output neuron represents a cluster, so the number of output neurons limits the number of clusters that can be formed
• The weight vector of an output neuron serves as a representative for the input patterns that the net has placed in that cluster
• Only the weight vector of the winning neuron is adjusted
Winner-Take-All
• The squared Euclidean distance is used to determine the weight vector closest to a pattern vector
• Only the neuron whose weight vector has the smallest Euclidean distance from the input vector is allowed to update
MAXNET
• Developed by Lippmann, 1987
• Can be used as a subnet to pick the node whose input is the largest
• Completely interconnected (including self-connections)
• Symmetric weights
• No training
• Weights are fixed
Architecture of MAXNET
1/ 
1
1
1/ 
1/ 
1/ 
T ransferFunction
1/ 
1/ 
1
if i  j
1
wij  
  1 /  if i  j
1
x
f ( x)  
0
if x  0
otherwise
Procedure of MAXNET
Initialize activations and weights
Update activation of each node
$$x_j^{\text{new}} = f\left(\sum_i w_{ij}\, x_i^{\text{old}}\right)$$

Save the new activations for the next iteration: $x_j^{\text{old}} \leftarrow x_j^{\text{new}}$
If more than one node has a nonzero
activation, continue; otherwise, stop
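A minimal NumPy sketch of this procedure (an illustration, not code from the slides); the `maxnet` helper and its defaults are assumptions, with the inhibition strength defaulting to 1/m as in the architecture above:

```python
import numpy as np

def maxnet(inputs, eps=None, max_iter=1000):
    """Winner selection by mutual inhibition (MAXNET sketch).

    inputs : initial activations, one per node
    eps    : inhibition strength; defaults to 1/m as on the
             architecture slide
    """
    x = np.maximum(np.asarray(inputs, dtype=float), 0.0)
    m = len(x)
    if eps is None:
        eps = 1.0 / m
    for _ in range(max_iter):
        # each node keeps its own activation and is inhibited by
        # eps times the sum of all the other activations
        x = np.maximum(x - eps * (x.sum() - x), 0.0)
        if np.count_nonzero(x) <= 1:   # at most one survivor: stop
            break
    return int(np.argmax(x)), x

# Example: node 2 has the largest initial input and should survive
winner, final = maxnet([0.2, 0.4, 0.9, 0.5])
print(winner, final)   # 2, with all other activations driven to 0
```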
Hamming Net
• Developed by Lippmann, 1987
• A maximum-likelihood classifier: used to determine which of several exemplar vectors is most similar to an input vector
• Exemplar vectors determine the weights of the net
• The measure of similarity between the input vector and a stored exemplar vector is n − HD, where HD is the Hamming distance between the vectors and n is their length
Weights and Transfer Function of Hamming Net

For bipolar (−1, 1) exemplar vectors:

$$w_{ij} = \frac{1}{2} x_i, \qquad w_b = \frac{n}{2}$$

For binary (0, 1) exemplar vectors:

$$w_{ij} = \begin{cases} 1 & \text{if } x_i = 1 \\ -1 & \text{if } x_i = 0 \end{cases} \qquad w_b = \text{number of 0's in the exemplar vector}$$

Transfer function (identity for nonnegative input):

$$f(x) = \begin{cases} x & \text{if } x \geq 0 \\ 0 & \text{otherwise} \end{cases}$$
Architecture of Hamming Net
[Figure: inputs x1–x4 feed two output units y1 and y2, each with a bias B; the outputs of y1 and y2 feed a MAXNET that selects the winner]
Procedure of the Hamming Net
Initialize weights to store the m exemplar
vectors
For each input vector x
compute $net_{Y_j} = b_j + \sum_i w_{ij}\, x_i$
initialize the activations for MAXNET: $y_j(0) = net_{Y_j}$
MAXNET iterates to find the best-matching exemplar
Hamming Net Example
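A minimal sketch of the procedure above on two bipolar exemplars (the specific vectors here are illustrative assumptions, not the slide's original example):

```python
import numpy as np

def hamming_net(exemplars, x):
    """Score a bipolar input against bipolar exemplar vectors.

    Returns net_j = n - HD(x, e_j) for each exemplar, i.e. the
    number of components on which x agrees with exemplar j.
    """
    E = np.asarray(exemplars, dtype=float)   # m x n exemplar matrix
    x = np.asarray(x, dtype=float)
    n = E.shape[1]
    W = E / 2.0          # w_ij = x_i / 2 (bipolar scheme above)
    b = n / 2.0          # bias w_b = n / 2
    return b + W @ x     # net_j = b + sum_i w_ij x_i

# Two illustrative exemplars of length 4 (assumed, not from the slides)
exemplars = [[1, 1, -1, -1],
             [-1, -1, 1, 1]]
x = [1, -1, -1, -1]

net = hamming_net(exemplars, x)
print(net)                             # [3. 1.]: x agrees with exemplar 0 on 3 bits
print("winner:", int(np.argmax(net)))  # MAXNET would single out exemplar 0
```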
Mexican Hat Net
• Developed by Kohonen, 1989
• Positive weights to "cooperative" neighboring neurons
• Negative weights to "competitive" neighboring neurons
• No connections to far-away neurons
Teuvo Kohonen
• http://www.cis.hut.fi/teuvo/ (his own home page)
• published his work starting in 1984
• LVQ - learning vector quantization
• SOM - self-organizing map
• Professor at Helsinki Univ., Finland
SOM
• Also called Topology-Preserving Maps or Self-Organizing Feature Maps (SOFM)
• "Winner take all" learning (also called competitive learning)
• the winner has the minimum Euclidean distance to the input
• learning takes place only for the winner
• final weights lie at the centroids of the clusters
• Continuous inputs; continuous or 0/1 (winner take all) outputs
• No bias, fully connected
• Used for data mining and exploration
• A supervised version exists
Architecture of SOM Net
[Figure: n input units (the a's) in the input layer, fully connected through weights W to the output units (the y's)]
Kohonen Learning Rule Derivation

Minimize the squared distance between the input and the winning weight vector:

$$\min_w \left\| a - w_{\text{winner}} \right\|^2$$

$$E = \left\| a - w \right\|^2 = \|a\|^2 - 2\, w \cdot a + \|w\|^2$$

$$\frac{\partial E}{\partial w} = -(2a - 2w)$$

Moving $w$ against the gradient gives the update

$$\Delta w = \alpha\, (a - w), \qquad 0.1 \leq \alpha \leq 0.7,$$

and $\alpha$ usually decreases during training.
Kohonen Learning
$$w_{.j}^{\text{new}} = w_{.j}^{\text{old}} + \Delta w_{.j} = w_{.j}^{\text{old}} + \alpha \left[ a - w_{.j}^{\text{old}} \right] = \alpha\, a + (1 - \alpha)\, w_{.j}^{\text{old}}$$
Procedure of SOM
Initialize weights uniformly and normalize to unit length
Normalize inputs to unit length
Present an input vector x
calculate Euclidean distance between x and all
Kohonen neurons
select winning output neuron j (with the smallest
distance)
update the winning neuron:

$$w_j^{\text{new}} = w_j^{\text{old}} + \alpha \left[ a - w_j^{\text{old}} \right]$$
re-normalize the weights $w_j$ (sometimes skipped)
present next training vector
Method
1. Normalize the input vectors a and the weight vectors w:

$$a_{ki} = \frac{a_{ki}}{\sqrt{\sum_i a_{ki}^2}}, \qquad w_{ji} = \frac{w_{ji}}{\sqrt{\sum_i w_{ji}^2}}$$

2. Calculate the distance from a to each w:

$$d_j = \sum_i \left( a_{ki} - w_{ji} \right)^2$$
3. The minimum $d_j$ wins (this is the winning neuron)
4. Update the weights of the winning neuron:

$$w_{ji}^{\text{new}} = w_{ji}^{\text{old}} + \alpha \left( a_{ki} - w_{ji}^{\text{old}} \right)$$

5. Return to step 2 and repeat for all input vectors a
6. Reduce $\alpha$ if applicable
7. Repeat until the weights converge (stop changing)
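A minimal NumPy sketch of steps 1–7 (an illustration, not course code); the `som_train` helper is an assumption, and normalization is made optional because the worked example on the next slide skips it:

```python
import numpy as np

def som_train(patterns, weights, alpha=0.25, epochs=3, normalize=False):
    """Plain winner-take-all Kohonen training (no neighborhood).

    patterns : (p, n) array of input vectors
    weights  : (m, n) array, one weight vector per output neuron
    Normalization is optional; the worked example below skips it
    (the pattern (0, 0) cannot be normalized anyway).
    """
    w = np.asarray(weights, dtype=float).copy()
    for _ in range(epochs):
        for a in np.asarray(patterns, dtype=float):
            if normalize and np.linalg.norm(a) > 0:
                a = a / np.linalg.norm(a)
            # squared Euclidean distance from a to every weight vector
            d = ((a - w) ** 2).sum(axis=1)
            j = int(np.argmin(d))          # winning neuron
            w[j] += alpha * (a - w[j])     # move the winner toward a
    return w

# Values taken from the worked example below: 4 patterns, 4 neurons
patterns = [[1, 1], [1, 0], [0, 1], [0, 0]]
init_w = [[0.2, 0.3], [0.5, 0.4], [0.1, 0.7], [0.5, 0.6]]
print(som_train(patterns, init_w, alpha=0.25, epochs=3))
# reproduces the final weights of the example table below
```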
SOM Example - 4 patterns
α = 0.25

Patterns: p1 = (1, 1), p2 = (1, 0), p3 = (0, 1), p4 = (0, 0)
Initial weights: neuron 1 = (0.2, 0.3), neuron 2 = (0.5, 0.4), neuron 3 = (0.1, 0.7), neuron 4 = (0.5, 0.6)

Each row shows the weights at the time a pattern is presented, the squared distances d1–d4 from that pattern to each neuron, and the winner (whose weights are updated before the next row):

Pattern | neuron 1 (w1, w2)    | neuron 2 (w1, w2)   | neuron 3 (w1, w2)    | neuron 4 (w1, w2)   | d1       | d2       | d3       | d4       | winner
p1      | (0.2, 0.3)           | (0.5, 0.4)          | (0.1, 0.7)           | (0.5, 0.6)          | 1.13     | 0.61     | 0.9      | 0.41     | 4
p2      | (0.2, 0.3)           | (0.5, 0.4)          | (0.1, 0.7)           | (0.625, 0.7)        | 0.73     | 0.41     | 1.3      | 0.630625 | 2
p3      | (0.2, 0.3)           | (0.625, 0.3)        | (0.1, 0.7)           | (0.625, 0.7)        | 0.53     | 0.880625 | 0.1      | 0.480625 | 3
p4      | (0.2, 0.3)           | (0.625, 0.3)        | (0.075, 0.775)       | (0.625, 0.7)        | 0.13     | 0.480625 | 0.60625  | 0.880625 | 1
p1      | (0.15, 0.225)        | (0.625, 0.3)        | (0.075, 0.775)       | (0.625, 0.7)        | 1.323125 | 0.630625 | 0.90625  | 0.230625 | 4
p2      | (0.15, 0.225)        | (0.625, 0.3)        | (0.075, 0.775)       | (0.71875, 0.775)    | 0.773125 | 0.230625 | 1.45625  | 0.679727 | 2
p3      | (0.15, 0.225)        | (0.71875, 0.225)    | (0.075, 0.775)       | (0.71875, 0.775)    | 0.623125 | 1.117227 | 0.05625  | 0.567227 | 3
p4      | (0.15, 0.225)        | (0.71875, 0.225)    | (0.05625, 0.83125)   | (0.71875, 0.775)    | 0.073125 | 0.567227 | 0.694141 | 1.117227 | 1
p1      | (0.1125, 0.16875)    | (0.71875, 0.225)    | (0.05625, 0.83125)   | (0.71875, 0.775)    | 1.478633 | 0.679727 | 0.919141 | 0.129727 | 4
p2      | (0.1125, 0.16875)    | (0.71875, 0.225)    | (0.05625, 0.83125)   | (0.789063, 0.83125) | 0.816133 | 0.129727 | 1.581641 | 0.735471 | 2
p3      | (0.1125, 0.16875)    | (0.789063, 0.16875) | (0.05625, 0.83125)   | (0.789063, 0.83125) | 0.703633 | 1.313596 | 0.031641 | 0.651096 | 3
p4      | (0.1125, 0.16875)    | (0.789063, 0.16875) | (0.042188, 0.873438) | (0.789063, 0.83125) | 0.041133 | 0.651096 | 0.764673 | 1.313596 | 1
final   | (0.084375, 0.126563) | (0.789063, 0.16875) | (0.042188, 0.873438) | (0.789063, 0.83125) |          |          |          |
Movement of 4 weight clusters
[Figure: movement of the four weight vectors on the unit square during training; neuron 1 ends near (0, 0), neuron 2 near (1, 0), neuron 3 near (0, 1), and neuron 4 near (1, 1)]
Adding a “conscience”
• prevents neurons from winning too many training vectors by using a bias (b) factor
• the winner has the minimum $(d_j - b_j)$, where

$$b_j = 10 \left( \tfrac{1}{n} - f_j \right) \qquad (n = \text{number of output neurons})$$
$$f_j^{\text{new}} = f_j^{\text{old}} + 0.0001 \left( y_j - f_j^{\text{old}} \right), \qquad f_{\text{initial}} = \tfrac{1}{n}$$

• for neurons that win frequently, b becomes negative; for neurons that don't win, b becomes positive
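A minimal sketch of this conscience mechanism (illustrative; the slide does not define $y_j$, so the usual convention that $y_j = 1$ for the winner and 0 otherwise is assumed):

```python
import numpy as np

def conscience_winner(d, f, n_out):
    """Pick a winner using the conscience bias described above.

    d : squared distances from the input to each weight vector
    f : running win frequencies, initialized to 1/n_out
    Assumes y_j = 1 for the winner and 0 otherwise (an assumption,
    not stated explicitly on the slide).
    """
    b = 10.0 * (1.0 / n_out - f)       # frequent winners get negative bias
    j = int(np.argmin(d - b))          # winner has min (d - b)
    y = np.zeros_like(f)
    y[j] = 1.0
    f += 0.0001 * (y - f)              # update win frequencies
    return j

# Example: neuron 0 is closest but has been winning far too often
d = np.array([0.10, 0.12, 0.50])
f = np.array([0.60, 0.20, 0.20])       # neuron 0's frequency >> 1/3
print(conscience_winner(d, f, n_out=3))  # 1: the conscience hands over the win
```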
Supervised Version
• Same, except if the winning neuron is "correct," use the same weight update:

$$w^{\text{new}} = w^{\text{old}} + \alpha \left( a - w^{\text{old}} \right)$$

• if the winning neuron is "incorrect," use:

$$w^{\text{new}} = w^{\text{old}} - \alpha \left( a - w^{\text{old}} \right)$$
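A minimal sketch of this supervised rule, which is essentially LVQ (the helper and its class-label bookkeeping are illustrative assumptions):

```python
import numpy as np

def supervised_update(w, labels, a, target, alpha=0.25):
    """One LVQ-style supervised step, as described above.

    w      : (m, n) weight matrix, one row per output neuron
    labels : class label assigned to each output neuron
    a      : input vector; target : its true class
    """
    d = ((a - w) ** 2).sum(axis=1)
    j = int(np.argmin(d))              # winning neuron
    if labels[j] == target:            # correct: move toward the input
        w[j] += alpha * (a - w[j])
    else:                              # incorrect: move away from it
        w[j] -= alpha * (a - w[j])
    return j

# Illustrative call: two neurons labeled class 0 and class 1
w = np.array([[0.2, 0.3], [0.8, 0.7]])
supervised_update(w, labels=[0, 1], a=np.array([1.0, 1.0]), target=0)
print(w)   # neuron 1 won but is "incorrect", so it moved away from (1, 1)
```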