Chapter 6: Counterpropagation Network
I. Combines a competitive network with Grossberg's outstar structure
II. Simulates a functional mapping, i.e., y = Φ(x), where
Φ is a continuous function, e.g., the cosine function
III. Backward mapping, i.e., x = Φ^{-1}(y)
。 Forward-mapping CPN
6.1. CPN Building Blocks
Three major components:
1. Instar: hidden node with input weights
2. Competitive layer: hidden layer composed of instars
3. Outstar: a structure composed of a single hidden unit and all of the output units
6.1.2 The Instar
Input:
    net = I^T w = I · w
Assume ‖I‖ = ‖w‖ = 1
○ Output: dy/dt = -a·y + b·net,  a, b > 0   ------ (A)
Solving (A) with y(0) = 0 gives
    y(t) = (b/a)·net·(1 - e^(-at)) = y_eq·(1 - e^(-at))
where y_eq = (b/a)·net is the equilibrium value.
If the input is later removed, i.e., net = 0, then (A) reduces to
    dy/dt = -a·y, whose solution is y(t) = y_eq·e^(-at)
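As a quick numerical check of these first-order dynamics, the following sketch integrates (A) with simple forward-Euler steps; the constants a, b, net and the step size are arbitrary illustration values, not taken from the text.

    import math

    # Illustrative constants (not from the text): a, b > 0 and a constant net input.
    a, b, net = 2.0, 1.0, 0.5
    y_eq = (b / a) * net                 # predicted equilibrium value (b/a)*net
    dt = 0.001

    y = 0.0                              # y(0) = 0
    for _ in range(int(3.0 / dt)):       # input present: dy/dt = -a*y + b*net
        y += dt * (-a * y + b * net)
    print(y, y_eq)                       # y has risen to roughly y_eq

    for _ in range(int(1.0 / dt)):       # input removed: dy/dt = -a*y
        y += dt * (-a * y)
    print(y, y_eq * math.exp(-a * 1.0))  # matches the decay solution y_eq*e^(-a*t)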
○ Learning of instar -- learn to respond maximally to a
particular input vector
    dw/dt = -c·w + d·I,  c, d > 0   ------- (6.8)
In the absence of I,  dw/dt = -c·w, so w decays to zero.
This is called forgetfulness.
To avoid forgetfulness, an alternative learning rule is
    dw/dt = (-c·w + d·I)·U(net)   ------ (6.10)
where U(net) = 1 if net > 0, and U(net) = 0 if net ≤ 0.
。 When net > 0,
    Δw/Δt = -c·w + d·I,  i.e.,  Δw = -c·Δt·w + d·Δt·I
Let α = c·Δt = d·Δt (i.e., take c = d). Then Δw = α·(I - w), and
    w(t+1) = w(t) + Δw = w(t) + α·(I - w(t))
。 Learning a cluster of patterns:
Set the initial weight vector to some member of the cluster;
with training, the weight vector moves toward the average of the cluster.
。 Learning steps:
1. Select an input vector I_i at random
2. Calculate Δw(t) = α·(I_i - w(t))
3. Update w(t+1) = w(t) + Δw(t)
4. Repeat steps 1-3 for all input vectors in the cluster
5. Repeat steps 1-4 several times
Reduce the value of α as training proceeds (a short sketch of this
procedure follows).
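A minimal sketch of these learning steps, assuming a small cluster of normalized input vectors and a decaying learning rate (the cluster data and the α schedule are illustrative assumptions, not from the text):

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative cluster: noisy copies of one normalized prototype.
    prototype = np.array([0.6, 0.8])
    cluster = [prototype + 0.05 * rng.standard_normal(2) for _ in range(20)]
    cluster = [v / np.linalg.norm(v) for v in cluster]

    w = cluster[0].copy()              # initialize to some member of the cluster
    alpha = 0.5
    for epoch in range(50):            # steps 1-5
        for idx in rng.permutation(len(cluster)):
            w += alpha * (cluster[idx] - w)   # w(t+1) = w(t) + alpha*(I_i - w(t))
        alpha *= 0.9                   # reduce alpha as training proceeds

    print(w)
    print(np.mean(cluster, axis=0))    # w ends up near the cluster average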
6.1.3. Competitive Layer
The hidden layer is a competitive layer formed by instars
An on-center off-surround system can be used to implement
the competition among a group of instars
。 The unit activations evolve as
    dx_i/dt = -A·x_i + (B - x_i)·[f(x_i) + net_i] - x_i·Σ_{k≠i} [f(x_k) + net_k]   ----- (6.13)
where A, B are positive constants and f(·) is the output (transfer) function.
Rearranging,
    dx_i/dt = -A·x_i + B·[f(x_i) + net_i] - x_i·[Σ_k f(x_k) + Σ_k net_k]   ----- (6.13)'
Summing over i, with x = Σ_i x_i,
    dx/dt = -A·x + (B - x)·[Σ_k f(x_k) + Σ_k net_k]
Let x_k = x·X_k, where X_k is a reflectance variable (so Σ_k X_k = 1). Then
    dx/dt = -A·x + (B - x)·[Σ_k f(x·X_k) + Σ_k net_k]   ----- (6.14)
From x_i = x·X_i,  dx_i/dt = (dx/dt)·X_i + x·(dX_i/dt),  so
    x·(dX_i/dt) = dx_i/dt - (dx/dt)·X_i   ----- (A)
Substituting (6.13)' and (6.14) into (A),
    x·(dX_i/dt) = -A·x_i + B·[f(x_i) + net_i] - x_i·[Σ_k f(x·X_k) + Σ_k net_k]
                  - { -A·x + (B - x)·[Σ_k f(x·X_k) + Σ_k net_k] }·X_i   ----- (B)
Since x_i = x·X_i, the -A terms cancel, as do the terms x_i·Σ_k[...] and x·X_i·Σ_k[...], leaving
    x·(dX_i/dt) = B·f(x·X_i) - B·X_i·Σ_k f(x·X_k) + B·net_i - B·X_i·Σ_k net_k
Let f(w) = w·g(w), i.e., g(w) = w^{-1}·f(w). Using Σ_k X_k = 1, this becomes
    x·(dX_i/dt) = B·x·X_i·Σ_k X_k·[g(x·X_i) - g(x·X_k)]
                  + B·(1 - X_i)·net_i - B·X_i·Σ_{k≠i} net_k   ----- (6.15)
。 Let f(w) = w (linear), so that g(w) = 1. Then
    B·x·X_i·Σ_k X_k·[g(x·X_i) - g(x·X_k)] = 0
and
    x·(dX_i/dt) = B·(1 - X_i)·net_i - B·X_i·Σ_{k≠i} net_k = B·net_i - B·X_i·Σ_k net_k
X_i stabilizes (i.e., dX_i/dt = 0) at
    X_i = net_i / Σ_k net_k
so, at equilibrium,
    x_i = x·X_i = x_eq · net_i / Σ_k net_k
。 If the input pattern is removed (i.e., net_i = 0), (6.14) becomes
    dx/dt = -A·x + (B - x)·Σ_k f(x·X_k)
For linear f(·), f(x·X_k) = f(x_k) = x_k, so Σ_k f(x·X_k) = Σ_k x_k = x, and
    dx/dt = -A·x + (B - x)·x = (B - A)·x - x^2
If B < A, then dx/dt < 0 and x decays to zero.
If B > A, x increases until x = B - A, and the activity pattern
is then maintained on the units.
This is called short-term memory.
。 Example: f(w) = w (linear function)
。 Example: f(w) = w^2 (faster-than-linear output function)
    g(w) = w^{-1}·f(w) = w^{-1}·w^2 = w
    g(x·X_i) - g(x·X_k) = x·X_i - x·X_k = x·[X_i - X_k]
i. If X_i > X_k, then X_k·[g(x·X_i) - g(x·X_k)] is an
excitatory term for x_i; otherwise it is an inhibitory term for x_i.
The network tends to enhance the activity of the unit with
the largest value of X_i.
ii. After the input pattern is removed, (6.13)' gives
    dx_i/dt = -A·x_i + B·f(x_i) - x_i·Σ_k f(x_k)
            = -A·x_i + B·x_i^2 - x_i·Σ_k x_k^2 = B·x_i^2 - (A + Σ_k x_k^2)·x_i
With appropriate A and B, dx_i/dt > 0 only for
the unit with the largest x_i.
iii. f(w) = w^n with n > 1 can be used to implement a
winner-take-all network (see the sketch below).
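To see the winner-take-all behavior concretely, here is a minimal simulation sketch of (6.13) with f(w) = w^2; the constants A and B, the step size, the duration of each phase, and the net-input values are arbitrary choices for illustration only.

    import numpy as np

    # Illustrative constants (not from the text).
    A, B = 0.05, 1.0
    dt = 0.01
    f = lambda w: w ** 2                     # faster-than-linear output function

    def step(x, net):
        # One Euler step of eq. (6.13): on-center (own term), off-surround (others).
        s = f(x) + net                       # f(x_k) + net_k for every unit
        total = s.sum()
        return x + dt * (-A * x + (B - x) * s - x * (total - s))

    x = np.zeros(3)
    net = np.array([0.2, 0.3, 0.5])          # the last unit receives the largest input
    for _ in range(500):                     # input present
        x = step(x, net)
    for _ in range(5000):                    # input removed: competition continues
        x = step(x, np.zeros(3))

    print(x)   # the unit with the largest net input dominates; the others decay toward zero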
。 Example: sigmoid function (contrast enhancement)
Quenching threshold (QT): units whose net inputs are above
the QT will have their activities enhanced
6.1.4. Outstar
-- is composed of a single hidden unit and all output units
○ Classical conditioning
。 Hebbian rule
。 Initially, the conditioned stimulus (CS) is assumed to be
unable to elicit a response from any of the units to which
it is connected
An unconditioned stimulus (UCS) can cause an
unconditioned response (UCR)
If the CS is present while the UCS is causing the UCR ,
then the strength of the connection from the CS unit to the
UCR unit will also be increased
Later, the CS will be able to cause a conditioned response
(CR), even if the UCS is absent
。 During the training period, the winner of the competition
on the hidden layer turns on, providing a single CS to the
output units. The UCS is supplied by the Y portion of the
input layer.
After training is complete, the appearance of the CS will
cause the CR value to appear at the output units even
though the UCS values will be zero
。 Recognize an input pattern through a winner-take-all
competition. Once a winner has been declared, that
unit becomes the CS for an outstar.
The outstar associates some value or identity with the
input pattern
◎ The output values of the outstar
i. During the training phase,
    dy_i/dt = -a·y_i + b·y_i^I + c·net_i
where a, b, c > 0, y_i^I is the training input, and y_i is the
output of outstar node i.
Because of the competition, only a single hidden unit c (the winner)
has a nonzero output at any given time.
The net input to any output unit i therefore reduces to the single
term w_ic·z, where z = 1 is the output of the winner;
i.e., net_i = z·w_ic = w_ic, so  dy_i/dt = -a·y_i + b·y_i^I + c·w_ic
ii. After training, the training input y_i^I is absent:
    dy_i/dt = -a·y_i + c·w_i   ----- (6.17)
◎ During training, the weights evolve as
    dw_i/dt = (-d·w_i + e·y_i^I·z)·U(z)   (similar to the instar)
Only the winner (z = 1, U(z) = 1) is involved in learning:
    dw_i/dt = -d·w_i + e·y_i^I   ----- (6.19)
At equilibrium, dw_i/dt = 0, so
    w_i^eq = (e/d)·y_i^I    for identical outputs of the members of a cluster
    w_i^eq = (e/d)·<y_i^I>  for the average output of the members of a cluster
After training, (6.17) becomes  dy_i/dt = -a·y_i + c·(e/d)·<y_i^I>
At equilibrium,  y_i^eq = (c·e)/(a·d)·<y_i^I>
We want y_i^eq = <y_i^I>, so choose a = c and d = e; then
    y_i^eq = <y_i^I> = w_i^eq
From (6.19), the discrete-time update is
    w_i(t+1) = w_i(t) + β·(e·y_i^I - d·w_i(t)) = w_i(t) + β·(y_i^I - w_i(t))
(absorbing d = e into β).
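A minimal sketch of this outstar weight update, assuming the winning hidden unit is fixed and the target outputs are drawn from one cluster (the target data and the value of β are illustrative assumptions):

    import numpy as np

    rng = np.random.default_rng(1)

    # Illustrative targets: noisy copies of the desired output for one cluster.
    y_target = np.array([0.1, 0.9, 0.4])
    samples = [y_target + 0.05 * rng.standard_normal(3) for _ in range(200)]

    w = np.zeros(3)                 # outstar weights from the (fixed) winning hidden unit
    beta = 0.1
    for yI in samples:
        w += beta * (yI - w)        # w_i(t+1) = w_i(t) + beta * (y_i^I - w_i(t))

    print(w)
    print(np.mean(samples, axis=0)) # w approaches the average target output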
◎ Summary of training the CPN
Two learning algorithms:
(a) Competitive layer (input - hidden layers)
(b) Output layer (hidden - output layers)
(a) Competitive layer training
1. Select an input vector I
2. Normalize I to x (‖x‖ = 1)
3. Apply x to the competitive layer
4. Determine the winner c
5. Calculate α·(x - w_c) for the winner
6. Update the winner's weight vector:
   w_c(t+1) = w_c(t) + α·(x - w_c)
7. Repeat steps 1-6 until all input vectors have been processed
8. Repeat step 7 until all input vectors have been classified properly
Hidden nodes that are never selected as winners may be removed.
(b) Output layer training
1. Apply a normalized input vector x_k, and its
corresponding output vector y_k, to the X and
Y portions of the CPN, respectively.
2. Determine the winner c
3. Update its associated weights:
   w_ic(t+1) = w_ic(t) + β·(y_i - w_ic(t)),  i = 1, ..., m
4. Repeat steps 1-3 until all vectors of all classes
map to satisfactory outputs
(A combined sketch of (a) and (b) is given below.)
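Putting (a) and (b) together, a minimal forward-only CPN training sketch might look like the following; the data (points on the unit circle mapped to their cosine), the number of hidden units, and the learning-rate schedule are all illustrative assumptions rather than anything prescribed by the text.

    import numpy as np

    rng = np.random.default_rng(2)

    def normalize(v):
        return v / np.linalg.norm(v)

    # Illustrative data: x is a point (cos t, sin t) on the unit circle, target y = cos t.
    ts = np.linspace(0.1, 1.5, 40)
    xs = [normalize(np.array([np.cos(t), np.sin(t)])) for t in ts]
    ys = [np.array([np.cos(t)]) for t in ts]

    n_hidden = 8
    W_in = np.array([normalize(rng.standard_normal(2)) for _ in range(n_hidden)])  # instar weights
    W_out = np.zeros((1, n_hidden))                                                # outstar weights

    alpha, beta = 0.3, 0.2
    for epoch in range(200):
        for x, y in zip(xs, ys):
            c = int(np.argmax(W_in @ x))              # winner = largest net input
            W_in[c] += alpha * (x - W_in[c])          # (a) competitive-layer update
            W_out[:, c] += beta * (y - W_out[:, c])   # (b) output-layer (outstar) update
        alpha *= 0.99
        beta *= 0.99

    # Recall: the winner's outgoing weights serve as the network output.
    x_test = normalize(np.array([np.cos(0.7), np.sin(0.7)]))
    c = int(np.argmax(W_in @ x_test))
    print(W_out[:, c], np.cos(0.7))   # a coarse piecewise-constant approximation of cos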
◎ Forward processing (production phase)
1. Normalize the input:  x_i = I_i / sqrt(Σ_k I_k^2),  so that ‖x‖ = 1
2. Apply the input vector to the X portion of the input layer;
   apply a zero vector to the Y portion of the input layer
3. The X-portion values are distributed to the competitive layer
4. The winning unit has an output of 1;
   all other units have outputs of 0
5. The winner excites the corresponding outstar
* The output of the outstar units is the value of the weights
on the connections from the winning unit.
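Continuing the sketch above (and reusing its W_in and W_out arrays, which are assumptions of that sketch), the production-phase steps 1-5 reduce to a few lines:

    import numpy as np

    def cpn_recall(I, W_in, W_out):
        # Forward (production-phase) pass of the sketched CPN.
        x = I / np.sqrt(np.sum(I ** 2))      # step 1: normalize the input
        net = W_in @ x                       # step 3: distribute to the competitive layer
        c = int(np.argmax(net))              # step 4: winner outputs 1, all others 0
        return W_out[:, c]                   # step 5: outstar output = winner's outgoing weights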
6.2.4. The Complete CPN
The full CPN performs both the forward mapping x → y and the
reverse mapping y → x.
During training, both x and y are applied to the input units.
After training,
the input (x, 0) results in an output y' ≈ Φ(x), and
the input (0, y) results in an output x' ≈ Φ^{-1}(y).
。 Let r_i be the weight vector from the x portion of the input layer to
hidden unit i, and
s_i be the weight vector from the y portion of the input layer to
hidden unit i.
For hidden unit i,
    net_i = r_i · x + s_i · y
and its output is
    z_i = 1  if net_i = max_k {net_k}
    z_i = 0  otherwise
During training,
    Δr_i = α_x·(x - r_i),   Δs_i = α_y·(y - s_i)
Note that only the winner is allowed to learn for a given
input vector.
。 For the output layer,
the y-output units have weight vectors w_i and
the x-output units have weight vectors v_i.
The learning laws are
    Δw_ij = β_y·(y_i - w_ij),   Δv_ij = β_x·(x_i - v_ij)
Again, only the weights from the winning hidden unit are allowed to learn
(see the sketch below).
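A minimal sketch of one full-CPN training step under these rules; the array shapes, function name, and learning rates are illustrative assumptions, not part of the text.

    import numpy as np

    def full_cpn_step(x, y, R, S, W, V, a_x=0.1, a_y=0.1, b_y=0.1, b_x=0.1):
        # One training step of the full CPN for a single (x, y) pair.
        # R, S: hidden-layer weights from the x and y input portions (n_hidden x dim).
        # W, V: output-layer weights toward the y' and x' outputs (dim x n_hidden).
        net = R @ x + S @ y                 # net_i = r_i . x + s_i . y
        j = int(np.argmax(net))             # winning hidden unit (z_j = 1)
        R[j] += a_x * (x - R[j])            # Δr_j = α_x (x - r_j)
        S[j] += a_y * (y - S[j])            # Δs_j = α_y (y - s_j)
        W[:, j] += b_y * (y - W[:, j])      # Δw_ij = β_y (y_i - w_ij), winner column only
        V[:, j] += b_x * (x - V[:, j])      # Δv_ij = β_x (x_i - v_ij), winner column only
        return j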