Radial Basis Function Networks

1. Introduction
2. Finding RBF Parameters
3. Decision Surface of RBF Networks
4. Comparison between RBF and BP
4/15/2020 | RBF Networks | M.W. Mak
1. Introduction

• MLPs are highly non-linear in the parameter space, so gradient descent can get trapped in local minima.
• RBF networks avoid this problem by dividing the learning into two independent processes: setting the basis-function parameters, then finding the output weights w.

• RBF networks implement the function

    s(x) = w₀ + Σ_{i=1}^{M} wᵢ φᵢ(‖x − cᵢ‖)

• wᵢ, cᵢ and σᵢ can be determined separately
  ⇒ fast learning algorithm

• Basis function types:

    φ(r) = r² log(r)              (thin-plate spline)
    φ(r) = exp(−r² / (2σ²))       (Gaussian)
    φ(r) = √(r² + σ²)             (multiquadric)
    φ(r) = 1 / √(r² + σ²)         (inverse multiquadric)
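The four basis function types above can be evaluated directly as functions of the radius r = ‖x − c‖. A minimal sketch (the function names and the choice σ = 1 are mine, not from the slides):

```python
import numpy as np

# the four basis function types, as functions of r = ||x - c|| (sigma fixed at 1)
sigma = 1.0
thin_plate    = lambda r: r**2 * np.log(r)
gaussian      = lambda r: np.exp(-r**2 / (2 * sigma**2))
multiquadric  = lambda r: np.sqrt(r**2 + sigma**2)
inv_multiquad = lambda r: 1.0 / np.sqrt(r**2 + sigma**2)

r = 2.0
vals = (thin_plate(r), gaussian(r), multiquadric(r), inv_multiquad(r))
```

Note that only the Gaussian and inverse multiquadric are localized (they decay as r grows); the thin-plate spline and multiquadric grow with r.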

• For Gaussian basis functions:

    s(x_p) = w₀ + Σ_{i=1}^{M} wᵢ φᵢ(‖x_p − cᵢ‖)
           = w₀ + Σ_{i=1}^{M} wᵢ exp(−Σ_{j=1}^{n} (x_pj − c_ij)² / (2σ_ij²))

• Assume the variances σ across each dimension are equal:

    s(x_p) = w₀ + Σ_{i=1}^{M} wᵢ exp(−(1/(2σᵢ²)) Σ_{j=1}^{n} (x_pj − c_ij)²)

• To write in matrix form, let

    a_pi = φᵢ(‖x_p − cᵢ‖)   ⇒   s(x_p) = Σ_{i=0}^{M} wᵢ a_pi,   where a_p0 = 1

    [s(x₁) ]   [1  a₁₁  a₁₂  …  a₁M ] [w₀ ]
    [s(x₂) ] = [1  a₂₁  a₂₂  …  a₂M ] [w₁ ]
    [  ⋮   ]   [⋮   ⋮    ⋮        ⋮ ] [ ⋮ ]
    [s(x_N)]   [1  a_N1 a_N2 …  a_NM] [w_M]

    s = Aw   ⇒   w = A⁻¹s   (when A is square and invertible)
2. Finding the RBF Parameters

• Use the K-means algorithm to find the centers cᵢ.

[Figure: samples from two clusters (labeled 1 and 2) with the cluster centers found by K-means]
K-means Algorithm

Step 1: K initial cluster centers are chosen randomly from the samples to form K groups.
Step 2: Each new sample is added to the group whose mean is closest to this sample.
Step 3: Adjust the mean of the group to take account of the new points.
Step 4: Repeat Step 2 until the distance between the old means and the new means of all clusters is smaller than a predefined tolerance.

Outcome: There are K clusters with means representing the centroid of each cluster.

Advantages: (1) A fast and simple algorithm.
            (2) Reduces the effect of noisy samples.
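The steps above can be sketched as a batch K-means loop (a minimal illustration; the function name, the tolerance test on the center movement, and the blob data are my own, and this batch variant updates all means at once rather than per new sample):

```python
import numpy as np

def kmeans(X, K, tol=1e-6, max_iter=100, seed=0):
    """Plain batch K-means: returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # step 1: K initial centers chosen randomly from the samples
    centers = X[rng.choice(len(X), K, replace=False)].copy()
    for _ in range(max_iter):
        # step 2: assign each sample to the group whose mean is closest
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # step 3: adjust each mean to the centroid of its group
        new_centers = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                                else centers[k] for k in range(K)])
        # step 4: stop when the means move less than the tolerance
        if np.linalg.norm(new_centers - centers) < tol:
            centers = new_centers
            break
        centers = new_centers
    return centers, labels

# usage: two well-separated blobs around (0,0) and (3,3)
X = np.vstack([np.random.default_rng(1).normal(0, 0.1, (20, 2)),
               np.random.default_rng(2).normal(3, 0.1, (20, 2))])
centers, labels = kmeans(X, K=2)
```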

• Use the K-nearest-neighbor rule to find the function width σ:

    σᵢ² = (1/K) Σ_{k=1}^{K} ‖c_k − cᵢ‖²

  where c_k is the k-th nearest neighbor of cᵢ.

• The objective is to cover the training points so that a smooth fit of the training samples can be achieved.

[Figure: centers and widths found by K-means and K-NN]
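The K-NN width rule can be sketched as follows (a minimal illustration under the σᵢ² reading of the formula above; the function name is mine):

```python
import numpy as np

def knn_widths(centers, K=2):
    """sigma_i from the K nearest neighboring centers:
    sigma_i^2 = (1/K) * sum_k ||c_k - c_i||^2."""
    D = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    D.sort(axis=1)                        # row i: distances from c_i, ascending; D[i, 0] = 0 (itself)
    return np.sqrt((D[:, 1:K + 1] ** 2).mean(axis=1))

centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
sig = knn_widths(centers, K=1)            # K=1: width = distance to the nearest other center
```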

• Determining the weights w using the least-squares method:

    E = Σ_{p=1}^{N} (d_p − Σ_{j=0}^{M} w_j φ_j(‖x_p − c_j‖))²

  where d_p is the desired output for pattern p.

    E = (d − Aw)ᵀ(d − Aw)

  Set ∂E/∂w = 0  ⇒  w = (AᵀA)⁻¹Aᵀd
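The least-squares weight solve can be sketched with a small design matrix (the helper name and the toy data are mine; `lstsq` is used rather than forming (AᵀA)⁻¹ explicitly, for the numerical reasons discussed on the following slides):

```python
import numpy as np

def rbf_design_matrix(X, centers, sigmas):
    """A[p, i] = phi_i(||x_p - c_i||), with a leading column of ones for the bias w0."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    Phi = np.exp(-d2 / (2.0 * sigmas ** 2))
    return np.hstack([np.ones((len(X), 1)), Phi])

# toy 1-D problem: 4 patterns, 2 Gaussian centers
X = np.array([[0.0], [1.0], [2.0], [3.0]])
d = np.array([0.0, 1.0, 1.0, 0.0])
centers = np.array([[1.0], [2.0]])
sigmas = np.array([1.0, 1.0])
A = rbf_design_matrix(X, centers, sigmas)
w, *_ = np.linalg.lstsq(A, d, rcond=None)   # minimizes ||d - A w||^2
```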
• Let E be the total squared error between the actual output and the target output d = [d₁ d₂ … d_N]ᵀ:

    E = (d − Aw)ᵀ(d − Aw)
      = (dᵀ − wᵀAᵀ)(d − Aw)
      = dᵀd − wᵀAᵀd − dᵀAw + wᵀAᵀAw

    ∂E/∂w = −∂(wᵀAᵀd)/∂w − ∂(dᵀAw)/∂w + ∂(wᵀAᵀAw)/∂w = 0
          = −Aᵀd − Aᵀd + 2AᵀAw
          = −2Aᵀd + 2AᵀAw

    ⇒ AᵀAw = Aᵀd
    ⇒ w = (AᵀA)⁻¹Aᵀd
Note that

    ∂(xᵀy)/∂x = y
    ∂(xᵀAy)/∂x = Ay
    ∂(xᵀAx)/∂x = Ax + Aᵀx

Problems
(1) Susceptible to round-off error.
(2) No solution if AᵀA is singular.
(3) If AᵀA is close to singular, we get very large components in w.
Reasons
(1) Inaccuracy in forming AᵀA.
(2) If A is ill-conditioned, a small change in A introduces a large change in (AᵀA)⁻¹.
(3) If AᵀA is close to singular, dependent columns in AᵀA exist, e.g. two parallel straight lines.

[Figure: two nearly parallel straight lines in the x-y plane]
Singular matrix:

    [1  2] [x]   [0]
    [2  4] [y] = [1]

If the lines are nearly parallel, they intersect each other at ±∞, i.e.

    [x₀]   [+∞]        [x₀]   [−∞]
    [y₀] = [−∞]   or   [y₀] = [+∞]

So the magnitude of the solution becomes very large; hence overflow will occur.
The effect of the large components can be cancelled out if the machine precision is infinite.
If the machine precision is finite, we get a large error. For example, the nearly singular system

    [1      2     ] [α₁]   [0]
    [2  4 + 10⁻³⁸] [α₂] = [1]

has the exact solution [α₁; α₂] = [−2 × 10³⁸; 1 × 10³⁸].

Finite machine precision => the tiny perturbation 10⁻³⁸ is not represented exactly (e.g. the entry is stored as 4 + 10⁻³³), and the computed solution becomes

    [α₁; α₂] = [−2 × 10³³; 1 × 10³³]

which differs from the exact solution by several orders of magnitude.

Solution: Singular Value Decomposition
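The SVD remedy can be sketched as follows: tiny singular values (the near-singular directions) are zeroed out, so the solution no longer blows up. This is a minimal illustration with my own near-singular matrix (the perturbation 10⁻⁹ and the cutoff 10⁻⁶ are illustrative, not the slide's numbers):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0 + 1e-9]])        # nearly singular
d = np.array([0.0, 1.0])

# pseudo-inverse via SVD: discard singular values below a relative cutoff
U, s, Vt = np.linalg.svd(A)
tol = s.max() * 1e-6
s_inv = np.where(s > tol, 1.0 / s, 0.0)  # zero out the near-null direction
w = Vt.T @ (s_inv * (U.T @ d))           # minimum-norm least-squares solution

w_ref = np.linalg.pinv(A, rcond=1e-6) @ d  # same result via numpy's pinv
```

Unlike (AᵀA)⁻¹Aᵀd, this solution stays moderate in magnitude even though A is almost singular.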

• RBF learning process:

    1. K-means on the inputs x_p → centers cᵢ
    2. K-nearest neighbor on the centers → widths σᵢ
    3. Basis functions → design matrix A
    4. Linear regression → weights wᵢ
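The whole pipeline above can be sketched end-to-end on a toy 1-D regression problem (a compact illustration under my own choices: 40 samples of sin(x), 6 centers, and the nearest-other-center width rule, i.e. K = 1):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 40)[:, None]      # toy 1-D inputs
d = np.sin(X[:, 0])                      # targets

# 1. K-means -> centers c_i
K = 6
c = X[rng.choice(len(X), K, replace=False)].copy()
for _ in range(50):
    lab = np.linalg.norm(X[:, None] - c[None], axis=2).argmin(axis=1)
    c = np.array([X[lab == k].mean(axis=0) if np.any(lab == k) else c[k]
                  for k in range(K)])

# 2. K-nearest neighbor among centers -> widths sigma_i
D = np.linalg.norm(c[:, None] - c[None], axis=2)
np.fill_diagonal(D, np.inf)
sigma = D.min(axis=1)                    # distance to the nearest other center

# 3. basis functions -> design matrix A (leading ones column = bias w0)
d2 = ((X[:, None] - c[None]) ** 2).sum(axis=2)
A = np.hstack([np.ones((len(X), 1)), np.exp(-d2 / (2 * sigma ** 2))])

# 4. linear regression -> weights w
w, *_ = np.linalg.lstsq(A, d, rcond=None)
err = np.abs(A @ w - d).max()            # training fit
```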

• RBF learning by gradient descent

  Let φᵢ(x_p) = exp(−(1/2) Σ_{j=1}^{n} (x_pj − c_ij)²/σ_ij²)  and  e(x_p) = d(x_p) − s(x_p).

  We have

    E = (1/2) Σ_{p=1}^{N} e(x_p)²

  Applying ∂E/∂wᵢ, ∂E/∂σ_ij, and ∂E/∂c_ij, we have the following update equations:

    wᵢ(t+1)   = wᵢ(t)   + η_w Σ_{p=1}^{N} e(x_p) φᵢ(x_p)                          when i = 1, 2, …, M
    wᵢ(t+1)   = wᵢ(t)   + η_w Σ_{p=1}^{N} e(x_p)                                  when i = 0
    σ_ij(t+1) = σ_ij(t) + η_σ Σ_{p=1}^{N} e(x_p) wᵢ φᵢ(x_p) (x_pj − c_ij)² / σ_ij³(t)
    c_ij(t+1) = c_ij(t) + η_c Σ_{p=1}^{N} e(x_p) wᵢ φᵢ(x_p) (x_pj − c_ij) / σ_ij²(t)
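The four update equations above can be sketched as one batch gradient step (the function name and toy data are mine, and a single learning rate stands in for η_w, η_σ, η_c):

```python
import numpy as np

def rbf_gd_step(X, d, w0, w, c, sigma, lr=0.01):
    """One batch step of the update equations.
    X: (N, n), w: (M,), c: (M, n), sigma: (M, n)."""
    diff = X[:, None, :] - c[None, :, :]                    # x_pj - c_ij, shape (N, M, n)
    phi = np.exp(-0.5 * (diff ** 2 / sigma ** 2).sum(axis=2))  # phi_i(x_p), shape (N, M)
    e = d - (w0 + phi @ w)                                  # e(x_p) = d(x_p) - s(x_p)
    w0_new = w0 + lr * e.sum()                              # i = 0 update
    w_new = w + lr * (e @ phi)                              # i = 1..M update
    common = (e[:, None] * w[None, :] * phi)[:, :, None]    # e w_i phi_i, shape (N, M, 1)
    sigma_new = sigma + lr * (common * diff ** 2 / sigma ** 3).sum(axis=0)
    c_new = c + lr * (common * diff / sigma ** 2).sum(axis=0)
    return w0_new, w_new, c_new, sigma_new, 0.5 * (e ** 2).sum()

# usage: a few steps on toy data should reduce E
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
d = np.sin(X[:, 0])
w0, w = 0.0, rng.normal(size=4) * 0.1
c = rng.normal(size=(4, 2))
sigma = np.ones((4, 2))
_, _, _, _, E0 = rbf_gd_step(X, d, w0, w, c, sigma, lr=0.0)  # lr=0: just evaluate E
for _ in range(100):
    w0, w, c, sigma, E = rbf_gd_step(X, d, w0, w, c, sigma, lr=0.01)
```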
Elliptical Basis Function Networks

[Figure: network diagram with inputs x₁, …, x_n, basis functions 1, …, M, and outputs y₁(x), …, y_K(x)]

    y_k(x_p) = Σ_{j=0}^{J} w_kj φ_j(x_p)

    φ_j(x_p) = exp{−(1/2)(x_p − μ_j)ᵀ Σ_j⁻¹ (x_p − μ_j)}

  μ_j: function centers
  Σ_j: covariance matrices
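The elliptical basis function above can be sketched directly (the function name and the diagonal covariance values are mine); unlike a spherical RBF, its response depends on the direction of x − μ:

```python
import numpy as np

def ebf_phi(x, mu, Sigma):
    """phi_j(x) = exp(-0.5 * (x - mu)^T Sigma^{-1} (x - mu))."""
    diff = x - mu
    return float(np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff)))

mu = np.array([0.0, 0.0])
Sigma = np.array([[2.0, 0.0],
                  [0.0, 0.5]])           # ellipse: wide along x1, narrow along x2
a = ebf_phi(np.array([1.0, 0.0]), mu, Sigma)   # unit step along the wide axis
b = ebf_phi(np.array([0.0, 1.0]), mu, Sigma)   # unit step along the narrow axis
```

Here a = exp(−0.25) > b = exp(−1): the function decays more slowly along the wide axis of the ellipse.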

• K-means and sample covariance

  K-means:

    μ_j = (1/N_j) Σ_{x∈χ_j} x,   where x ∈ χ_j if ‖x − μ_j‖ ≤ ‖x − μ_k‖ ∀k

  Sample covariance:

    Σ_j = (1/N_j) Σ_{x∈χ_j} (x − μ_j)(x − μ_j)ᵀ

• Alternatively, the centers μ_j and covariances Σ_j can be estimated by the EM algorithm.
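The per-cluster sample covariance can be sketched as follows (the function name and the two elongated toy clusters are mine):

```python
import numpy as np

def cluster_covariances(X, mu):
    """Assign each sample to its nearest center, then compute each cluster's
    sample covariance Sigma_j = (1/N_j) * sum (x - mean)(x - mean)^T."""
    lab = np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=2).argmin(axis=1)
    Sig = []
    for j in range(len(mu)):
        Xj = X[lab == j]
        diff = Xj - Xj.mean(axis=0)
        Sig.append(diff.T @ diff / len(Xj))
    return lab, Sig

rng = np.random.default_rng(0)
# cluster 0: wide in x1, narrow in x2; cluster 1: the opposite
X = np.vstack([rng.normal([0, 0], [1.0, 0.2], size=(200, 2)),
               rng.normal([5, 5], [0.2, 1.0], size=(200, 2))])
mu = np.array([[0.0, 0.0], [5.0, 5.0]])
lab, Sig = cluster_covariances(X, mu)
```

The recovered covariances reflect each cluster's shape, which is exactly what the elliptical basis functions exploit.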
EBF vs. RBF networks

[Figure: two-class data (Class 1, Class 2) plotted on −3…3 in both axes; left: decision boundary of an RBFN with 4 centers; right: decision boundary of an EBFN with 4 centers]
Elliptical Basis Function Networks

[Figure: EBF network's output — surface plot of output 1 of an EBF network (bias, no rescale, gamma=1), data file 'nxor.ebf 4.Y.N.1.dat', over the input square −3…3]
RBFN for Pattern Classification

[Figure: an MLP separates the classes with a hyperplane; an RBF network separates them with localized kernel functions]

• The probability density function (also called conditional density function or likelihood) of the k-th class is p(x | Ck).
• According to Bayes' theorem, the posterior probability is

    P(Ck | x) = p(x | Ck) P(Ck) / p(x)

  where P(Ck) is the prior probability and

    p(x) = Σ_r p(x | Cr) P(Cr)

• It is possible to use a common pool of M basis functions, labeled by an index j, to represent all of the class-conditional densities, i.e.

    p(x | Ck) = Σ_{j=1}^{M} p(x | j) P(j | Ck)
[Figure: p(x | Ck) formed by mixing the basis densities p(x | 1), p(x | 2), …, p(x | M) with mixing weights P(j | Ck)]

    p(x | Ck) = Σ_{j=1}^{M} p(x | j) P(j | Ck)

• The unconditional density becomes

    p(x) = Σ_k Σ_{j=1}^{M} p(x | j) P(j | Ck) P(Ck)
         = Σ_{j=1}^{M} p(x | j) Σ_k P(j | Ck) P(Ck)
         = Σ_{j=1}^{M} p(x | j) P(j)

• Hence the posterior probability is

    P(Ck | x) = Σ_{j=1}^{M} p(x | j) P(j | Ck) P(Ck) / Σ_{j'=1}^{M} p(x | j') P(j')
P  j | Ck P Ck 


P j 
j 1
M

p  x | j P  j 
M
 '

p
x
 | j P j ' 
j 1

  P Ck | j P  j | x 
M
j 1

  wkj  j  x 
M
j 1
No bias term


Hidden node’s output  j ( x )  P( j | x ) : posterior prob. of the j-th set of
features in the input .
weight wkj  P(Ck | j ) : posterior prob. of class membership, given
the presence of the j- th set of features .
4/15/2020
RBF Networks
M.W. Mak
28
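The identity above can be checked numerically: computing P(Ck | x) directly via Bayes' theorem agrees with the RBF form Σ_j w_kj φ_j(x). A minimal sketch with my own hypothetical mixture numbers (M = 3 shared basis densities, 2 classes, and arbitrary values of p(x | j) at one fixed x):

```python
import numpy as np

rng = np.random.default_rng(0)
M, n_class = 3, 2
P_C = np.array([0.4, 0.6])                              # priors P(Ck)
P_j_given_C = rng.dirichlet(np.ones(M), size=n_class)   # P(j | Ck), each row sums to 1
p_x_given_j = np.array([0.5, 1.2, 0.1])                 # p(x | j) at some fixed x

# direct route: Bayes on the class-conditional mixtures
p_x_given_C = P_j_given_C @ p_x_given_j                 # p(x | Ck) = sum_j p(x|j) P(j|Ck)
post_direct = p_x_given_C * P_C / (p_x_given_C * P_C).sum()

# RBF route: P(Ck | x) = sum_j w_kj phi_j(x)
P_j = P_C @ P_j_given_C                                 # P(j) = sum_k P(j|Ck) P(Ck)
phi = p_x_given_j * P_j / (p_x_given_j * P_j).sum()     # phi_j(x) = P(j | x)
w = P_j_given_C * P_C[:, None] / P_j[None, :]           # w_kj = P(Ck | j)
post_rbf = w @ phi
```

Both routes give the same posterior, confirming that the RBF network with these weights computes exact class posteriors.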
Comparison of RBF and MLP

                          RBF networks                        MLP
Learning speed            Very fast                           Very slow
Convergence               Almost guaranteed                   Not guaranteed
Response time             Slow                                Fast
Memory requirement        Very large                          Small
Hardware implementation   IBM ZISC036                         Voice Direct 364
                          Nestor Ni1000                       www.sensoryinc.com
                          www-5.ibm.com/fr/cdlab/zisc.html
Generalization            Usually better                      Usually poorer

To learn more about NN hardware, see
http://www.particle.kth.se/~lindsey/HardwareNNWCourse/home.html