Feature extraction using
fuzzy complete linear
discriminant analysis
Presenter: Cui Yan
2012. 4. 26
Outline
1. The fuzzy K-nearest neighbor classifier (FKNN)
2. The fuzzy complete linear discriminant analysis
3. Experiments
The fuzzy K-nearest neighbor classifier (FKNN)
The K-nearest neighbor classifier (KNN)
Each sample should be classified similarly to its surrounding samples; therefore, an unknown sample can be predicted from the classes of its nearest neighbors.
KNN classifies an unknown sample based on its k nearest neighbors with known class labels.
FKNN
Given a sample set X = {x1, x2, ..., xn}, a fuzzy M-class partition of these vectors specifies the membership degree of each sample with respect to each class. The membership degree of a training vector x_j to each of the M classes is denoted u_ij, which is computed by the following steps:
Step 1: Compute the distance matrix between pairs of feature vectors in the training set.
Step 2: Set the diagonal elements of this matrix to infinity (in practice, place large numeric values there).
Step 3: Sort each column of the distance matrix in ascending order. Collect the class labels of the patterns located in the closest neighborhood of the pattern under consideration (as we are concerned with k neighbors, this returns a list of k integers).
Step 4: Compute the membership grade to class i
for j-th pattern using the expression proposed
in [1].
0.51 0.49* ( nij k ) if i  thelabel of the j-th pattern.
uij  
0.49* ( nij k )
if i  thelabel of the j-th pattern.
[1] J.M. Keller, M.R. Gray, J.A. Givens, A fuzzy k-nearest neighbor algorithm, IEEE Trans. Syst. Man Cybern. 15(4) (1985) 580-585.
An example of FKNN

No.  Feature1  Feature2  class
1    0.2000    0.3000    1
2    0.3000    0.2000    1
3    0.4000    0.3000    1
4    0.5000    0.5000    2
5    0.6000    0.4000    2
6    0.5000    0.6000    2
7    0.7000    0.3000    3
8    0.8000    0.4000    3
9    0.7000    0.5000    3
Step 1: the pairwise distance matrix (symmetric; in Step 2 its diagonal is set to infinity):

No.      1       2       3       4       5       6       7       8       9
1    0       0.1414  0.2000  0.3606  0.4123  0.4243  0.5000  0.6083  0.5385
2    0.1414  0       0.1414  0.3606  0.3606  0.4472  0.4123  0.5385  0.5000
3    0.2000  0.1414  0       0.2236  0.2236  0.3162  0.3000  0.4123  0.3606
4    0.3606  0.3606  0.2236  0       0.1414  0.1000  0.2828  0.3162  0.2000
5    0.4123  0.3606  0.2236  0.1414  0       0.2236  0.1414  0.2000  0.1414
6    0.4243  0.4472  0.3162  0.1000  0.2236  0       0.3606  0.3606  0.2236
7    0.5000  0.4123  0.3000  0.2828  0.1414  0.3606  0       0.1414  0.2000
8    0.6083  0.5385  0.4123  0.3162  0.2000  0.3606  0.1414  0       0.1414
9    0.5385  0.5000  0.3606  0.2000  0.1414  0.2236  0.2000  0.1414  0
Step 3: sorting each column of the distance matrix in ascending order orders every sample's neighbors from nearest to farthest (the sample itself excluded; ties listed in the order used on the slides):

Sample 1: 2, 3, 4, 5, 6, 7, 9, 8
Sample 2: 1, 3, 4, 5, 7, 6, 9, 8
Sample 3: 2, 1, 5, 4, 7, 6, 9, 8
Sample 4: 6, 5, 9, 3, 7, 8, 1, 2
Sample 5: 4, 9, 7, 8, 6, 3, 2, 1
Sample 6: 4, 5, 9, 3, 7, 8, 1, 2
Sample 7: 5, 8, 9, 4, 3, 6, 2, 1
Sample 8: 9, 7, 5, 4, 6, 3, 2, 1
Sample 9: 5, 8, 4, 7, 6, 3, 2, 1
Collecting the class labels of the sorted neighbors and setting k = 3, the labels of each sample's three nearest neighbors are:

Sample:       1      2      3      4      5      6      7      8      9
3-NN labels:  1,1,2  1,1,2  1,1,2  2,2,3  2,3,3  2,2,3  2,3,3  3,3,2  2,3,2
Step 4: the resulting membership degree matrix U (rows: classes i = 1, 2, 3; columns: samples j = 1, ..., 9):

class 1:  0.8367  0.8367  0.8367  0       0       0       0       0       0
class 2:  0.1633  0.1633  0.1633  0.8367  0.6733  0.8367  0.1633  0.1633  0.3267
class 3:  0       0       0       0.1633  0.3267  0.1633  0.8367  0.8367  0.6733

For example, the three nearest neighbors of sample 1 carry labels 1, 1, 2, so u_11 = 0.51 + 0.49*(2/3) = 0.8367 and u_21 = 0.49*(1/3) = 0.1633.
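The four FKNN steps can be sketched in Python with NumPy (a minimal sketch, not the authors' code; ties between equidistant neighbors are broken by stable sort order, which reproduces the choices made in this example):

```python
import numpy as np

def fknn_memberships(X, y, k=3):
    """Fuzzy membership degrees via the FKNN rule of Keller et al. [1].

    X : (n, d) feature matrix, y : (n,) integer class labels.
    Returns U of shape (M, n) with U[i, j] = u_ij, classes in sorted order.
    """
    n = X.shape[0]
    classes = np.unique(y)
    # Step 1: pairwise Euclidean distance matrix
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Step 2: exclude each sample from its own neighborhood
    np.fill_diagonal(D, np.inf)
    U = np.zeros((len(classes), n))
    for j in range(n):
        # Step 3: class labels of the k nearest neighbors of sample j
        nn_labels = y[np.argsort(D[:, j], kind="stable")[:k]]
        for i, c in enumerate(classes):
            n_ij = np.count_nonzero(nn_labels == c)
            # Step 4: Keller's membership expression
            U[i, j] = 0.49 * n_ij / k + (0.51 if y[j] == c else 0.0)
    return U
```

Running this on the nine-sample data with k = 3 reproduces the membership degree matrix of the example; each column of U sums to 1.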
The fuzzy complete linear
discriminant analysis
For the training set X = {x1, x2, ..., xn}, we define the i-th class mean by weighting with the fuzzy membership degrees as

    m_i = ( Σ_{j=1}^{n} u_ij x_j ) / ( Σ_{j=1}^{n} u_ij ),  i = 1, 2, ..., c,  (1)

and the total mean as

    m = (1/n) Σ_{i=1}^{n} x_i.  (2)
Incorporating the fuzzy membership degrees, the between-class, within-class and total-class fuzzy scatter matrices of the samples are defined as

    S_bF = Σ_{i=1}^{c} Σ_{j=1}^{n}  u_ij (m_i - m)(m_i - m)^T
    S_wF = Σ_{i=1}^{c} Σ_{j∈N_i} u_ij (x_j - m_i)(x_j - m_i)^T        (3)
    S_tF = Σ_{i=1}^{c} Σ_{j∈N_i} u_ij (x_j - m)(x_j - m)^T

where N_i denotes the index set of the samples in class i.
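Eqs. (1)-(3) translate directly into code (a sketch; the rows of U are assumed to follow the sorted class labels). With crisp memberships the formulas reduce to the classical scatter matrices, so the identity S_bF + S_wF = S_tF serves as a sanity check:

```python
import numpy as np

def fuzzy_scatter_matrices(X, y, U):
    """Fuzzy class means (Eq. 1), total mean (Eq. 2), scatter matrices (Eq. 3).

    X : (n, d) samples, y : (n,) labels, U : (c, n) membership degrees with
    rows ordered as np.unique(y).
    """
    n, d = X.shape
    classes = np.unique(y)
    # Eq. (1): fuzzy class means, one row per class
    M = (U @ X) / U.sum(axis=1, keepdims=True)
    # Eq. (2): total mean
    m = X.mean(axis=0)
    SbF = np.zeros((d, d)); SwF = np.zeros((d, d)); StF = np.zeros((d, d))
    for i, ci in enumerate(classes):
        db = M[i] - m
        SbF += U[i].sum() * np.outer(db, db)   # Σ_j u_ij (m_i-m)(m_i-m)^T
        for j in np.flatnonzero(y == ci):      # j ∈ N_i
            dw = X[j] - M[i]
            SwF += U[i, j] * np.outer(dw, dw)
            dt = X[j] - m
            StF += U[i, j] * np.outer(dt, dt)
    return M, m, SbF, SwF, StF
```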
Algorithm of the fuzzy complete linear discriminant analysis
step 1: Calculate the membership degree matrix U by the FKNN algorithm.
step 2: According to Eqs. (1)-(3), work out the between-class, within-class and total-class fuzzy scatter matrices.
step 3: Work out the orthogonal eigenvectors p1, ..., pl of the total-class fuzzy scatter matrix S_tF corresponding to the positive eigenvalues.
step 4: Let P = (p1, ..., pl), Ŝ_wF = P^T S_wF P and Ŝ_bF = P^T S_bF P; work out the orthogonal eigenvectors g1, ..., gr of Ŝ_wF corresponding to the zero eigenvalues.
step 5: Let P1 = (g1, ..., gr) and S̃_bF = P1^T Ŝ_bF P1; work out the orthogonal eigenvectors v1, ..., vr of S̃_bF and calculate the irregular discriminant vectors by w_ir = P P1 v.
step 6: Work out the orthogonal eigenvectors q1, ..., qs of Ŝ_wF corresponding to the non-zero eigenvalues.
step 7: Let P2 = (q1, ..., qs), S̄_wF = P2^T Ŝ_wF P2 and S̄_bF = P2^T Ŝ_bF P2; work out the optimal discriminant vectors v_{r+1}, ..., v_{r+s} by Fisher LDA and calculate the regular discriminant vectors by w_r = P P2 v.
step 8 (recognition): Project all samples onto the obtained discriminant vectors and classify.
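Steps 3-7 can be sketched with plain NumPy eigendecompositions (a sketch under my own reading of the steps, not the authors' implementation; the tolerance `rtol` for deciding which eigenvalues count as zero is an assumption):

```python
import numpy as np

def fuzzy_complete_lda(SbF, SwF, StF, rtol=1e-10):
    """Extract irregular and regular discriminant vectors (steps 3-7)."""
    d = StF.shape[0]
    # step 3: orthonormal eigenvectors of StF with positive eigenvalues
    tv, tvec = np.linalg.eigh(StF)
    P = tvec[:, tv > rtol * tv.max()]
    Sw_hat = P.T @ SwF @ P
    Sb_hat = P.T @ SbF @ P
    # steps 4/6: split Sw_hat into its null space (P1) and range space (P2)
    wv, wvec = np.linalg.eigh(Sw_hat)
    cut = rtol * max(wv.max(), 1.0)
    P1 = wvec[:, wv <= cut]
    P2 = wvec[:, wv > cut]
    # step 5: irregular discriminant vectors from the null space of Sw_hat
    irregular = np.zeros((d, 0))
    if P1.shape[1] > 0:
        bv, bvec = np.linalg.eigh(P1.T @ Sb_hat @ P1)
        irregular = P @ P1 @ bvec[:, np.argsort(bv)[::-1]]
    # step 7: regular discriminant vectors by Fisher LDA in the range space
    Sw_bar = P2.T @ Sw_hat @ P2
    Sb_bar = P2.T @ Sb_hat @ P2
    fv, fvec = np.linalg.eig(np.linalg.solve(Sw_bar, Sb_bar))
    order = np.argsort(fv.real)[::-1]
    regular = P @ P2 @ fvec.real[:, order]
    return irregular, regular
```

By construction, each irregular vector w satisfies w^T S_wF w ≈ 0 while w^T S_bF w > 0, which is exactly the "irregular" discriminant information recovered from the null space of the within-class scatter.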
Experiments
• We compare Fuzzy-CLDA with CLDA, UWLDA, FLDA, Fuzzy Fisherface and FIFDA on 3 data sets from the UCI repository. The characteristics of the three data sets can be found at http://archive.ics.uci.edu/ml/datasets.
• Each data set is randomly split into a training set and a test set with the ratio 1:4. Experiments are repeated 25 times, and the mean prediction error rate over the repetitions is used as the performance measure. NCC with the L2 norm is adopted to classify the test samples.
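The classification rule in the last bullet can be sketched as follows (a minimal sketch; I read NCC as a nearest-class-center rule that assigns each test sample to the class whose mean is closest in the L2 norm):

```python
import numpy as np

def ncc_classify(X_train, y_train, X_test):
    """Nearest-class-center classification with the L2 norm."""
    classes = np.unique(y_train)
    # one center (mean vector) per class
    centers = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    # L2 distance from every test sample to every class center
    d = np.linalg.norm(X_test[:, None, :] - centers[None, :, :], axis=-1)
    return classes[np.argmin(d, axis=1)]
```

In the experiments the projected (discriminant) features would play the role of `X_train` and `X_test`.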
Thanks!