Ranking Projection

Zhi-Sheng Chen
2010/02/03
Multi-Media Information Lab, NTHU
Introduction

Ranking is everywhere
- Retrieval for music, image, video, sound, etc.
- Scoring for speech, multimedia, etc.

Find a projection that
- Preserves the given order of the data
- Reduces the dimensionality of the data
The Basic Criteria of Linear Ranking Projection

Given the ranking order (c1, c2, c3, c4), in the projection space we require the criteria

  d(c1, c2) < d(c1, c3)  ⇔  d(c1, c2) − d(c1, c3) < 0  ≡  d_123 < 0
  d(c1, c2) < d(c1, c4)  ⇔  d(c1, c2) − d(c1, c4) < 0  ≡  d_124 < 0
  d(c1, c3) < d(c1, c4)  ⇔  d(c1, c3) − d(c1, c4) < 0  ≡  d_134 < 0
  d(c2, c3) < d(c2, c4)  ⇔  d(c2, c3) − d(c2, c4) < 0  ≡  d_234 < 0

  min J = d_123 + d_124 + d_134 + d_234

where d(·, ·) is the distance measure between two classes; in our case we use the difference of the means.
The Basic Criteria of Linear Ranking Projection

Let a be the projection vector; the previous criteria can be rewritten as

  min J(a)   subject to a^T a = 1

  J(a) = a^T [ (M_12 − M_13) + (M_12 − M_14) + (M_13 − M_14) + (M_23 − M_24) ] a

  M_ij = (m_i − m_j)(m_i − m_j)^T
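Since each M_ij is symmetric, J(a) = a^T S a for a symmetric matrix S, and the minimizer under a^T a = 1 is the eigenvector of S with the smallest eigenvalue. A minimal sketch of this step (the class means below are made-up toy values, not data from the experiments):

```python
import numpy as np

def ranking_projection(means):
    """Projection vector for the order (c1, c2, c3, c4):
    the smallest eigenvector of the criterion matrix S."""
    def M(i, j):
        d = means[i] - means[j]
        return np.outer(d, d)
    # S = (M12 - M13) + (M12 - M14) + (M13 - M14) + (M23 - M24)
    S = (M(0, 1) - M(0, 2)) + (M(0, 1) - M(0, 3)) \
        + (M(0, 2) - M(0, 3)) + (M(1, 2) - M(1, 3))
    vals, vecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    return vecs[:, 0]               # unit-norm minimizer of a^T S a

# hypothetical class means for illustration
means = [np.array([0.0, 0.0]), np.array([1.0, 0.5]),
         np.array([2.0, 1.0]), np.array([4.0, 1.0])]
a = ranking_projection(means)
```

`numpy.linalg.eigh` returns orthonormal eigenvectors, so the constraint a^T a = 1 holds automatically.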
The Ordinal Weights

Roughly speaking, these distance measures have different importance according to their order.

Ex:
- d_123 is more important than d_124
- d_124 is more important than d_134
- d_134 is more important than d_234
- d_123 is more important than d_234
- How about d_239 and d_245?

Instead of finding precise rules for the ordinal weights, we use a rough ordinal weighting rule.
The Ordinal Weights

Given a ranking order, we assign a score to each term; the largest and smallest scores indicate the top and bottom terms of the order.

Simply define the ordinal weight function as

  w(s1, s2, s3) = s1 + s2 + s3

So the weighted criteria become

  min J(a)   subject to a^T a = 1

  J(a) = a^T [ w(s1, s2, s3)(M_12 − M_13) + w(s1, s2, s4)(M_12 − M_14)
             + w(s1, s3, s4)(M_13 − M_14) + w(s2, s3, s4)(M_23 − M_24) ] a

  M_ij = (m_i − m_j)(m_i − m_j)^T
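As a concrete check of the weight function, with hypothetical scores s = 4, 3, 2, 1 assigned to the classes in the order (c1, c2, c3, c4) (the slides do not fix particular score values):

```python
def w(s1, s2, s3):
    # ordinal weight: the sum of the scores of the classes involved
    return s1 + s2 + s3

scores = {1: 4, 2: 3, 3: 2, 4: 1}  # hypothetical scores, top class first

# d_123 involves classes 1, 2, 3; d_234 involves classes 2, 3, 4
w123 = w(scores[1], scores[2], scores[3])  # 9: the largest weight
w234 = w(scores[2], scores[3], scores[4])  # 6: the smallest weight
```

This reproduces the ordinal importance above: d_123 receives the largest weight and d_234 the smallest.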
Some Results for Weighted Criteria

[Figure: scatter plots of classes C1–C4 for the order (c1, c2, c3, c4): the original data and the ranking projection.]
Some Results for Weighted Criteria

[Figure: scatter plots of classes C1–C4 for the order (c3, c1, c4, c2): the original data and the ranking projection.]

For projection onto more than one dimension, the solution becomes selecting the k eigenvectors corresponding to the k smallest eigenvalues.
Class with several groups

We may not care about the order of some groups of the data points within the class.

[Figure: scatter plot of classes C1–C3, each made up of several separated groups.]
Grouped Classes

For the above case, let the order be (c1, c2, c3); then the criteria become

  min J(a)   subject to a^T a = 1

  J(a) = a^T [ Σ_{i,j,k,p,q,r,s} w(s_i, s_j, s_k)(M_ij,pq − M_ik,rs) ] a

  M_ij,pq = (m_i,p − m_j,q)(m_i,p − m_j,q)^T

where m_i,p is the mean vector of group p in class i.
Grouped Classes

Result

[Figure: scatter plots of the grouped classes C1–C3 before and after the ranking projection.]
Reweighting function

Take a look at this case: we have a problem here.

[Figure: scatter plots of a grouped-class case (classes C1–C3) where the criteria above yield a poor projection.]

However, the proper projection is ...
Reweighting function

Solved by reweighting:
- Every group in the same class is weighted by its distance from the mean of the class
- Farther groups have larger weights

The modified criteria become

  min J(a)   subject to a^T a = 1

  J(a) = a^T [ Σ_{i,j,k,p,q,r,s} w(s_i, s_j, s_k)(rw(i, j, p, q) M_ij,pq − rw(i, k, r, s) M_ik,rs) ] a

  M_ij,pq = (m_i,p − m_j,q)(m_i,p − m_j,q)^T

  rw(i, j, p, q) = ||m_i,p − m_i||^2 + ||m_j,q − m_j||^2
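A minimal sketch of the re-weight, assuming (this is an interpretation of the slide's formula) that the two squared distances are combined additively; the means below are hypothetical:

```python
import numpy as np

def rw(m_ip, m_i, m_jq, m_j):
    # farther groups get larger weights: squared distance of each
    # group mean from its class mean, summed over the pair
    return np.sum((m_ip - m_i) ** 2) + np.sum((m_jq - m_j) ** 2)

# hypothetical group means (m_ip, m_jq) and class means (m_i, m_j)
val = rw(np.array([2.0, 0.0]), np.array([0.0, 0.0]),
         np.array([1.0, 1.0]), np.array([1.0, 0.0]))
# val = 4.0 + 1.0 = 5.0
```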
Reweighting function

[Figure: scatter plots of classes C1–C3 before and after the reweighted ranking projection.]
Non-linear Ranking Projection

It is impossible to find a linear projection that yields the order (c3, c2, c1, c4).

[Figure: scatter plot of classes C1–C4 for which no linear projection preserves this order.]
General Idea of Kernel

- Transform the data into a high-dimensional space through φ(·), and do the ranking projection in that space.
- The projection algorithm can be expressed using only dot products, i.e. φ(·)^T φ(·).
- Hence we can define the kernel function

    k(x, y) = φ(x)^T φ(y)

  The matrix of values k(x_i, x_j) is called the Gram matrix (the discussion of the validity of the kernel is skipped here).
- Several kernels: polynomial kernel, Gaussian kernel, radial basis kernel, etc.
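A minimal sketch of the two kernels used in the later experiments (the parameter values `degree`, `c`, and `gamma` are illustrative choices, not values from the slides):

```python
import numpy as np

def polynomial_kernel(X, Y, degree=2, c=1.0):
    # k(x, y) = (x^T y + c)^degree
    return (X @ Y.T + c) ** degree

def gaussian_kernel(X, Y, gamma=0.5):
    # k(x, y) = exp(-gamma * ||x - y||^2)
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

# the Gram matrix of a small toy sample
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = gaussian_kernel(X, X)   # symmetric, ones on the diagonal
```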
Non-linear Ranking Projection

Use the "kernelized" approach to find a non-linear projection. Consider the criteria of the basic linear case:

  min J(a)   subject to a^T a = 1

  J(a) = a^T [ (M_12 − M_13) + (M_12 − M_14) + (M_13 − M_14) + (M_23 − M_24) ] a

  M_ij = (m_i − m_j)(m_i − m_j)^T

Similar to kernelized LDA (KDA), we can let the projection vector be

  a = Σ_{i=1}^{N} α_i φ(x_i)

with the class means in the feature space

  m_i = (1/N_i) Σ_{j=1}^{N_i} φ(x_j^(i))
Non-linear Ranking Projection

Then

  a^T m_i = ( Σ_{j=1}^{N} α_j φ(x_j) )^T ( (1/N_i) Σ_{k=1}^{N_i} φ(x_k^(i)) )
          = (1/N_i) Σ_{j=1}^{N} α_j Σ_{k=1}^{N_i} φ(x_j)^T φ(x_k^(i))
          = (1/N_i) Σ_{j=1}^{N} α_j Σ_{k=1}^{N_i} k(x_j, x_k^(i))
          = Σ_{j=1}^{N} α_j μ_i,j
          = α^T μ_i

where μ_i,j = (1/N_i) Σ_{k=1}^{N_i} k(x_j, x_k^(i)).

Thus

  a^T M_12 a = a^T (m_1 − m_2)(m_1 − m_2)^T a
             = α^T (μ_1 − μ_2)(μ_1 − μ_2)^T α = α^T U_12 α
Non-linear Ranking Projection

The kernelized criteria become

  min J(α)   subject to α^T α = 1

  J(α) = α^T [ (U_12 − U_13) + (U_12 − U_14) + (U_13 − U_14) + (U_23 − U_24) ] α

  U_ij = (μ_i − μ_j)(μ_i − μ_j)^T

Extending to ordinal weighting and grouped classes is straightforward; extending to re-weighting is more delicate.
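The kernelized solution can then be computed from the Gram matrix alone. A sketch under the definitions above (the toy data and the linear kernel below are illustrative; any valid kernel works):

```python
import numpy as np

def kernel_ranking_projection(K, class_idx):
    """alpha for the order (c1, c2, c3, c4), given the Gram matrix K
    and a list of index arrays, one per class."""
    # mu_i[j] = (1/N_i) * sum_k k(x_j, x_k^(i))
    mus = [K[:, idx].mean(axis=1) for idx in class_idx]
    def U(i, j):
        d = mus[i] - mus[j]
        return np.outer(d, d)
    S = (U(0, 1) - U(0, 2)) + (U(0, 1) - U(0, 3)) \
        + (U(0, 2) - U(0, 3)) + (U(1, 2) - U(1, 3))
    vals, vecs = np.linalg.eigh(S)   # ascending eigenvalues
    return vecs[:, 0]                # expansion coefficients of a

# toy example: linear kernel on random points, 4 classes of 3 points each
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 2))
K = X @ X.T
alpha = kernel_ranking_projection(K, [np.arange(3 * i, 3 * i + 3) for i in range(4)])
```

New points x are then projected as Σ_j α_j k(x_j, x), so only kernel evaluations are ever needed.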
Results

Experiment 1. Order: (c3, c1, c4, c2)

[Figure: the original data (classes C1–C4) and the non-linear ranking projections with a polynomial kernel of degree 2 and of degree 3.]
Results

Order: (c3, c1, c4, c2)

[Figure: the original data (classes C1–C4) and the non-linear ranking projection with a Gaussian kernel.]
Results

Experiment 2. Order: (c3, c2, c1)

[Figure: the original data (classes C1–C3) and the projections with a polynomial kernel (degree 2) and a Gaussian kernel.]
Results

Experiment 3. Order: (c3, c2, c1)

[Figure: the original data (classes C1–C3) and the projections with a polynomial kernel (degree 2) and a Gaussian kernel.]
Results

Experiment 4. Order: (c3, c2, c1)

[Figure: the original data (classes C1–C3) and the projections with a polynomial kernel (degree 2) and a Gaussian kernel.]
Results

Airplane dataset
- 214 data points
- Feature dimension is 13
- Scores: 1 to 7

[Figure: histogram of the score distribution over scores 1–7.]
Results

Linear ranking projection

[Figure: the projected airplane data plotted against scores 1–7.]
Results

[Figure: kernelized projections of the airplane data with polynomial kernels of degree 2, 5, and 10, plotted against scores 1–7.]
Results

- Data points are projected onto the same points due to limited computer precision.
- The order is preserved well.

[Figure: projection of the airplane data with a Gaussian kernel, plotted against scores 1–7.]
Future Work

Some work still needs to be done:
- For grouped classes, the method is time consuming
  - We can use "kernelized" k-means clustering to reduce the number of data points
- The re-weighting function in the high-dimensional space (kernel approach) has not been done yet
- The precision problem in the kernelized approach

Potential work:
- Derive a probabilistic model?
- How to cope with "missing" data (i.e., some feature dimensions are missing)?
- Which kernel is appropriate?
Questions?