Relevance Aggregation Projections for Image Retrieval
CIVR 2008

Wei Liu, Wei Jiang, Shih-Fu Chang
Columbia University
wliu@ee.columbia.edu
Outline
- Motivations and Formulation
- Our Approach: Relevance Aggregation Projections
- Experimental Results
- Conclusions
Motivations and Formulation
- Relevance feedback
  - closes the semantic gap.
  - explores knowledge about the user's intention.
  - selects features and refines models.
- Relevance feedback mechanism
  - The user selects a query image.
  - The system presents the highest-ranked images to the user, excluding already-labeled ones.
  - In each iteration, the user marks images as "relevant" (positive) or "irrelevant" (negative).
  - The system gradually refines the retrieval results.
Problems
- Small sample learning – the number of labeled images is extremely small.
- High dimensionality – the feature dimension is > 100, while the number of labeled samples is < 100.
- Asymmetry – relevant data are coherent, whereas irrelevant data are diverse.
Asymmetry in CBIR
[Figure: a query image shown with its relevant images, which are visually coherent, and its irrelevant images, which are visually diverse]
Possible Solutions
- Asymmetry:
  [Figure: in the learned projection, relevant images are aggregated around the query while irrelevant images are kept outside a margin of 1 on either side]
- Small sample learning → semi-supervised learning
- Curse of dimensionality → dimensionality reduction
Previous Work
Method (venue)      labeled   unlabeled   asymmetry   dimension bound
LPP  (NIPS'03)                √                       d
ARE  (ACM MM'05)    √         √           √           l-1
SSP  (ACM MM'06)    √         √                       l-1
SR   (ACM MM'07)    √         √                       2

image dim: d, total sample #: n, labeled sample #: l
In CBIR, n > d > l
Disadvantages
- LPP: unsupervised.
- SSP and SR: fail to engage the asymmetry.
  - SSP emphasizes the irrelevant set.
  - SR treats the relevant and irrelevant sets equally.
- ARE, SSP and SR: produce very low-dimensional subspaces (at most l-1 dimensions); SR in particular yields only a 2D subspace.
Symbols
- n: total number of samples, l: number of labeled samples
- d: original dimension, r: reduced dimension
- X = [x_1, ..., x_l, x_{l+1}, ..., x_n] ∈ R^{d×n}: all samples
- X_l = [x_1, ..., x_l] ∈ R^{d×l}: labeled samples
- F^+: relevant set, F^-: irrelevant set
- l^+: number of relevant samples, l^-: number of irrelevant samples
- A ∈ R^{d×r}: subspace, a ∈ R^d: projection vector
- G(V, E, W): graph, L = D - W: graph Laplacian
Graph Construction
- Build a k-NN graph with weights

  W_ij = exp(-||x_i - x_j||^2 / σ^2)   if x_i ∈ N_k(x_j) or x_j ∈ N_k(x_i)
  W_ij = 0                             otherwise

- That is, an edge is established if x_i is among the k-NNs of x_j or x_j is among the k-NNs of x_i.
- The graph Laplacian L = D - W ∈ R^{n×n} is used in the smoothness regularizer; a small sketch follows.
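Below is a minimal numpy sketch of this construction, assuming Euclidean distances and samples stored as columns; the function name build_knn_graph and the default values of k and σ are illustrative, not taken from the paper.

```python
import numpy as np

def build_knn_graph(X, k=5, sigma=1.0):
    """X: (d, n) data matrix with samples as columns.
    Returns the symmetric k-NN affinity W and the graph Laplacian L = D - W."""
    n = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    dist2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)   # pairwise squared distances
    np.fill_diagonal(dist2, np.inf)                        # exclude self-matches
    knn = np.argsort(dist2, axis=1)[:, :k]                 # k nearest neighbours of each sample
    W = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    cols = knn.ravel()
    W[rows, cols] = np.exp(-dist2[rows, cols] / sigma ** 2)
    W = np.maximum(W, W.T)              # edge if x_i is a k-NN of x_j OR x_j is a k-NN of x_i
    L = np.diag(W.sum(axis=1)) - W      # graph Laplacian L = D - W
    return W, L
```

For a collection as large as 10,000 images a sparse neighbour search would be preferable; the dense version above only mirrors the formula.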
Our Approach
  min_{A ∈ R^{d×r}}  tr(A^T X L X^T A)                                  (1.1)
  s.t.  A^T x_i = Σ_{j∈F^+} A^T x_j / l^+,             ∀ i ∈ F^+         (1.2)
        ||A^T (x_i - Σ_{j∈F^+} x_j / l^+)||^2 ≥ r,     ∀ i ∈ F^-         (1.3)

Target – the subspace A reduces the raw data from d dimensions to r dimensions.
Obj (1.1) – minimize the local scatter, using both labeled and unlabeled data.
Cons (1.2) – aggregate the positive data (in F^+) to the positive center.
Cons (1.3) – push the negative data (in F^-) far away from the positive center, by a squared distance of at least r.
Constraints (1.2) and (1.3) together address the asymmetry in CBIR.
Core Idea: Relevance Aggregation
- An ideal subspace is one in which the relevant examples are aggregated into a single point while the irrelevant examples are simultaneously separated from it by a large margin.
Relevance Aggregation Projections
- We transform eq. (1) into eq. (2) in terms of each column vector a of A (a is a projection vector):

  min_{a ∈ R^d}  a^T X L X^T a                      (2.1)
  s.t.  a^T x_i = a^T c^+,           ∀ i ∈ F^+       (2.2)
        (a^T (x_i - c^+))^2 ≥ 1,     ∀ i ∈ F^-       (2.3)

  where c^+ = Σ_{j∈F^+} x_j / l^+ is the positive center.
Solution
- Eq. (2.1)-(2.3) is a quadratically constrained quadratic optimization problem and is thus hard to solve directly.
- We therefore remove the constraints first and minimize the cost function afterwards.
- We adopt a heuristic to construct the solution:
  - Find ideal 1D projections that satisfy the constraints.
  - Remove the constraints and solve for one part of the solution.
  - Solve for the remaining part of the solution.
Solution: Find Ideal Projections
- Run PCA to get the r principal eigenvectors and renormalize them to obtain V = [v_1, ..., v_r] ∈ R^{d×r} such that V^T X X^T V = I.
- For each vector v in V, |v^T x_i - v^T x_j| < 2, i, j = 1, ..., n.
- Form the ideal 1D projections along each projection direction v:

  y_i = v^T c^+        if i ∈ F^+
  y_i = v^T x_i        if i ∈ F^- and |v^T x_i - v^T c^+| ≥ 1
  y_i = v^T c^+ + 1    if i ∈ F^- and 0 ≤ v^T x_i - v^T c^+ < 1      (3)
  y_i = v^T c^+ - 1    if i ∈ F^- and -1 < v^T x_i - v^T c^+ < 0

  y = [y_1, ..., y_l]^T ∈ R^l
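The sketch below illustrates both steps, the rescaled PCA directions and the ideal targets of eq. (3), under the assumption that samples are columns of X and that pos_idx / neg_idx index the relevant and irrelevant labeled samples; all names are illustrative.

```python
import numpy as np

def whitened_pca(X, r):
    """Return V = [v_1, ..., v_r] (d, r) with V^T X X^T V = I.
    Assumes the top r eigenvalues of X X^T are positive."""
    S = X @ X.T
    eigvals, eigvecs = np.linalg.eigh(S)           # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:r]          # top-r principal directions
    U, lam = eigvecs[:, order], eigvals[order]
    return U / np.sqrt(lam)                        # rescale so that v^T X X^T v = 1

def ideal_targets(v, X, pos_idx, neg_idx):
    """Ideal 1D targets y of eq. (3) for one projection direction v."""
    c_pos = X[:, pos_idx].mean(axis=1)             # positive center c^+
    vc = float(v @ c_pos)
    y = []
    for i in pos_idx:                              # relevant samples collapse to v^T c^+
        y.append(vc)
    for i in neg_idx:
        diff = float(v @ X[:, i]) - vc
        if abs(diff) >= 1.0:                       # already outside the unit margin: keep
            y.append(vc + diff)
        elif diff >= 0.0:                          # inside the margin: push to +1
            y.append(vc + 1.0)
        else:                                      # inside the margin: push to -1
            y.append(vc - 1.0)
    return np.array(y)                             # ordered: relevant first, then irrelevant
```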
Solution: Find Ideal Projections
[Figure: the labeled projections v^T X_l ∈ R^{1×l} and the ideal targets y^T ∈ R^{1×l}; samples with |v^T x_i - v^T c^+| > 1 keep their projections (|y_i - v^T c^+| > 1), while samples with |v^T x_i - v^T c^+| ≤ 1 are moved onto the margin (|y_i - v^T c^+| = 1)]
The vector y is formed according to each PCA vector v.
Solution: QR Factorization
- Remove constraints (2.2)-(2.3) by solving the linear system

  X_l^T a = y                                      (4)

- Because l < d, eq. (4) is underdetermined, so the constraints can be satisfied exactly.
- Perform the QR factorization X_l = [Q_1 Q_2] [R; 0] = Q_1 R.
- The solution is written as the sum of a particular solution and a complementary (null-space) solution:

  a = Q_1 b_1 + Q_2 b_2,   where b_1 = (R^T)^{-1} y   (5)
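One way to realize this step in numpy is sketched below; it assumes X_l stacks the labeled samples as columns with d > l and full column rank, and the helper names are ours.

```python
import numpy as np

def qr_split(X_l):
    """Full QR of the labeled data X_l (d, l), d > l: X_l = Q1 R with [Q1 Q2] orthogonal."""
    d, l = X_l.shape
    Q, R_full = np.linalg.qr(X_l, mode="complete")   # Q: (d, d), R_full: (d, l)
    return Q[:, :l], Q[:, l:], R_full[:l, :]         # Q1, Q2, R

def particular_solution(R, y):
    """b1 = (R^T)^{-1} y, so that a = Q1 b1 satisfies X_l^T a = y."""
    return np.linalg.solve(R.T, y)
```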
Solution: Regularization
- We want the final solution not to deviate too much from the PCA solution, so we develop a regularization framework:

  f(a) = ||a - v||^2 + γ a^T X L X^T a             (6)

  where γ > 0 controls the trade-off between staying close to the PCA solution and preserving data locality (the original loss function); the second term behaves as the regularizer.
- Plugging a = Q_1 b_1 + Q_2 b_2 into eq. (6) and minimizing over b_2, we obtain

  b_2 = (I + γ Q_2^T X L X^T Q_2)^{-1} (Q_2^T v - γ Q_2^T X L X^T Q_1 b_1)
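The closed-form update for b_2 translates directly into code; the sketch below assumes S = X L X^T has been precomputed and that gamma is a user-chosen trade-off parameter (names are ours).

```python
import numpy as np

def free_solution(Q1, Q2, S, v, b1, gamma):
    """b2 = (I + gamma Q2^T S Q2)^{-1} (Q2^T v - gamma Q2^T S Q1 b1), with S = X L X^T."""
    m = Q2.shape[1]
    lhs = np.eye(m) + gamma * (Q2.T @ S @ Q2)
    rhs = Q2.T @ v - gamma * (Q2.T @ (S @ (Q1 @ b1)))
    return np.linalg.solve(lhs, rhs)
```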
Algorithm
① Construct a k-NN graph: W, L, S = X L X^T
② PCA initialization: V = [v_1, ..., v_r]
③ QR factorization of X_l: Q_1, Q_2, R
④ Transductive regularization: for j = 1, ..., r
     form y from v_j (eq. 3)
     b_1 = (R^T)^{-1} y
     b_2 = (I + γ Q_2^T S Q_2)^{-1} (Q_2^T v_j - γ Q_2^T S Q_1 b_1)
     a_j = Q_1 b_1 + Q_2 b_2
   end
⑤ Projection: map each sample x to [a_1, ..., a_r]^T x
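Assembling the steps, the sketch below is one possible end-to-end implementation that reuses the helper functions from the earlier sketches (build_knn_graph, whitened_pca, ideal_targets, qr_split, particular_solution, free_solution); the signatures, default parameter values, and the relevant-first ordering of the labeled columns are our assumptions rather than the paper's specification.

```python
import numpy as np

def rap(X, pos_idx, neg_idx, r, k=5, sigma=1.0, gamma=0.1):
    """Learn the RAP projection matrix A (d, r); columns of X are samples."""
    # 1. k-NN graph and S = X L X^T
    _, L = build_knn_graph(X, k=k, sigma=sigma)
    S = X @ L @ X.T
    # 2. PCA initialization
    V = whitened_pca(X, r)
    # 3. QR factorization of the labeled data (relevant columns first, then irrelevant)
    labeled = list(pos_idx) + list(neg_idx)
    Q1, Q2, R = qr_split(X[:, labeled])
    # 4.-5. one projection vector per PCA direction
    A = np.zeros((X.shape[0], r))
    for j in range(r):
        y = ideal_targets(V[:, j], X, pos_idx, neg_idx)    # targets of eq. (3)
        b1 = particular_solution(R, y)                     # satisfies the constraints
        b2 = free_solution(Q1, Q2, S, V[:, j], b1, gamma)  # regularized free part
        A[:, j] = Q1 @ b1 + Q2 @ b2
    return A

# Projecting a sample x into the learned subspace: z = A.T @ x
```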
Experimental Setup
- Corel image database: 10,000 images, 100 images per category.
- Features: two types of color features and two types of texture features, 91 dimensions in total.
- Five feedback iterations; the top-10 ranked images are labeled in each iteration.
- The average top-N precision over queries is used for performance evaluation.
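As a concrete illustration of this metric, the sketch below computes the top-N precision under the assumption that an image counts as relevant exactly when it belongs to the query's category; the function names and the simplified protocol are ours.

```python
import numpy as np

def top_n_precision(ranked_categories, query_category, N):
    """Fraction of the N highest-ranked images sharing the query's category."""
    top = np.asarray(ranked_categories[:N])
    return float(np.mean(top == query_category))

def mean_top_n_precision(all_rankings, query_categories, N):
    """Average top-N precision over a set of queries."""
    return float(np.mean([top_n_precision(rk, qc, N)
                          for rk, qc in zip(all_rankings, query_categories)]))
```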
Evaluation
[Figures: average top-N precision over the feedback iterations, comparing RAP with the baseline methods]
Conclusions
- We develop RAP to simultaneously address three fundamental issues in relevance feedback:
  - asymmetry between the relevant and irrelevant classes
  - small sample size (by incorporating unlabeled samples)
  - high dimensionality
- RAP learns a semantic subspace in which the relevant samples collapse to a single point while the irrelevant samples are pushed outward with a large margin.
- RAP can also be used to solve imbalanced semi-supervised learning problems with few labeled data.
- Experiments on Corel demonstrate that RAP achieves significantly higher precision than the state-of-the-art methods.
Thanks!
http://www.ee.columbia.edu/~wliu/