Contextual Image Search

Wenhao Lu, Jingdong Wang, Xian-Sheng Hua, Shengjin Wang, Shipeng Li
Tsinghua University, Beijing, P. R. China
Microsoft Research Asia, Beijing, P. R. China
MM 2011
Outline
System overview
Database construction
Contextual image search with text/image input
Experiment
Future Work

System overview

Text input
Image input
Database construction
1. Feature extraction (MSER)
MSER (maximally stable extremal regions) extracts stable regions from the image by considering the change in area with respect to the change in intensity threshold of a connected component defined by thresholding the image.
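
The stability idea behind MSER can be illustrated with a small sketch. It is not the detector used in the paper: it assumes a grayscale image given as a list of rows, a hand-picked seed pixel, and a hypothetical step `delta`, and it measures how much the area of the seed's connected component changes when the intensity threshold moves by `delta`. A small value means a stable region.

```python
from collections import deque

def component_area(img, seed, t):
    """Area of the 4-connected region of pixels with intensity <= t that contains seed."""
    rows, cols = len(img), len(img[0])
    if img[seed[0]][seed[1]] > t:
        return 0
    seen = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in seen and img[nr][nc] <= t:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return len(seen)

def stability(img, seed, t, delta):
    """Relative area change around threshold t; smaller means more stable."""
    area = component_area(img, seed, t)
    if area == 0:
        return float("inf")
    grown = component_area(img, seed, t + delta)
    shrunk = component_area(img, seed, t - delta)
    return (grown - shrunk) / area

# Toy image (an assumption for illustration): a dark 3x3 blob on a bright background.
img = [
    [200, 200, 200, 200, 200],
    [200, 10, 10, 10, 200],
    [200, 10, 10, 10, 200],
    [200, 10, 10, 10, 200],
    [200, 200, 200, 200, 200],
]
stable_value = stability(img, (2, 2), 100, 10)    # blob area does not change
unstable_value = stability(img, (2, 2), 190, 10)  # blob merges with the background
```

In the toy image the blob keeps its area over a wide range of thresholds, which is the property MSER looks for; near the background intensity the area jumps and the region is no longer stable.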
Database construction
2. SIFT descriptor
Contextual Image Search with Text Input
1. Context Capturing

Textual contexts: page title / document title, local context

Visual contexts: vision-based page segmentation algorithm (VIPS)
Vision-based page segmentation (VIPS)
Traditional segmentation: DOM tree
VIPS: DOM tree + visual information, using cues such as:
Tag cue: <HR>
Color cue: background color
Text cue
Size cue
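
How such cues can split a page into blocks is sketched below. This is a drastically simplified illustration, not the VIPS algorithm itself: it assumes the page has already been flattened into a list of node dictionaries with hypothetical `tag` and `bgcolor` keys, and it uses only two of the cues listed above (the `<HR>` tag cue and the background color cue).

```python
def segment_blocks(nodes):
    """Split a flat list of page nodes into blocks.

    A new block starts at an <HR> node (tag cue) or when the
    background color differs from the previous node (color cue).
    """
    blocks, current = [], []
    prev_color = None
    for node in nodes:
        if node["tag"].lower() == "hr":
            # Tag cue: a horizontal rule closes the current block.
            if current:
                blocks.append(current)
            current = []
            prev_color = None
            continue
        color = node.get("bgcolor")
        if current and color != prev_color:
            # Color cue: a background change closes the current block.
            blocks.append(current)
            current = []
        current.append(node)
        prev_color = color
    if current:
        blocks.append(current)
    return blocks

# Hypothetical page content, for illustration only.
nodes = [
    {"tag": "p", "bgcolor": "white", "text": "intro"},
    {"tag": "p", "bgcolor": "white", "text": "more intro"},
    {"tag": "hr"},
    {"tag": "img", "bgcolor": "white", "text": "photo"},
    {"tag": "p", "bgcolor": "gray", "text": "sidebar"},
]
blocks = segment_blocks(nodes)
```

Images that fall in the same block as the query text are natural candidates for the visual context; text and size cues would refine the split further.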
2. Contextual Query Augmentation


Goal: remove possible ambiguities
Augmented query = query + textual context
Candidate augmented queries: evaluate the relevance between the context and each augmented query (Okapi BM25)
Okapi BM25 is computed against the extended context (the context expanded using synonyms, stemming, and so on)
Parameters: k = 2.0, b = 0.75
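
A minimal sketch of this scoring step follows. It uses the standard Okapi BM25 formula with the parameters from the slide; the exact variant used in the paper is not recoverable from the slides, so the IDF form, the tokenized toy corpus, and the helper name `bm25_score` are assumptions. The extended context is treated as the document and each candidate augmented query as the query.

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, corpus, k=2.0, b=0.75):
    """Standard Okapi BM25 score of doc_terms for query_terms.

    corpus is a non-empty list of token lists, used only for
    document frequencies and the average document length.
    """
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    tf = Counter(doc_terms)
    score = 0.0
    for term in set(query_terms):
        df = sum(1 for d in corpus if term in d)
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        f = tf[term]
        norm = f + k * (1 - b + b * len(doc_terms) / avg_len)
        score += idf * f * (k + 1) / norm
    return score

# Hypothetical background collection and context, for illustration only.
corpus = [
    ["apple", "iphone", "launch", "event"],
    ["apple", "pie", "recipe", "fruit"],
    ["orange", "fruit", "juice"],
]
context = ["new", "iphone", "from", "apple", "launch"]
candidates = ["apple iphone", "apple fruit"]

# Keep the augmented query that is most relevant to the context.
best = max(candidates, key=lambda q: bm25_score(q.split(), context, corpus))
```

Here the ambiguous query "apple" is resolved toward the phone sense because "iphone" also occurs in the context.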
3. Image Search by Text
Rank score: includes a static score (e.g., one derived from the Web page holding the image)
Contextual Reranking

Textually contextual reranking
Uses the textual context with the words related to the augmented query discarded
Visually contextual reranking
1. Filter out images whose semantic content may not be relevant to the query (computed from the local textual context and the query).
2. Visual word weight: find the common pattern among the visual contexts
3. Compute similarity between the visual contexts and an image from their visual-word histogram vectors
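
The formulas for steps 2 and 3 did not survive in the slides, so the following is only one plausible reading, under stated assumptions: each visual word is weighted by the fraction of visual-context images that contain it (a stand-in for "common pattern"), and the similarity is the average weighted histogram intersection between an image and the visual contexts. The function names are hypothetical.

```python
def visual_word_weights(context_hists):
    """Weight each visual word by the fraction of context images containing it."""
    n = len(context_hists)
    size = len(context_hists[0])
    return [sum(1 for h in context_hists if h[w] > 0) / n for w in range(size)]

def weighted_similarity(image_hist, context_hists, weights):
    """Average weighted histogram intersection with the visual contexts."""
    total = 0.0
    for h in context_hists:
        total += sum(w * min(a, b) for w, a, b in zip(weights, image_hist, h))
    return total / len(context_hists)

# Toy histograms over a 3-word visual vocabulary (illustrative values).
contexts = [[2, 0, 1], [1, 1, 0]]
weights = visual_word_weights(contexts)
sim_a = weighted_similarity([1, 0, 0], contexts, weights)  # shares the common word
sim_b = weighted_similarity([0, 2, 0], contexts, weights)  # shares only a rare word
```

An image that contains the visual word shared by all contexts scores higher than one that matches only a word seen in a single context.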
17
Overall Ranking
Combination weights: 0.2, 0.2, and 1
Contextual Image Search with Image Input
1. Search to annotation

Discovers the candidate textual queries using the technique from “Annotating images by mining image search results” (IEEE 2008)
First: find similar images
Second: mine the surrounding texts of the retrieved duplicate images to get a list of candidate textual queries

visual features
semantic features
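
The second step can be sketched as simple term mining. This is a simplified stand-in, not the cited annotation technique: it assumes the similar images have already been found, takes their surrounding texts as plain strings, and ranks terms by the number of texts they appear in. The stopword list and the function name are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the source).
STOPWORDS = {"the", "a", "an", "of", "by", "in", "and", "is", "at"}

def candidate_queries(surrounding_texts, top_n=3):
    """Return the terms that occur in the most surrounding texts."""
    counts = Counter()
    for text in surrounding_texts:
        tokens = set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS
        counts.update(tokens)  # count each term once per text
    return [term for term, _ in counts.most_common(top_n)]

# Hypothetical surrounding texts of the retrieved duplicate images.
texts = [
    "Starry Night by Vincent van Gogh",
    "Van Gogh painting at the museum",
    "Gogh self portrait",
]
top_terms = candidate_queries(texts, top_n=2)
```

Terms shared by many duplicates, such as the painter's name here, rise to the top and become candidate textual queries.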
2. Contextual query identification

Calculate the relevance of each candidate textual query to the extended context
Experiment

15,000,000 images and associated web pages

5 users judge relevance on four levels (level 0 to level 3)
Figure: nDCG curves
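
For reference, nDCG over graded relevance labels such as levels 0 to 3 can be computed as below. The slides do not state which nDCG variant was used, so the exponential gain `2**rel - 1` and the base-2 logarithmic discount are assumptions (they are the common choice).

```python
import math

def dcg(relevances):
    """Discounted cumulative gain with exponential gain and log2 discount."""
    return sum((2 ** rel - 1) / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances, k=None):
    """nDCG of a ranked list of relevance levels, optionally cut at depth k."""
    ranked = relevances[:k] if k else relevances
    ideal = sorted(relevances, reverse=True)[:len(ranked)]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0

perfect = ndcg([3, 2, 1, 0])  # already in ideal order
swapped = ndcg([0, 3])        # best image ranked second
```

A perfectly ordered list scores 1, and pushing a relevant image down the list lowers the score.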
Visual Result for Text Input
Visual Result for Text Input (Textual Reranking)
Visual Result for Text Input (Visual Reranking)
Visual Result for Image Input
Textual query: “Van Gogh”
Future Work
1. More general contextual image search, including
mobile image search with wider contexts (e.g.,
position, time, and history)
2. Extend contextual image search to contextual
video search by applying the proposed methodology
and investigating extra video contexts