Image Retrieval with Geometry-Preserving Visual Phrases
Yimeng Zhang, Zhaoyin Jia and Tsuhan Chen, Cornell University

Similar Image Retrieval
Given a query image, search the image database and return a ranked list of relevant images.

Bag-of-Visual-Words (BoW)
Images are represented as histograms of visual words; the histogram length is the dictionary size.
Similarity of two images: cosine similarity of their histograms.

Geometry-Preserving Visual Phrases
Length-k phrase: k words in a certain spatial layout.
Bag of Phrases (BoP): a histogram over phrases (illustrated with length-2 phrases).

Phrases vs. Words
Figure: matches to an irrelevant image and to a relevant image using single words, length-2 phrases, and length-3 phrases.

Previous Works
Geometry verification: post-processing applied only to the top-ranked images returned by the BoW searching step.
Modeling relationships between words: the phrase dimension is exponential in the number of words per phrase, so previous works reduce the number of phrases.
Co-occurrences over the entire image [L. Torresani et al., CVPR 2009]: no spatial information.
Phrases within local neighborhoods [J. Yuan et al., CVPR 2007][Z. Wu et al., CVPR 2010][C. L. Zitnick, Tech. Report 2007]: no long-range interactions, weak geometry.
Selecting a subset of phrases [J. Yuan et al., CVPR 2007]: discards a large portion of the phrases.
Our work: all phrases, with linear computation time.

Approach Overview
1. Similarity measure: BoW, or Bag of Phrases as proposed in [Zhang and Chen, 09].
2. Large-scale retrieval: inverted files and min-hash exist for BoW; this paper develops inverted files and min-hash for BoP.

Co-occurring Phrases
Only the translation difference between word locations is considered [Zhang and Chen, 09].
Figure: two images with matched words (A, B, C, D, E, F); words that keep the same spatial layout form co-occurring phrases.

Co-occurring Phrase Algorithm [Zhang and Chen, 09]
For every pair of corresponding words, compute the location offset (x - x', y - y') between the two images and vote it into a quantized offset space.
Corresponding pairs that fall into the same offset bin share the same spatial layout, so the number of co-occurring phrases is obtained by counting the groups of pairs within each bin (5 co-occurring length-2 phrases in the illustrated example).
Figure: matched words in two images and the resulting offset space.

Relation with the Feature Vector
The number of co-occurring length-k phrases equals the inner product of the two bag-of-phrases feature vectors, <φ_k(x), φ_k(y)>.
Although the feature dimension is exponential in k, this inner product can be computed in O(M) time, the same as BoW, where M is the number of corresponding word pairs (in practice linear in the number of local features).

Inverted Index with BoW
Avoids comparing the query with every image: each word stores a posting list of image IDs, and each posting increments that image's entry in a score table.
Figure: inverted index postings and the score table over images I1 … In.

Inverted Index with Word Location
Store the word location together with each image ID in the posting lists.
Assuming the same word occurs at most once in an image, this uses the same memory as BoW.
To compute the number of co-occurring phrases, the score table builds an offset space for each candidate image.
Figure: BoW score table vs. BoP score table.

Inverted Files with Phrases
Each query word's postings vote into the offset space maintained for every candidate image; the final similarity score of an image is computed from its offset space.
Figure: inverted index, per-image offset spaces, and the final similarity scores.

Overview
Both BoW and BoP can be combined with inverted files or with min-hash; min-hash has lower storage and time complexity than inverted files.

Min-hash with BoW
The probability that a min-hash function collides (returns the same word) for two images equals the image similarity.

Min-hash with Phrases
Use the probability that k min-hash functions collide with geometrically consistent offsets (details are in the paper).
Figure: offset space built from the colliding min-hash words.

Other Invariances
Handled by adding dimensions to the offset space, e.g., the log of the scale ratio for scale invariance, at the cost of increased memory usage [Zhang and Chen, 10].
Figure: matched features with locations and scales (s, s') in images I and I'.

Variant: Matching
Local histogram matching.
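The offset-space counting in the Co-occurring Phrase Algorithm section is easy to prototype. Below is a minimal Python sketch (not the authors' code; the (word_id, x, y) feature format, the bin size, and the function name are illustrative assumptions) that counts co-occurring length-k phrases between two images by voting word-pair offsets into a quantized offset space and summing the per-bin combinations; with k = 1 it reduces to the plain BoW match count.

```python
from collections import defaultdict
from math import comb

def count_cooccurring_phrases(img_a, img_b, k=2, bin_size=1.0):
    """Count co-occurring length-k phrases between two images.

    Each image is a list of (word_id, x, y) local features; bin_size
    controls how coarsely the offset space is quantized.
    """
    # Index the words of image B so corresponding pairs are found quickly.
    words_b = defaultdict(list)
    for word, x, y in img_b:
        words_b[word].append((x, y))

    # Vote every corresponding word pair into the quantized offset space.
    offset_bins = defaultdict(int)
    for word, xa, ya in img_a:
        for xb, yb in words_b[word]:
            b = (round((xa - xb) / bin_size), round((ya - yb) / bin_size))
            offset_bins[b] += 1

    # Any k corresponding pairs sharing an offset bin form one co-occurring
    # length-k phrase, so a bin holding m pairs contributes C(m, k).
    return sum(comb(m, k) for m in offset_bins.values())

# Toy example: words A and B translate consistently; C and D have no match.
img1 = [("A", 0, 0), ("B", 2, 1), ("C", 5, 5)]
img2 = [("A", 1, 1), ("B", 3, 2), ("D", 0, 4)]
print(count_cooccurring_phrases(img1, img2, k=2))  # -> 1
```

The computation touches each corresponding pair once, which is the O(M) cost stated in the Relation with the Feature Vector section.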
Evaluation
1. BoW + inverted index vs. BoP + inverted index.
2. BoW + min-hash vs. BoP + min-hash.
Post-processing methods (geometry verification) are complementary to our work.

Experiments: Inverted Index
Oxford 5K dataset (55 queries) [Philbin et al., CVPR 2007], plus 1M Flickr distractor images.
Figure: example precision-recall curves for BoW and BoP; BoP gives higher precision at lower recall.

Comparison
Mean average precision (mAP): the mean of the AP over the 55 queries.
Figure: mAP versus vocabulary size (up to 1M words) for BoW, BoW+RANSAC, BoP, and BoP+RANSAC.
BoP outperforms BoW at similar computational cost.
BoP outperforms BoW+RANSAC, which is 10 times slower with RANSAC applied to the 150 top-ranked images.
The improvement is larger for smaller vocabulary sizes.

+ Flickr 1M Dataset
Figure: mAP of BoW and BoP as Flickr distractor images are added to the database.

Computational Complexity
BoW: 8.1G memory, 0.137s quantization, 0.89s search.
BoP: 8.5G memory, 0.215s quantization, 4.137s search.
BoW+RANSAC: 0.89s search plus RANSAC (4s on the top 300 images).

Experiments: Min-hash
University of Kentucky dataset; the BoW min-hash baseline follows [O. Chum et al., BMVC 2008].
Figure: retrieval score versus the number of min-hash functions (200, 500, 800) for BoW and BoP.

Conclusion
Encodes more spatial information into the BoW representation.
Can be applied to all images in the database at the searching step.
Same computational complexity as BoW.
Better retrieval precision than BoW+RANSAC.
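The conclusion's claim that BoP can be applied to every database image at the searching step corresponds to the Inverted Files with Phrases scheme. The sketch below (a minimal illustration, not the paper's implementation; the class name, bin size, and storage layout are assumptions) keeps word locations in the posting lists and lets each query word vote into an offset space per candidate image, from which the final scores are read off.

```python
from collections import defaultdict
from math import comb

class PhraseInvertedIndex:
    """Toy inverted index whose postings store word locations."""

    def __init__(self, k=2, bin_size=8.0):
        self.k = k
        self.bin_size = bin_size
        self.postings = defaultdict(list)  # word_id -> [(image_id, x, y), ...]

    def add_image(self, image_id, features):
        # features: list of (word_id, x, y); a word is assumed to occur at
        # most once per image, so memory matches a BoW index plus locations.
        for word, x, y in features:
            self.postings[word].append((image_id, x, y))

    def search(self, query_features):
        # One offset space per candidate image, filled by posting-list votes.
        offset_spaces = defaultdict(lambda: defaultdict(int))
        for word, qx, qy in query_features:
            for image_id, x, y in self.postings.get(word, []):
                b = (round((qx - x) / self.bin_size), round((qy - y) / self.bin_size))
                offset_spaces[image_id][b] += 1
        # Score each image by its number of co-occurring length-k phrases.
        scores = {img: sum(comb(m, self.k) for m in bins.values())
                  for img, bins in offset_spaces.items()}
        return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy usage: I1 shares two consistently placed words with the query.
index = PhraseInvertedIndex(k=2, bin_size=8.0)
index.add_image("I1", [("A", 10, 10), ("B", 30, 12)])
index.add_image("I2", [("A", 50, 40), ("C", 5, 5)])
print(index.search([("A", 12, 14), ("B", 32, 16)]))  # I1 ranks first
```

With k = 1 the same structure degenerates to ordinary BoW voting, which is why the memory usage and per-posting work stay close to those of the BoW inverted index.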