Bag of Features Approach: recent work, using geometric information

Problem
• Search for object occurrences in a very large image collection

2 sub-problems
• Object category recognition and specific object recognition

Motivation
• Look for product information
• Look for similar products

Related work on large-scale image search
• Most systems build upon the BoF framework [Sivic & Zisserman 03]
– Large (hierarchical) vocabularies [Nistér & Stewénius 06]
– Improved descriptor representation [Jégou et al. 08, Philbin et al. 08]
– Geometry used in index [Jégou et al. 08, Perd'och et al. 09]
– Query expansion [Chum et al. 07]
– …
• Efficiency improved by:
– Min-hash and geometric min-hash [Chum et al. 07-09]
– Compressing the BoF representation [Jégou et al. 09]

Local Features - SIFT

Creating a visual vocabulary

Inverted Index

Index construction

Searching

Use geometry
• Possible directions:
– Change/optimize the spatial verification stage
– Insert new geometric information into the index
  • Ordered BOF
  • Bundled features
  • Visual phrases
– Change the searching algorithm

Survey for today
• Spatial Bag-of-features [Cao, CVPR2010]
• Image Retrieval with Geometry-Preserving Visual Phrases [Zhang, Jia & Chen, CVPR2011]
• Smooth Object Retrieval using a Bag of Boundaries [Arandjelović & Zisserman, ICCV2011]

Spatial BOF
• Basic idea: [figure]

Spatial BOF
• Constructing linear and circular ordered bag-of-features: [figure]

Spatial BOF
• Translation invariance: [figure]

Spatial BOF
• Pros:
– Gets better performance than BOF+RANSAC for a large-scale dataset*
– Same format as standard BOF
• Cons:
– Dataset dependent, because training is required
  • No results are presented for a large-scale dataset with transfer learning from another dataset
• Future work
– Check it with cross-dataset training on a large dataset; otherwise it is not worth pursuing further.
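
Spatial BOF (sketch)
The linear and circular ordered bag-of-features mentioned above can be illustrated roughly as follows. This is a minimal sketch under assumptions that are not taken from the slides: features are given as (x, y, word_id) tuples, the function names and parameters (`theta`, `num_bins`, `num_sectors`, `center`) are hypothetical, and the training/selection step of the actual method is left out.

```python
import math


def linear_ordered_bof(features, vocab_size, num_bins, width, height, theta=0.0):
    """Concatenate one visual-word histogram per bin along a projection line.

    features: iterable of (x, y, word_id); theta: direction of the line (radians).
    Illustrative only: bin edges are taken from the projected image corners.
    """
    c, s = math.cos(theta), math.sin(theta)
    # Project the four image corners to find the range covered by the bins.
    corners = [(0, 0), (width, 0), (0, height), (width, height)]
    proj_corners = [x * c + y * s for x, y in corners]
    lo, hi = min(proj_corners), max(proj_corners)
    span = (hi - lo) or 1.0

    hist = [0] * (num_bins * vocab_size)
    for x, y, word in features:
        p = x * c + y * s
        b = min(int((p - lo) / span * num_bins), num_bins - 1)
        hist[b * vocab_size + word] += 1
    return hist


def circular_ordered_bof(features, vocab_size, num_sectors, center):
    """Concatenate one visual-word histogram per angular sector around `center`."""
    cx, cy = center
    hist = [0] * (num_sectors * vocab_size)
    for x, y, word in features:
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        sec = min(int(angle / (2 * math.pi) * num_sectors), num_sectors - 1)
        hist[sec * vocab_size + word] += 1
    return hist


# Toy example: three features in a 100x100 image, vocabulary of 2 words.
feats = [(10, 10, 0), (90, 10, 1), (90, 90, 1)]
linear = linear_ordered_bof(feats, vocab_size=2, num_bins=2, width=100, height=100)
circular = circular_ordered_bof(feats, vocab_size=2, num_sectors=4, center=(50, 50))
```

Both outputs are plain histograms of length (number of bins) x (vocabulary size), which is why the representation keeps the same format as standard BOF and can go into an inverted index unchanged.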

Geometry-Preserving Visual Phrases
• Basic idea: [figure]

Geometry-Preserving Visual Phrases
• Representation
– Quantize image to a 10x10 grid
– Histogram of GVPs of length k
– GVP dictionary size is “choose k from N visual words”

Geometry-Preserving Visual Phrases
• Pros:
– Outperforms BOV + RANSAC
• Cons:
– Only translation invariant, because of memory constraints
• Future work

BOF for smooth objects
• Idea: [figure labels: Segment, Query object, Gradient, The information used for retrieval]

BOF for smooth objects
• Results: [figure]

BOF for smooth objects
Segmentation phase
• Over-segmentation with super-pixels
• Classification of super-pixels:
– 3208-dimensional feature vector (median(Mag(Grad)), 4 bits, color histogram, BOF)
– SVM
• Post-processing

BOF for smooth objects
Boundary description phase:
• Sample points on the boundary
• Calculate HoG at each point at 3 scales
– 340-dimensional L2-normalized vector
* The descriptor is not rotation invariant

BOF for smooth objects
Retrieval procedure:
• Boundary descriptors are quantized (k=10k)
• Standard BOF scheme*
• Spatial verification for the top 200 with a loose affine homography (errors up to 100 pixels)
* No spatial information is recorded in the histogram

BOF for smooth objects
• Pros:
– Solves the smooth object retrieval problem
– Fast
• Cons:
– Dataset dependent, because training is required
– Limited to objects with “solid” materials
– Segmentation has to catch the object’s boundary
• Future work
– Eliminate the training step

Summary
• There is active research in the field of CBIR on exploiting geometric information.
• Each method has its limitations
• Still no widely accepted solution
– Nothing yet as established as spatial verification with RANSAC
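
Backup: Geometry-Preserving Visual Phrases (sketch)
To make the GVP representation from the earlier slides concrete, here is a rough sketch of counting length-k phrases shared by two images. It assumes, beyond what the slides state, that matching visual words vote for their quantized grid offset and that m votes at the same offset contribute "choose k from m" shared phrases. The helper names (`grid_cell`, `count_shared_gvps`) and the (x, y, word_id) feature format are hypothetical.

```python
from collections import Counter
from math import comb


def grid_cell(x, y, width, height, grid=10):
    """Quantize a pixel location to a cell of a grid x grid layout."""
    gx = min(int(x / width * grid), grid - 1)
    gy = min(int(y / height * grid), grid - 1)
    return gx, gy


def count_shared_gvps(feats_a, feats_b, size_a, size_b, k, grid=10):
    """Count co-occurring length-k phrases between two images (translation only).

    feats_*: iterable of (x, y, word_id); size_*: (width, height).
    """
    # Index the grid cells of image B by visual word.
    cells_b = {}
    for x, y, word in feats_b:
        cells_b.setdefault(word, []).append(grid_cell(x, y, *size_b, grid))

    # Every pair of matching words votes for its cell offset.
    votes = Counter()
    for x, y, word in feats_a:
        ax, ay = grid_cell(x, y, *size_a, grid)
        for bx, by in cells_b.get(word, []):
            votes[(ax - bx, ay - by)] += 1

    # m matches with the same offset contain C(m, k) phrases of length k.
    return sum(comb(m, k) for m in votes.values())


# Toy example: image B shows the same three words shifted 30 pixels to the right.
image_a = [(5, 5, 0), (15, 5, 1), (25, 25, 2)]
image_b = [(35, 5, 0), (45, 5, 1), (55, 25, 2)]
shared_pairs = count_shared_gvps(image_a, image_b, (100, 100), (100, 100), k=2)
shared_triples = count_shared_gvps(image_a, image_b, (100, 100), (100, 100), k=3)
```

Because only offsets are voted on, the sketch handles translation only, in line with the limitation listed under Cons.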