Global and Efficient Self-Similarity for Object Classification and Detection
Thomas Deselaers and Vittorio Ferrari
CALVIN group, Computer Vision Laboratory, ETH Zurich, Switzerland
CVPR 2010

Conventional Image Descriptors
Measure direct image properties (gradients, colors)

Self-Similarity vs Conventional Descriptors
Assumption of conventional image descriptors:
• There is a direct visual property shared by images of objects of the same class (e.g. colors, gradients, …).
• This property can be used to compare images.
Self-similarity:
• Indirect property: geometric layout of repeating patches within an image
• More general property [Shechtman, Irani CVPR 07]

Local Self-Similarity Descriptors
[Shechtman, Irani CVPR 07]

Using Local Self-Similarity Descriptors
Applications: object recognition, image retrieval, action recognition
• Ensemble matching [Shechtman CVPR 07]
• Nearest neighbor matching [Boiman CVPR 08]
• Bag of local self-similarities [Gehler ICCV09, Vedaldi ICCV09, Hörster ACMM08, Lampert CVPR09, Chatfield ICCV09 WS]
  1. Compute LSS descriptors for an image
  2. Assign the LSS descriptors to a codebook
  3. Represent the image as a histogram of LSS descriptors

Self-Similarity goes Global
Capture long-range self-similarities and their spatial arrangement

Global Self-Similarity Tensor
• Compute self-similarity between all pairs of pixels
• Result: 4D self-similarity tensor
Note: local self-similarities are included

Problems with the GSS Tensor
[Figure labels: 11, 11, 300, 500]
• Computation time: ∼20h
• Memory requirement: ∼80GB
Aim: reduce both

Outline
• Efficient global self-similarity tensor
• Global self-similarity descriptors
  – Bag of correlation surfaces
  – Self-similarity hypercubes
• Detection with self-similarity hypercubes
  – Efficient sliding window
  – Efficient subwindow search
• Experiments
  – Global self-similarity better than local self-similarity
  – Complementary to conventional descriptors
  – Object detection possible

Efficient Global Self-Similarity Tensor
Find an efficient approximation:
• Quantize patches according to a codebook
• If two patches are assigned to the same prototype, they are similar
• Reduces runtime to … (speedup: 750)

Efficient Global Self-Similarity
• Two patches are only similar if they are assigned to the same prototype
• Reduces memory to … (reduction: …)

Patch Prototype Codebooks
Remember: self-similarity encodes image content indirectly
• Image-specific codebooks can be smaller than conventional ones
• See the paper for more generic codebooks and an extensive evaluation

Global Self-Similarity Descriptors
So far:
• Compact GSS computed efficiently
Now:
• Descriptors that can be used in machine learning classifiers
• Fixed dimensionality
• Compact representation
• Self-similarity hypercubes: now
• Bag of correlation surfaces: only in the paper

Self-Similarity Hypercubes
SSH of size …

SSHs for Detection
• Computing an SSH naïvely requires … operations
• Sliding window has to evaluate many windows: … operations

Efficient Computation of SSHs
Compute the integral self-similarity tensor: … can be obtained using 16 lookups in …
• 160000 operations to compute the SSH for an image window
• ∼5000x speedup

Efficient Subwindow Search for SSH
• Derive an upper bound on the score of a set of windows
• Section 5.2 in our paper
• Similar to [Lampert PAMI09]

Experiments: Object classification
PASCAL 07 objects
– 9608 cropped images of objects from PASCAL 07
– 20 classes
Task: Classify each test image into one of 20 classes
Model: Linear SVM
Train: train+val
Test: test

Classification on the PASCAL 07 objects set
[Bar chart: classification accuracy [%]]
+ GSS outperforms LSS
+ Self-similarity is truly complementary to conventional descriptors

Experiments: Object detection
ETHZ Shape Classes
– 255 images
– 5 classes (apple logos, bottles, giraffes, mugs, swans)
Task: Detect objects in images
Detector: Linear SVM, sliding windows
e.g. [Ferrari CVPR07, Maji CVPR09]

Detection Results
[Plots: detection rate (DR) vs. FPPI at 0.5 PASCAL overlap, one panel each for apple logos, bottles, giraffes, mugs, swans; curves for SSH and BoLSS]

DR at FPPI 0.4:
Class         BoLSS   SSH
apple logos   10.0    80.0
bottles       10.7    96.4
giraffes      23.4    85.1
mugs           6.5    67.7
swans         17.6    70.6
Average       13.6    80.0

Comparison results (avg):
[Ferrari CVPR07]: 71.9
[Maji CVPR09]: 93.2
… many more
+ SSH outperforms BoLSS
+ It is possible to use GSS for detection with good results

Runtimes for Computing Descriptors
• 200x200 image
• GSS tensor
  – directly: 5512s (∼1.5 hours)
  – using our method: 81s (∼1.5 minutes)
• Computing descriptors: few seconds
• Our method: 70x speedup
• For reference:
  – GIST: 0.4s
  – BoLSS: 0.7s

Runtimes for Detection
Given the prototype assignment map (80s, once only):
• SSH sliding window: 30s/img (once per class)
For comparison:
– Computing the direct GSS tensor for 25000 windows: 4 years/img
Speedup: ∼1 million
⇒ Using our methods, GSS can be used for object detection
For reference:
– Felzenszwalb PAMI 09: 5s
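
Sketch: integral tensor and 16 lookups
The sliding-window runtime above relies on the integral self-similarity tensor, where a sum over any 4D box needs 2^4 = 16 lookups (one per corner, combined by inclusion-exclusion). The following is a minimal NumPy sketch of that idea, not the authors' code. It assumes a toy binary self-similarity (1 if two pixels share a prototype id, 0 otherwise), a random prototype assignment map, and hypothetical function names (binary_gss_tensor, integral_tensor, box_sum).

```python
import itertools
import numpy as np

def binary_gss_tensor(assign):
    """Toy GSS tensor (assumption): S[y1, x1, y2, x2] = 1 if both pixels
    were assigned to the same prototype, else 0."""
    return (assign[:, :, None, None] == assign[None, None, :, :]).astype(np.int64)

def integral_tensor(s):
    """4D summed-area tensor, zero-padded by one on the low side of each axis,
    so that integral[a, b, c, d] == s[:a, :b, :c, :d].sum()."""
    padded = np.pad(s, [(1, 0)] * 4)
    for axis in range(4):
        padded = np.cumsum(padded, axis=axis)
    return padded

def box_sum(integral, lo, hi):
    """Sum of s over the half-open 4D box [lo, hi) using 16 corner lookups."""
    total = 0
    for corner in itertools.product((0, 1), repeat=4):
        # Pick the high or low bound along each of the four axes.
        idx = tuple(hi[d] if c else lo[d] for d, c in enumerate(corner))
        # Inclusion-exclusion: the sign flips once per low bound used.
        sign = (-1) ** (4 - sum(corner))
        total += sign * integral[idx]
    return int(total)

rng = np.random.default_rng(0)
assign = rng.integers(0, 3, size=(6, 7))   # toy prototype assignment map
s = binary_gss_tensor(assign)
integral = integral_tensor(s)

lo, hi = (1, 2, 0, 3), (4, 6, 5, 7)        # (y1, x1, y2, x2) bounds
fast = box_sum(integral, lo, hi)           # 16 lookups
slow = int(s[1:4, 2:6, 0:5, 3:7].sum())    # direct summation
print(fast == slow)
```

Each such box sum corresponds to one pair of regions inside a window, which is why the cost per lookup stays constant regardless of the region size. The dense tensor here is only practical for toy sizes; it ignores the memory reduction described earlier in the talk.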

Conclusion
• Self-similarity should be considered globally
  – Global self-similarity performs better than local self-similarity
• Truly complementary to conventional descriptors
• Global self-similarity is feasible
  – efficient computation of self-similarity
  – two descriptors based on self-similarity
• Global self-similarity for detection
• Code will be available soon

Thank you for your attention
Thomas Deselaers and Vittorio Ferrari
Global and Efficient Self-Similarity for Object Classification and Detection
Code will be available: http://www.vision.ee.ethz.ch/~calvin