Part 1: Classical Image Classification Methods

Kai Yu, Dept. of Media Analytics, NEC Laboratories America
Andrew Ng, Computer Science Dept., Stanford University

Outline of Part 2
• Local Features, Sampling, Visual Words
• Discriminative Methods
  - Bag-of-Words (BoW) representation
  - Spatial pyramid matching (SPM)
• Generative Methods
  - Part-based methods
  - Topic models
4/8/2015

Local features
• Distinctive descriptors of local image patches
• Invariant to local translation, scale, …
• and sometimes rotation or general affine transformations
• The most famous choice is the SIFT feature

Sampling local features from images
Figure: a set of points
Image credits: F-F. Li, E. Nowak, J. Sivic

Visual words
• Similar points are grouped into one visual word
• Algorithms: k-means, agglomerative clustering, …
• Points from different images are then more easily compared.
Slide credit: Kristen Grauman

Outline of Part 2
• Local Features, Sampling, Visual Words, …
• Discriminative Methods
  - Bag-of-Words (BoW) representation
  - Spatial pyramid matching (SPM)
• Generative Methods
  - Part-based methods
  - Topic models

Bag-of-words (BoW) representation
Analogy to documents
Adapted from tutorial slides by Fei-Fei et al.

BoW for object categorization
• Works pretty well for whole-image classification
Csurka et al. (2004), Willamowski et al. (2005), Grauman & Darrell (2005), Sivic et al.
(2003, 2005)
Slide credit: Svetlana Lazebnik

Unsupervised Dictionary Learning
Figure: local features from an image database shown in SIFT space, partitioned into regions R1, R2, R3
• Sample local features from images
• Run k-means or another clustering algorithm to get the dictionary
• The dictionary is also called a “codebook”

Compute BoW histogram for each image
• Assign SIFT features to clusters
• Compute the frequency of each cluster within an image
Figure: features assigned to regions R1, R2, R3 and the resulting BoW histogram representations

What the BoW histogram does
• Summarizes the entire image by its distribution of visual word occurrences
• Turns bags of different sizes into a fixed-length vector
• Analogous to the bag-of-words representation commonly used for text categorization

Image classification based on BoW histogram
Figure: BoW histogram vector space with a decision boundary separating “bird” from “dog”
• Learn a classification model to determine the decision boundary
• Nonlinear SVMs are commonly applied.

Issues
• Sampling strategy
• Learning the codebook: size? supervised? …
• Classification: which method? scalability?
• Scalability: how to handle millions of data points?
• How to use spatial information?

Spatial information
• BoW removes spatial layout.
• This increases the invariance to scale, translation, and deformation,
• but sacrifices discriminative power, especially when the spatial layout is important.
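
The dictionary-learning and histogram steps from the slides above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes descriptors are plain lists of floats (real SIFT descriptors are 128-dimensional), uses basic k-means with a fixed iteration count, and the function names `learn_codebook` and `bow_histogram` are hypothetical. It uses only the Python standard library; a real pipeline would more likely rely on a numerical library.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, codebook):
    """Index of the codebook entry (visual word) closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda j: squared_distance(descriptor, codebook[j]))


def learn_codebook(descriptors, k, iterations=20, seed=0):
    """Basic k-means over descriptors pooled from many images (assumed setup)."""
    rng = random.Random(seed)
    # Initialise the k centres with randomly chosen descriptors.
    codebook = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_word(d, codebook)].append(d)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centre
                codebook[j] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def bow_histogram(descriptors, codebook):
    """Normalised visual-word frequencies for the descriptors of one image."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Toy 2-D "descriptors" standing in for SIFT vectors (illustrative data only).
pool = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
codebook = learn_codebook(pool, k=2)
image = [[0.05, 0.05], [5.05, 5.05], [4.9, 5.0]]
print(bow_histogram(image, codebook))
```

Whatever the number of descriptors in an image, the histogram has one entry per visual word, which is what lets a standard classifier such as an SVM consume it.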
Slide adapted from Bill Freeman

Spatial pyramid matching
• Compute BoW for image regions at different locations and at various scales
Figure credit: Svetlana Lazebnik

A common pipeline for discriminative image classification using BoW
• Dictionary Learning: Dense/Sparse SIFT → K-means → dictionary
• Image Classification: Dense/Sparse SIFT → VQ Coding (using the dictionary) → Spatial Pyramid Pooling → Nonlinear SVM

Combining multiple descriptors
Multiple Feature Detectors → Multiple Descriptors: SIFT, shape, color, … → VQ Coding and Spatial Pooling → Nonlinear SVM
Diagram from SurreyUVA_SRKDA, winning team in PASCAL VOC 2008

Outline of Part 2
• Local Features, Sampling, Visual Words, …
• Discriminative Methods
  - Bag-of-Words (BoW) representation
  - Spatial pyramid matching (SPM)
• Generative Methods
  - Part-based methods
  - Topic models

Topic models for images
Figure: an image labeled “beach” next to the Latent Dirichlet Allocation (LDA) graphical model (node and plate labels c, D, z, N, w)
Fei-Fei et al. ICCV 2005
Slide credit: Fei-Fei Li

Part-based Model
Fischler & Elschlager 1973
Rob Fergus ICCV09 Tutorial

For comprehensive coverage of object categorization models, please visit
Recognizing and Learning Object Categories
Li Fei-Fei (Stanford), Rob Fergus (NYU), Antonio Torralba (MIT)
http://people.csail.mit.edu/torralba/shortCourseRLOC/
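
Returning to the spatial pyramid matching and pipeline slides above, the pooling step can be sketched as follows. This is a simplified illustration under stated assumptions: feature positions are normalised to [0, 1), each feature has already been assigned a visual-word index by VQ coding, and the per-level weights of the published SPM kernel are omitted. The function names are hypothetical. Histogram intersection is shown as one example of a nonlinear kernel that could be passed to an SVM; the slides do not say which kernel is used.

```python
def spatial_pyramid_histogram(points, words, vocab_size, levels=(1, 2, 4)):
    """Concatenate per-cell visual-word counts over grids of increasing resolution.

    points: (x, y) feature positions, assumed normalised to [0, 1).
    words:  visual-word index for each point (output of VQ coding).
    levels: number of grid cells per side at each pyramid level.
    """
    feature = []
    for cells in levels:
        grid = [[0] * vocab_size for _ in range(cells * cells)]
        for (x, y), w in zip(points, words):
            # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            grid[row * cells + col][w] += 1
        for cell in grid:
            feature.extend(cell)
    total = max(len(words), 1)  # avoid division by zero for an empty image
    return [v / total for v in feature]


def histogram_intersection(a, b):
    """Histogram intersection similarity between two pooled feature vectors."""
    return sum(min(x, y) for x, y in zip(a, b))


# Two toy images with the same visual words in different spatial layouts.
words = [0, 1, 0]
layout_a = [(0.1, 0.1), (0.9, 0.9), (0.6, 0.2)]
layout_b = [(0.9, 0.9), (0.1, 0.1), (0.6, 0.2)]
fa = spatial_pyramid_histogram(layout_a, words, vocab_size=2, levels=(1, 2))
fb = spatial_pyramid_histogram(layout_b, words, vocab_size=2, levels=(1, 2))
print(histogram_intersection(fa, fa), histogram_intersection(fa, fb))
```

The coarsest level (a single cell) is the ordinary BoW histogram, so the two toy images agree there; they differ only in the finer cells, which is how the pyramid recovers some of the spatial layout that plain BoW discards.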