State-of-the-art Image Classification Methods

Part 1: Classical Image Classification Methods
Kai Yu, Dept. of Media Analytics, NEC Laboratories America
Andrew Ng, Computer Science Dept., Stanford University
Outline of Part 1
• Local Features, Sampling, Visual Words
• Discriminative Methods
- Bag-of-Words (BoW) representation
- Spatial pyramid matching (SPM)
• Generative Methods
- Part-based methods
- Topic models
4/8/2015
Local features
• Distinctive descriptors of local image patches
• Invariant to local translation, scale, … and sometimes rotation or general affine transformations
• The most famous choice is the SIFT feature
Sampling local features from images
A set of points
Image credits: F-F. Li, E. Nowak, J. Sivic
Visual words
• Similar points are grouped into one visual word
• Algorithms: k-means, agglomerative clustering, …
• Points from different images are then more easily compared.
Slide credit: Kristen Grauman
Bag-of-words (BoW) representation
Analogy to documents
Adapted from tutorial slides by Fei-Fei et al.
BoW for object categorization
• Works pretty well for whole-image classification
Csurka et al. (2004), Willamowski et al. (2005), Grauman & Darrell (2005), Sivic et al. (2003, 2005)
Slide credit: Svetlana Lazebnik
Unsupervised Dictionary Learning
[Figure: local features sampled from an image database, plotted in SIFT space and grouped into regions R1, R2, R3]
• Sample local features from images
• Run k-means or another clustering algorithm to obtain the dictionary
• Dictionary is also called “codebook”
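The dictionary-learning steps above can be sketched with a plain k-means loop. This is a minimal illustration, not the authors' implementation: the toy 2-D descriptors stand in for real SIFT vectors, and the names `learn_codebook`, `nearest_center`, `iterations` and `seed` are hypothetical choices made for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_center(point, centers):
    """Index of the center closest to the given descriptor."""
    return min(range(len(centers)), key=lambda k: squared_distance(point, centers[k]))


def learn_codebook(descriptors, k, iterations=20, seed=0):
    """Plain k-means: returns k cluster centers (the codebook)."""
    rng = random.Random(seed)
    # Initialise the centers with k distinct sampled descriptors.
    centers = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_center(d, centers)].append(d)
        for idx, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                dim = len(members[0])
                centers[idx] = [sum(m[j] for m in members) / len(members)
                                for j in range(dim)]
    return centers


# Toy 2-D descriptors (assumption: real SIFT descriptors are 128-D).
descriptors = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1),
               (5.1, 5.0), (9.9, 0.0), (10.0, 0.1)]
codebook = learn_codebook(descriptors, k=3)
```

Each row of `codebook` is one visual word. Plain k-means can settle in a local optimum, so in practice the result depends on the initialisation.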
Compute BoW histogram for each image
Assign SIFT features to clusters (R1, R2, R3)
Compute the frequency of each cluster within an image
BoW histogram representations
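As a rough sketch of the two steps on this slide, each descriptor can be assigned to its nearest codebook entry and the assignments counted. The three codebook centers below play the role of R1, R2, R3; their values, the toy descriptors and the function name `bow_histogram` are assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bow_histogram(descriptors, codebook, normalize=True):
    """Count how often each visual word occurs among an image's descriptors."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Vector quantization: pick the closest codebook entry.
        word = min(range(len(codebook)),
                   key=lambda k: squared_distance(d, codebook[k]))
        counts[word] += 1
    if not normalize:
        return counts
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Hypothetical codebook with three visual words (R1, R2, R3).
codebook = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]

# Two images with different numbers of local features.
image_a = [(0.2, 0.1), (4.8, 5.3), (5.1, 4.9), (9.7, 0.2)]
image_b = [(0.1, 0.3), (0.0, 0.2)]

hist_a = bow_histogram(image_a, codebook, normalize=False)  # [1, 2, 1]
hist_b = bow_histogram(image_b, codebook, normalize=False)  # [2, 0, 0]
```

Both images end up as vectors of the same length (the codebook size), regardless of how many features were sampled from each.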
Interpretation of the BoW histogram
• Summarize the entire image based on its distribution of visual word occurrences
• Turn bags of different sizes into a fixed-length vector
• Analogous to the bag-of-words representation commonly used for text categorization.
Image classification based on BoW histogram
[Figure: BoW histogram vector space, with a decision boundary separating “bird” from “dog”]
• Learn a classification model to determine the decision boundary
• Nonlinear SVMs are commonly applied.
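The slide does not say which kernel makes the SVM nonlinear. One kernel often paired with histogram features is the histogram intersection kernel, sketched below as an assumption rather than as the method used in the slides. The histograms and the helper names `intersection_kernel` and `gram_matrix` are made up for this example.

```python
def intersection_kernel(h1, h2):
    """Histogram intersection: sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def gram_matrix(histograms, kernel):
    """Pairwise kernel values between all training histograms."""
    return [[kernel(a, b) for b in histograms] for a in histograms]


# Hypothetical normalized BoW histograms over a 3-word codebook.
bird_1 = [0.6, 0.3, 0.1]
bird_2 = [0.5, 0.4, 0.1]
dog_1 = [0.1, 0.2, 0.7]

K = gram_matrix([bird_1, bird_2, dog_1], intersection_kernel)
# K[0][1] (bird vs. bird) is larger than K[0][2] (bird vs. dog).
```

A matrix like `K` could then be passed to any SVM solver that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.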
Issues
• Sampling strategy
• Learning the codebook: size? supervised? …
• Classification: which method? scalability?
• Scalability: how to handle millions of data points?
• How to use spatial information?
Spatial information
• The BoW representation discards the spatial layout.
• This increases invariance to scale, translation, and deformation,
• but sacrifices discriminative power, especially when the spatial layout is important.
Slide adapted from Bill Freeman
Spatial pyramid matching
• Compute BoW histograms for image regions at different locations and at various scales
Figure credit: Svetlana Lazebnik
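The idea of computing BoW histograms per region at several scales can be sketched as follows. The sketch assumes each feature is already quantized to a visual word index and carries its (x, y) position. It concatenates unweighted histograms, whereas the spatial pyramid matching kernel also weights the levels; that step is omitted here. The function name and arguments are hypothetical.

```python
def spatial_pyramid(features, width, height, vocab_size, levels=2):
    """Concatenate per-cell BoW histograms over grids of 1x1, 2x2, 4x4, ...

    features: list of (x, y, word_index) tuples for one image.
    """
    pyramid = []
    for level in range(levels + 1):
        cells = 2 ** level  # number of cells per side at this level
        grid = [[0] * vocab_size for _ in range(cells * cells)]
        for x, y, word in features:
            # Clamp so that points on the right/bottom border stay in range.
            col = min(int(x * cells / width), cells - 1)
            row = min(int(y * cells / height), cells - 1)
            grid[row * cells + col][word] += 1
        for hist in grid:
            pyramid.extend(hist)
    return pyramid


# Four quantized features in a 100x100 image, vocabulary of 3 words.
features = [(10, 10, 0), (90, 10, 1), (10, 90, 1), (90, 90, 2)]
vec = spatial_pyramid(features, width=100, height=100, vocab_size=3, levels=1)
# Level 0 contributes 3 bins, level 1 contributes 4 cells x 3 bins.
```

The first three entries of `vec` are the ordinary whole-image BoW histogram; the remaining entries record which words occurred in which quadrant.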
A common pipeline for discriminative image classification using BoW
Dictionary Learning: Dense/Sparse SIFT → K-means → dictionary
Image Classification: Dense/Sparse SIFT → VQ Coding → Spatial Pyramid Pooling → Nonlinear SVM (VQ Coding uses the learned dictionary)
Combining multiple descriptors
Multiple Feature Detectors → Multiple Descriptors (SIFT, shape, color, …) → VQ Coding and Spatial Pooling → Nonlinear SVM
Diagram from SurreyUVA_SRKDA, the winning team in PASCAL VOC 2008
Topic models for images
Latent Dirichlet Allocation (LDA)
[Figure: plate diagram labelled “beach”, with nodes c, z, w and plates D and N]
Fei-Fei et al. ICCV 2005
Slide credit Fei-Fei Li
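To make the generative reading of the diagram concrete, here is a minimal sketch of an LDA-style sampling process for the N visual words w of one image, each drawn via a latent topic z. Everything numeric is invented for illustration. Letting the category c pick the Dirichlet prior is one simple way to involve c and is an assumption of this sketch, not a statement of the cited model. The names `alpha_by_category` and `topic_word` are hypothetical.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet distribution via normalized gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Draw an index with probability proportional to probs."""
    r = rng.random()
    cumulative = 0.0
    for index, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return index
    return len(probs) - 1  # guard against rounding error


def generate_image(category, alpha_by_category, topic_word, n_patches, rng):
    """Sample topic proportions, then a topic z and a word w per patch."""
    theta = sample_dirichlet(alpha_by_category[category], rng)
    topics, words = [], []
    for _ in range(n_patches):
        z = sample_categorical(theta, rng)          # latent topic
        w = sample_categorical(topic_word[z], rng)  # observed visual word
        topics.append(z)
        words.append(w)
    return theta, topics, words


# Assumed toy parameters: 2 topics, vocabulary of 4 visual words.
alpha_by_category = {"beach": [2.0, 0.5]}
topic_word = [[0.7, 0.2, 0.05, 0.05],
              [0.05, 0.05, 0.2, 0.7]]

rng = random.Random(0)
theta, topics, words = generate_image("beach", alpha_by_category,
                                      topic_word, n_patches=10, rng=rng)
```

Learning runs in the opposite direction: given only the observed words of many images, inference estimates the topic-word distributions and the per-image topic proportions.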
Part-based Model
Rob Fergus ICCV09 Tutorial
Fischler & Elschlager 1973
For comprehensive coverage of object categorization models, please visit
Recognizing and Learning Object Categories
Li Fei-Fei (Stanford), Rob Fergus (NYU), Antonio Torralba (MIT)
http://people.csail.mit.edu/torralba/shortCourseRLOC/