DESCRIPTORS (DESCRIPTION OF INTEREST REGIONS WITH LOCAL BINARY PATTERNS)
Yu-Lin Cheng (03/07/2011)

OUTLINE
  - Scale Invariant Feature Transform (SIFT) Descriptor
  - Local Binary Pattern (LBP) Descriptor
  - Center-Symmetric LBP (CS-LBP) Descriptor
  - Histogram of Oriented Gradients (HOG) Descriptor

SIFT (SCALE INVARIANT FEATURE TRANSFORM)
SIFT Algorithm: descriptor

Scale-space Extrema Detection:
  - Stable feature points (scale invariant)
  - Principle: a local maximum over scales of a combination of normalized derivatives can be treated as a characteristic point of the local structure
  - Use LoG to find the maximum over scale [Figure: bad scale vs. good scale]
  - Use DoG instead of LoG (computational efficiency)
  - Local extrema detection: compare each sample to its 26 neighbors
  - Keep the same keypoint in all scales
  - Reject points with low contrast

Accurate Keypoint Localization:
  - Fit a quadratic function to interpolate the location of the maximum
  - Eliminate edge responses (r: threshold, H: Hessian matrix)

Orientation Assignment:
  - Assign a consistent orientation to achieve orientation invariance
  - Method: calculate the gradient magnitude and direction of neighboring pixels
  - Calculate a weighted orientation histogram

Keypoint Descriptor:
  - Empirical result:
      Cell size: 4×4 pixels
      Block size: 4×4 cells
      Dimension: 4×4 (cells) × 8 (bins) = 128
  - Weighted magnitude
  - Avoid all boundary effects: use trilinear interpolation
  - Normalization (illumination invariant):
      1. Normalize to unit length
      2. Threshold the maximum value to 0.2 (matching the magnitudes of large gradients is no longer important)
      3. Renormalize to unit length

LBP (LOCAL BINARY PATTERN)
  - A powerful means of texture description
  - LBP operator: standard LBP [Illustration]
  - Example parameters:
      P: number of neighboring pixels
      R: radius

LTP (LOCAL TRINARY PATTERN)
  - LTP operator (t: threshold) [Illustration]

CS-LBP (CENTER-SYMMETRIC LOCAL BINARY PATTERN)
  - CS-LBP operator [Illustration]

CS-LBP DESCRIPTOR
Flow diagram: [Figure]

Interest Region Detection:
  - Detectors:
      1. Hessian-Affine (blob-like structure)
      2. Harris-Affine (corner-like structure)
      3. Hessian-Laplace (scale-invariant version)
      4. Harris-Laplace (scale-invariant version)
  - Region size: 41×41

Feature Extraction:
  - CS-LBP operator parameters:
      R: radius (R = 1, 2)
      N: number of neighboring pixels (N = 6, 8)
      T: threshold (T = 0.2)

Descriptor Construction:
  - Location grids: 3×3 cells or 4×4 cells
  - Avoid boundary effects: use bilinear interpolation

Descriptor Normalization (illumination invariant):
  1. Normalize to unit length
  2. Thresholding
  3. Renormalize to unit length
  - Dimension: 2^4 × 4 × 4 = 256

COMPARISON (SIFT VS. CS-LBP)
  - Assumption: computations cannot be reused from the detection algorithm
  - Comparison: [Table]
  - Conclusion: CS-LBP is more computationally efficient and performs better than SIFT

HOG (HISTOGRAM OF ORIENTED GRADIENTS)
Gradient Computation: [Figure]

Spatial/Orientation Binning:
  - Weighted votes: a function of the gradient magnitude
  - Avoid aliasing: interpolation
  - Parameters:
      Number of orientation bins: 9 bins / 18 bins
      Cell size: 8×8 pixels
      Block size: 2×2 cells

Normalization:
  - Group cells into larger blocks and normalize each block separately (illumination invariant)
  - Normalization schemes: [Equations]

COMPARISON (SIFT VS. HOG)
  - Comparison: [Table]

HOG VARIATION
‘Object Detection with Discriminatively Trained Part Based Models’

Pixel-Level Feature Maps:
  - Use [-1, 0, 1] to calculate the gradient
  - Quantize into orientation bins: contrast sensitive (B1), contrast insensitive (B2), p = 9
  - r: gradient magnitude

Spatial Aggregation:
  - Rectangular cell: 8×8 pixels
  - Cell-based feature map: reduces the size of the feature map
  - Avoid aliasing: bilinear interpolation

Normalization:
  - Truncation: maximum 0.2
  - No renormalization
  - Dimension: 9 bins × 4 different normalizations = 36 (contrast insensitive)

PCA analysis:
  - The top 11 eigenvectors capture most of the information in HOG
  - The top eigenvectors lie (approximately) in a linear subspace
  - 13-dimensional features: project the 36-dimensional HOG feature onto uk, vk
      Projection onto uk: sum over the 4 normalizations for a fixed orientation
      Projection onto vk: sum over the 9 orientations for a fixed normalization
  - For contrast insensitive (B2): 9 bins × 4 different normalizations = 36
  - For contrast sensitive (B1): 18 bins × 4 different normalizations = 72
  - Reduce to (18 + 9) + 4 = 31 dimensions

REFERENCE
  - “Description of Interest Regions With Local Binary Patterns”, Pattern Recognition ’09, Marko Heikkilä
    http://www.tele.ucl.ac.be/~devlees/ref_ELEC2885/projects/RoIdescriptionLBPpr-accepted.pdf
  - “Effective Pedestrian Detection Using Center-symmetric Local Binary/Trinary Patterns”, Yongbin Zheng
  - “Scale-space Theory”, Tony Lindeberg
  - “Histogram of Oriented Gradients for Human Detection”, CVPR ’05, Navneet Dalal
  - “Finding People in Images and Videos”, Navneet Dalal
  - “Feature matching”, Yung-Yu Chuang
  - “Scale & Affine Invariant Interest Point Detectors”, IJCV ’04, Krystian Mikolajczyk
  - “Object Detection with Discriminatively Trained Part Based Models”
  - “Distinctive Image Features from Scale-Invariant Keypoints”, IJCV ’04, David G. Lowe
    http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.157.3843&rep=rep1&type=pdf
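
The CS-LBP feature extraction and descriptor steps listed in the slides can be illustrated with a short sketch. It is not the authors' implementation, and it rests on several assumptions: the function names `cs_lbp` and `cs_lbp_descriptor` are hypothetical; N = 8 neighbors are taken from the square 3×3 neighborhood as an approximation of a circle with R = 1; pixel intensities are assumed to lie in [0, 1]; cells use hard binning rather than the bilinear interpolation mentioned in the slides; and the clip value 0.2 is borrowed from the SIFT normalization step. Bit i of a code is set when neighbor i exceeds its center-symmetric partner i + N/2 by more than T, so N = 8 gives 2^4 = 16 possible codes, and a 4×4 grid gives 16 × 16 = 256 dimensions.

```python
import numpy as np

# Offsets (dy, dx) of the 8 neighbors; entries i and i + 4 are opposite each other.
OFFSETS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]


def cs_lbp(image, threshold):
    """Return CS-LBP codes (0..15) for every interior pixel of a 2-D image."""
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for i in range(4):
        dy, dx = OFFSETS[i]
        ody, odx = OFFSETS[i + 4]
        a = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]      # neighbor i
        b = img[1 + ody:h - 1 + ody, 1 + odx:w - 1 + odx]  # opposite neighbor
        bits = ((a - b) > threshold).astype(np.uint8)
        codes |= bits << np.uint8(i)
    return codes


def cs_lbp_descriptor(region, threshold, grid=4, clip=0.2):
    """Concatenate per-cell code histograms, then normalize, clip, renormalize."""
    codes = cs_lbp(region, threshold)
    h, w = codes.shape
    ys = np.linspace(0, h, grid + 1).astype(int)  # cell boundaries (rows)
    xs = np.linspace(0, w, grid + 1).astype(int)  # cell boundaries (columns)
    hists = []
    for r in range(grid):
        for c in range(grid):
            cell = codes[ys[r]:ys[r + 1], xs[c]:xs[c + 1]]
            hists.append(np.bincount(cell.ravel(), minlength=16))
    desc = np.concatenate(hists).astype(np.float64)
    desc /= np.linalg.norm(desc) + 1e-12  # normalize to unit length
    desc = np.minimum(desc, clip)         # threshold large entries
    desc /= np.linalg.norm(desc) + 1e-12  # renormalize to unit length
    return desc


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    region = rng.random((41, 41))         # stand-in for a normalized 41×41 region
    print(cs_lbp_descriptor(region, threshold=0.01).shape)
```

Because only N/2 pixel pairs are compared, the histogram per cell has 16 bins instead of the 256 that a standard 8-neighbor LBP would need, which is what keeps the descriptor length comparable to SIFT.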