SIFT: Scale Invariant Feature Transform

"Distinctive image features from scale-invariant keypoints." David G. Lowe, Int. Journal of Computer Vision, 60, 2 (2004), pp. 91-110
Presented by: Shalomi Eldar
Based (in part) on slides by Ofir Pele, Kirill Dyagilev and Ayelet Dominitz
Vision Topics Seminar 2009

Introduction | Detection | Description | Applications

Image Matching
  A fundamental aspect of many problems:
  - Object recognition
  - 3D structures
  - Stereo correspondence
  - Motion tracking

Features Detection
  - Gives a comprehensive description of the image.
  - Enables matching!
  - What are the desired features' features?
  Images from: M. Brown and D. G. Lowe. Recognising Panoramas. In Proceedings of the International Conference on Computer Vision (ICCV 2003).

Features' Features
  - Robustness => invariance to changes in illumination, scale, rotation, affine, perspective.
  - Locality => robustness to occlusion and clutter.
  - Distinctiveness => easy to match to a large database of objects.
  - Quantity => many features can be generated for even small objects.
  - Efficiency => computationally "cheap", real-time performance.

SIFT Algorithm
  Input: image of size n x m.
  Output: set of descriptors of the image's features.

SIFT Algorithm
  1. Scale-space extrema detection.
  2. Keypoint localization.
  3. Orientation assignment.
  4. Keypoint descriptor.
  A typical image of size 500x500 pixels produces about 2000 stable keypoints.
  Near real-time performance: a cascade approach keeps the heavy operations only for keypoints that "survive".

SIFT Algorithm
  Steps 1 to 4 => Application: matching.
  We need only 3 keypoint matches for reliable identification!

Today's Talk Orientation
  Detection: Extracting Keypoints (1st part of the talk)
  - Extrema detection
  - Correct localization
  - Choosing robust keypoints only
  Description: Distinctive Description (2nd part of the talk)
  - Local invariant orientation
  - Building keypoint descriptor
  Application: Matching Keypoints (last part of the talk)
  - Nearest-neighbor algorithm
  - Finding 3 matches
  - Least-squares affine approximation

SIFT Algorithm, step 1: Scale-space extrema detection
  Collecting keypoint candidates…

Why Extrema?
  - We want to find points that give us information about the objects in the image.
  - The information about the objects is in the objects' edges.
  - We represent the image in a way that yields these edges as the extrema points of the representation.

Scale-space Representation
  $L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)$
  $G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} e^{-(x^2 + y^2)/(2\sigma^2)}$

Scale-space Representation
  Difference-of-Gaussian (DoG):
  $D(x, y, \sigma) = \left(G(x, y, k\sigma) - G(x, y, \sigma)\right) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma)$
  Low computation time: only subtraction of smoothed images!

DoG Pyramid
  Here's what we get:
  - Scale invariance
  - Features at different frequencies

Extracting Keypoints
  - X is selected if it is larger or smaller than all 26 neighbors.
  - Low cost: usually only a few neighbors need to be checked.

Extracting Keypoints
  - Extrema detection product: 233x189 image => 832 DoG keypoints.
  - Each keypoint is represented as $(x, y, \sigma)$.
  - Not all of them are good…

Problematic Keypoints
  - Inaccurate localization (due to scaling and sampling).
  - Low contrast: sensitive to noise.
  - Strong edge responses.

SIFT Algorithm, step 2: Keypoint localization
  Filtering keypoints…

Inaccurate Keypoint Localization
  The Problem:
  [Figure: true extremum vs. detected extremum at the sampling points along x]

Inaccurate Keypoint Localization
  The Solution:
  Taylor expansion:
  $D(\mathbf{x}) = D + \frac{\partial D}{\partial \mathbf{x}}^T \mathbf{x} + \frac{1}{2} \mathbf{x}^T \frac{\partial^2 D}{\partial \mathbf{x}^2} \mathbf{x}$
  Set the derivative to zero to find the accurate extremum:
  $\hat{\mathbf{x}} = -\left(\frac{\partial^2 D}{\partial \mathbf{x}^2}\right)^{-1} \frac{\partial D}{\partial \mathbf{x}}$
  If the offset from the sampling point is larger than 0.5, the keypoint belongs to a different sampling point.
  (Brown & Lowe 2002)

Low Contrast Keypoints
  - Function value at the extremum: $D(\hat{\mathbf{x}})$
  - If $|D(\hat{\mathbf{x}})| < 0.03$ (pixel values in range [0,1]), the keypoint is discarded.
  => down to 729 keypoints after the minimum-contrast threshold.

Eliminating Edge Responses
  The Problem:
  - "Edge" keypoints are poorly determined.
  - A point detected on an edge can move along the edge => detection is unstable.

Eliminating Edge Responses
  The Solution:
  - Check each keypoint's "cornerness".
  - A constrained point has high "cornerness": no dominant principal curvature component.

Finding "Cornerness"
  Principal curvatures are proportional to the eigenvalues $\lambda_{\max}, \lambda_{\min}$ of the Hessian matrix:
  $H = \begin{pmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{pmatrix}$
  Harris (1988) showed:
  $\frac{\mathrm{Tr}(H)^2}{\mathrm{Det}(H)} = \frac{(r+1)^2}{r}, \quad r = \frac{\lambda_{\max}}{\lambda_{\min}}$
  Threshold: if r > 10, the ratio is too great and the keypoint is discarded.

Stable Keypoints
  => down to 536 keypoints after the "cornerness" threshold.
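
The detection and filtering steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not Lowe's implementation: `dog` is assumed to be a DoG stack indexed as `dog[scale][y][x]` (nested lists), the function names are hypothetical, the derivatives are simple finite differences, and the contrast test is applied to a sampled value rather than to the interpolated $D(\hat{\mathbf{x}})$.

```python
def is_extremum(dog, s, y, x):
    """True if dog[s][y][x] is larger or smaller than all 26 neighbors."""
    center = dog[s][y][x]
    neighbors = [
        dog[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    return center > max(neighbors) or center < min(neighbors)


def passes_contrast(value, threshold=0.03):
    """Keep a keypoint only if |D| reaches the threshold (pixels in [0, 1])."""
    return abs(value) >= threshold


def passes_edge_test(layer, y, x, r=10.0):
    """Keep a keypoint only if Tr(H)^2 / Det(H) < (r + 1)^2 / r."""
    # Finite-difference estimates of the 2x2 Hessian of one DoG layer.
    dxx = layer[y][x + 1] - 2.0 * layer[y][x] + layer[y][x - 1]
    dyy = layer[y + 1][x] - 2.0 * layer[y][x] + layer[y - 1][x]
    dxy = (layer[y + 1][x + 1] - layer[y + 1][x - 1]
           - layer[y - 1][x + 1] + layer[y - 1][x - 1]) / 4.0
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        # Flat in one direction, or curvatures of opposite sign: reject.
        return False
    return trace * trace / det < (r + 1.0) ** 2 / r
```

Comparing `Tr(H)^2 / Det(H)` against `(r + 1)^2 / r` avoids computing the eigenvalues explicitly, which is the point of the Harris-style test on the slide.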

Today's Talk Orientation
  Detection, Description, Application (roadmap as above)
  DEMO

SIFT Algorithm, step 3: Orientation assignment
  Representation invariant to rotation.

Gradients
  For each sample point we compute the gradient's magnitude and orientation:
  $m(x, y) = \sqrt{\left(L(x+1, y) - L(x-1, y)\right)^2 + \left(L(x, y+1) - L(x, y-1)\right)^2}$
  $\theta(x, y) = \tan^{-1}\left(\frac{L(x, y+1) - L(x, y-1)}{L(x+1, y) - L(x-1, y)}\right)$

Keypoints Orientation
  - Create a gradient histogram (36 bins) weighted by magnitude and by a Gaussian window ($\sigma$ is 1.5 times the scale of the keypoint).
  - Any histogram peak within 80% of the highest peak is assigned to the keypoint (multiple assignments possible).

SIFT Algorithm, step 4: Keypoint descriptor
  Distinctive (yet invariant) keypoint representation.

What do we have (and what don't)
  Do:
  - For each keypoint we have assigned location, scale and orientation.
  - Provides invariance to these parameters.
  Don't:
  - Sufficient distinctiveness.
  - Invariance to other parameters, such as 3D viewpoint and change of illumination.
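
The orientation assignment described above (pixel-difference gradients, a 36-bin histogram weighted by magnitude and a Gaussian window, and all peaks within 80% of the maximum) might look roughly like the sketch below. The assumptions are mine, not the slides': `L` is a smoothed image stored as nested lists, the sampling `radius` is an arbitrary illustrative value, the function names are hypothetical, and each orientation is reported as the center of its bin.

```python
import math


def gradient(L, y, x):
    """Pixel-difference gradient: (magnitude, orientation in degrees [0, 360))."""
    dx = L[y][x + 1] - L[y][x - 1]
    dy = L[y + 1][x] - L[y - 1][x]
    return math.hypot(dx, dy), math.degrees(math.atan2(dy, dx)) % 360.0


def dominant_orientations(L, ky, kx, scale, radius=4, bins=36, peak_ratio=0.8):
    """Orientations (bin centers, degrees) of all peaks within 80% of the top."""
    sigma = 1.5 * scale          # Gaussian window: 1.5 times the keypoint scale
    bin_width = 360.0 / bins
    hist = [0.0] * bins
    for y in range(ky - radius, ky + radius + 1):
        for x in range(kx - radius, kx + radius + 1):
            # Skip samples whose pixel differences would leave the image.
            if not (1 <= y < len(L) - 1 and 1 <= x < len(L[0]) - 1):
                continue
            m, theta = gradient(L, y, x)
            dist2 = (x - kx) ** 2 + (y - ky) ** 2
            weight = math.exp(-dist2 / (2.0 * sigma ** 2))
            hist[int(theta / bin_width) % bins] += weight * m
    peak = max(hist)
    if peak == 0.0:
        return []                # flat neighborhood: no orientation
    return [
        (b + 0.5) * bin_width
        for b in range(bins)
        # A peak must be a local maximum of the circular histogram.
        if hist[b] >= peak_ratio * peak
        and hist[b] > hist[b - 1]
        and hist[b] > hist[(b + 1) % bins]
    ]
```

Returning a list rather than a single angle mirrors the "multiple assignments possible" remark: a keypoint with two strong peaks would yield two entries, each of which would get its own descriptor.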

Keypoint Descriptor
  - Create 16 gradient histograms (8 bins each) weighted by magnitude and by a Gaussian window ($\sigma$ is 0.5 times the width of the window).
  - Keypoint descriptor: 128-element (4x4x8) vector.

Change of Illumination
  - Change of brightness => doesn't affect gradients (differences of pixel values).
  - Change of contrast => doesn't affect gradients (up to normalization).
  - Saturation (non-linear change of illumination) => affects magnitudes much more than orientations.
  => Threshold gradient magnitudes at 0.2 and renormalize.

Today's Talk Orientation
  Detection, Description, Application (roadmap as above)

Object Recognition
  For training images:
  - Extract keypoints with SIFT.
  - Create a descriptor database.
  For query images:
  - Extract keypoints with SIFT.
  - For each descriptor, find its nearest neighbor in the database.
  - Find clusters of at least 3 keypoints.
  - Perform a detailed geometric fit check for each cluster.

Keypoint Matching
  - Best candidate match: nearest neighbor.
  - Problem: some keypoints have no correct match (background clutter, or not detected in the training images).

Keypoints Matching
  Solution:
  - Threshold: distance to nearest neighbor < 0.8 x distance to second nearest neighbor.
  - Eliminates 90% of false matches while discarding less than 5% of correct matches.
  [Figure: closest neighbor found vs. closest neighbor from a different object]
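
Two of the steps above, the illumination normalization of the descriptor and the 0.8 ratio test, can be sketched as follows. This is an illustrative sketch under assumptions of my own: descriptors are plain lists of floats, the function names are hypothetical, the search is exhaustive rather than an approximate method, and the second neighbor is simply the second-closest database entry (object labels are ignored).

```python
import math


def normalize_descriptor(vec, clip=0.2):
    """Unit-normalize, clip large entries at `clip`, then renormalize."""
    def unit(v):
        norm = math.sqrt(sum(c * c for c in v))
        return list(v) if norm == 0.0 else [c / norm for c in v]

    return unit([min(c, clip) for c in unit(vec)])


def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def match_descriptors(query, database, ratio=0.8):
    """Return (query_index, database_index) pairs that pass the ratio test."""
    matches = []
    for qi, q in enumerate(query):
        # Exhaustive search: rank every database descriptor by distance.
        ranked = sorted((euclidean(q, d), di) for di, d in enumerate(database))
        if len(ranked) < 2:
            continue             # the ratio test needs two neighbors
        (best, best_idx), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            matches.append((qi, best_idx))
    return matches


database = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]]
query = [[0.5, 0.0],   # clearly closest to entry 0: accepted
         [5.0, 0.0]]   # equidistant from entries 0 and 1: rejected
matches = match_descriptors(query, database)
```

The second query point shows why the ratio matters: its nearest neighbor is no closer than the runner-up, so the match is ambiguous and is dropped instead of being accepted at face value.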

Finding Nearest Neighbor
  - No efficient exact algorithm is known for high-dimensional spaces.
  => Use the Best-Bin-First (BBF) algorithm (Beis and Lowe, 1997).
  => Returns the closest neighbor with high probability.
  - Low cost: the search is cut off after checking a fixed number of candidates.
  - Good enough: we only need to check the 0.8 ratio between the first and second neighbors.

Clustering
  - To identify an object with high probability we need more than one match.
  - We find clusters of at least 3 keypoints using the Hough transform.

Geometric Verification
  - The Hough transform found clusters of keypoints.
  - We would like to verify that these points match the training image geometrically.
  - To do that, we use a least-squares approach to find the affine transformation from the training image to the query image.
  - Now we can be pretty sure we got the right object!

Examples
  [Figures: training images; query image; recognition result (clutter, occlusion, illumination, etc.)]

Examples
  [Figures: training images; query image; recognition result (different viewpoint, non-distinctive)]
  Total time to recognize all objects in both examples is less than 0.3 sec.
  (on a 2 GHz Pentium 4 processor)

Image Registration
  [Brown & Lowe 2003]

Examples
  - Real-time object recognition: Recognition DEMO
  - Motion tracking: Nose DEMO

Summary
  Detection: Extracting Keypoints (1st part of the talk)
  - Extrema detection
  - Correct localization
  - Choosing robust keypoints only
  Description: Distinctive Description (2nd part of the talk)
  - Local invariant orientation
  - Building keypoint descriptor
  Application: Matching Keypoints (last part of the talk)
  - Nearest-neighbor algorithm
  - Finding 3 matches
  - Least-squares affine approximation

Thank you