Colorado School of Mines
Computer Vision

Professor William Hoff
Dept of Electrical Engineering & Computer Science
http://inside.mines.edu/~whoff/

SIFT

SIFT – Scale Invariant Feature Transform
• Addresses the problem of matching features with changing scale and rotation
• Very successful; experiments have shown it is one of the best approaches for feature matching
• Widely used for recognizing objects from image databases

Lowe, D. G., “Distinctive Image Features from Scale-Invariant Keypoints”, Int’l Journal of Computer Vision, 60, 2, pp. 91-110, 2004.

SIFT – Scale Invariant Feature Transform
• Detector
  – Create a scale space of images
    • Construct a set of progressively Gaussian-blurred images
    • Take differences to get a “difference of Gaussian” pyramid (similar to a Laplacian of Gaussian)
  – Find local extrema in this scale space. Choose keypoints from the extrema
• Descriptor
  – For each keypoint, in a 16x16 window, find histograms of gradient directions
  – Create a feature vector out of these

Scale space images
[Figure: scale space images (approximates Laplacian of Gaussian)]

Automatic Scale Selection
• Laplacian of Gaussian at sigma = 2

Automatic Scale Selection [Lindeberg ‘94,‘98]
• LoG filter extrema locate “blobs”
  – maxima = dark blobs on light background
  – minima = light blobs on dark background
• Scale of blob (size; radius in pixels) is determined by the sigma parameter of the LoG filter.
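The link between blob size and the LoG sigma can be checked numerically. The following is a minimal 1D sketch, not taken from the slides: it assumes a box-shaped light blob on a dark background, evaluates the scale-normalized (multiplied by sigma squared) second derivative of a Gaussian at the blob center, and picks the sigma with the strongest (most negative) response. The function names and the sigma grid are hypothetical choices made for illustration.

```python
import math


def normalized_log_response(radius, sigma):
    """Scale-normalized 1D LoG response at the center of a box signal.

    The signal is 1 for |x| <= radius and 0 elsewhere (a light blob on a
    dark background), so the convolution at x = 0 reduces to summing the
    kernel over the blob's support.
    """
    total = 0.0
    for x in range(-radius, radius + 1):
        gauss = math.exp(-x * x / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))
        second_derivative = (x * x / sigma ** 4 - 1.0 / sigma ** 2) * gauss
        total += sigma ** 2 * second_derivative  # sigma^2 is the scale normalization
    return total


def characteristic_scale(radius, sigmas):
    """Return the sigma giving the most negative response (light blob -> minimum)."""
    return min(sigmas, key=lambda s: normalized_log_response(radius, s))


# Assumed search grid: sigma from 1.0 to 20.0 in steps of 0.25.
sigmas = [1.0 + 0.25 * i for i in range(77)]
best_sigma = characteristic_scale(8, sigmas)
print("characteristic sigma for radius 8:", best_sigma)
```

In this 1D setting the selected sigma lands close to the blob's half-width, and the response is negative, matching the "minima = light blobs" rule. The exact relation between sigma and radius depends on the blob shape and on the dimension, so the number should be read as an illustration only.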
Automatic Scale Selection
[Figure: original signal (radius = 8) with its un-normalized and scale-normalized Laplacian responses for increasing σ; the scale-normalized response attains a maximum]

Automatic Scale Selection
• Finding the characteristic scale of the blob
  – Convolve it with Laplacians at several scales
  – Apply non-maximum suppression in scale space
• Find maxima of Laplacian response in scale space
[Figure: Laplacian response Lxx(σ) + Lyy(σ) computed at several scales σ; the maxima give a list of (x, y, σ)]

Key point localization
• Detect maxima and minima of difference-of-Gaussian in scale space
• Fit a quadratic to surrounding values for sub-pixel and sub-scale interpolation (Brown & Lowe, 2002)
• Taylor expansion around point:
  $D(\mathbf{x}) = D + \frac{\partial D^T}{\partial \mathbf{x}} \mathbf{x} + \frac{1}{2} \mathbf{x}^T \frac{\partial^2 D}{\partial \mathbf{x}^2} \mathbf{x}$, where $\mathbf{x} = (x, y, \sigma)^T$ is the offset from the sample point
• Offset of extremum (use finite differences for derivatives):
  $\hat{\mathbf{x}} = -\left( \frac{\partial^2 D}{\partial \mathbf{x}^2} \right)^{-1} \frac{\partial D}{\partial \mathbf{x}}$
[Figure: pyramid construction steps: blur, subtract, resample]

Select canonical orientation
• Create histogram of local gradient directions computed at selected scale
• Assign canonical orientation at peak of smoothed histogram
• Each keypoint specifies stable 2D coordinates plus scale and orientation: (x, y, scale, orientation)

Example of keypoint detection
• Threshold on value at DOG peak and on ratio of principal curvatures (similar to corner detector approach)
  (a) 233x189 image
  (b) 832 DOG extrema
  (c) 729 left after peak value threshold
  (d) 536 left after testing ratio of principal curvatures

SIFT vector formation
• Thresholded image gradients are sampled over 16x16 array of locations in scale space
• Create array of orientation histograms
• 8 orientations x 4x4 histogram array = 128 dimensions
[Figure: a 2x2 descriptor array computed from an 8x8 set of samples]

Object recognition using SIFT
• Keypoint matching
• Efficient nearest neighbor indexing
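Keypoint matching can be illustrated with a small sketch. Two things here are assumptions rather than slide content: the search is plain brute force (not the efficient nearest neighbor indexing the slide refers to), and ambiguous matches are rejected with the distance-ratio test from Lowe's 2004 paper, using that paper's threshold of 0.8. The function names and the toy 4-dimensional descriptors are hypothetical; real SIFT descriptors have 128 dimensions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_descriptors(query, database, max_ratio=0.8):
    """Return (query_index, database_index) pairs that pass the ratio test.

    A match is kept only if the nearest database descriptor is clearly
    closer than the second nearest (distance ratio below max_ratio).
    """
    matches = []
    for qi, q in enumerate(query):
        # Brute force: distance from this query descriptor to every database entry.
        ranked = sorted((euclidean(q, d), di) for di, d in enumerate(database))
        if len(ranked) < 2:
            continue  # the ratio test needs two neighbors
        (best_dist, best_idx), (second_dist, _) = ranked[0], ranked[1]
        if best_dist < max_ratio * second_dist:
            matches.append((qi, best_idx))
    return matches


# Toy descriptors, shortened to 4 dimensions for readability.
database = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
query = [
    [0.9, 0.1, 0.0, 0.0],  # clearly closest to database[0]
    [0.5, 0.5, 0.0, 0.0],  # equally close to database[0] and database[1]
]
matches = match_descriptors(query, database)
print(matches)  # [(0, 0)]
```

The second query descriptor is dropped because its two nearest neighbors are equally distant, which is the kind of ambiguous match the ratio test is meant to filter out. For large databases the sorted brute-force search would be replaced by an approximate nearest neighbor index.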