
Distinctive Image Features from

Scale-Invariant Keypoints

David G. Lowe, 2004

Presentation Content

• Introduction

• Related Research

• Algorithm

– Keypoint localization

– Orientation assignment

– Keypoint descriptor

• Recognizing images using keypoint descriptors

• Achievements and Results

• Conclusion

Introduction

• Image matching is a fundamental aspect of many problems in computer vision.

Scale Invariant Feature Transform (SIFT)

• Object or Scene recognition.

• Using local invariant image features (keypoints) that are robust to:

– Scaling

– Rotation

– Illumination

– 3D camera viewpoint (affine)

– Clutter / noise

– Occlusion

• Realtime

Related Research

– Corner detectors

• Moravec 1981

• Harris and Stephens 1988

• Harris 1992

• Zhang 1995

• Torr 1995

• Schmid and Mohr 1997

– Scale invariant

• Crowley and Parker 1984

• Shokoufandeh 1999

• Lindeberg 1993, 1994

• Lowe 1999 (this author)

– Invariant to full affine transformation

• Baumberg 2000

• Tuytelaars and Van Gool 2000

• Mikolajczyk and Schmid 2002

• Schaffalitzky and Zisserman 2002

• Brown and Lowe 2002

Keypoint Detection

• Goal: Identify locations and scales that can be repeatably assigned under differing views of the same object.

• Keypoint detection is done at a specific scale and location

• Difference-of-Gaussian function

• Search for stable features across all possible scales

D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) ∗ I(x, y)
           = L(x, y, kσ) − L(x, y, σ)

where σ is the amount of smoothing and k = 2^(1/s) is the constant factor separating adjacent scales (s = number of scale samples per octave).
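A minimal sketch of the DoG computation under these definitions, using SciPy's gaussian_filter for the convolution with G; the function name is illustrative:

```python
from scipy.ndimage import gaussian_filter

def difference_of_gaussian(image, sigma, k=2 ** (1 / 3)):
    """D(x, y, sigma) = L(x, y, k*sigma) - L(x, y, sigma).

    `image` is a grayscale float array; gaussian_filter implements
    the convolution G(x, y, sigma) * I(x, y).
    """
    L_low = gaussian_filter(image, sigma)       # L(x, y, sigma)
    L_high = gaussian_filter(image, k * sigma)  # L(x, y, k*sigma)
    return L_high - L_low
```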

Keypoint Detection

• Reasonably low cost

• Scale sensitive

• Number of scale samples per octave?

• 3 scale samples per octave were used (although more is better).

• Determine amount of smoothing (σ)

• High-frequency information is lost by the initial smoothing, so the input image is first doubled in size
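A sketch of building one octave along these lines; σ = 1.6 is the base smoothing from the paper, and the helper names are illustrative. s + 3 Gaussian images are generated so that extrema can be detected across s full scales:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def prepare_base_image(image):
    # Doubling the input with linear interpolation recovers the
    # high-frequency information that smoothing would otherwise discard.
    return zoom(image.astype(np.float64), 2, order=1)

def build_dog_octave(image, sigma=1.6, s=3):
    # k = 2^(1/s); s + 3 Gaussian images give s + 2 DoG images,
    # so extrema can be checked at s scales per octave.
    k = 2 ** (1 / s)
    gaussians = [gaussian_filter(image, sigma * k ** i) for i in range(s + 3)]
    return [g2 - g1 for g1, g2 in zip(gaussians, gaussians[1:])]
```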

Accurate Keypoint Localization

(1/2)

• Use a Taylor expansion to determine the interpolated location of each extremum (local maximum or minimum).

• Calculate the DoG value at this interpolated location and discard low-contrast extrema (below roughly 3% of the maximum pixel value, i.e. |D(x̂)| < 0.03 for pixel values in [0, 1]).
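A sketch of this refinement step, assuming a 3×3×3 DoG neighborhood around a detected extremum (axes ordered scale, y, x); the derivatives are taken by finite differences as in the paper:

```python
import numpy as np

def interpolate_extremum(cube):
    """Refine an extremum inside a 3x3x3 DoG neighborhood `cube`,
    centred on the detected sample.

    Returns (offset, contrast): the sub-sample offset from the Taylor
    expansion and the interpolated value D(x_hat) used for the
    low-contrast test (0.03 for pixel values in [0, 1]).
    """
    # First derivatives by central differences.
    g = 0.5 * np.array([
        cube[2, 1, 1] - cube[0, 1, 1],
        cube[1, 2, 1] - cube[1, 0, 1],
        cube[1, 1, 2] - cube[1, 1, 0],
    ])
    # Hessian of D at the centre sample.
    c = cube[1, 1, 1]
    H = np.empty((3, 3))
    H[0, 0] = cube[2, 1, 1] - 2 * c + cube[0, 1, 1]
    H[1, 1] = cube[1, 2, 1] - 2 * c + cube[1, 0, 1]
    H[2, 2] = cube[1, 1, 2] - 2 * c + cube[1, 1, 0]
    H[0, 1] = H[1, 0] = 0.25 * (cube[2, 2, 1] - cube[2, 0, 1] - cube[0, 2, 1] + cube[0, 0, 1])
    H[0, 2] = H[2, 0] = 0.25 * (cube[2, 1, 2] - cube[2, 1, 0] - cube[0, 1, 2] + cube[0, 1, 0])
    H[1, 2] = H[2, 1] = 0.25 * (cube[1, 2, 2] - cube[1, 2, 0] - cube[1, 0, 2] + cube[1, 0, 0])

    offset = -np.linalg.solve(H, g)      # x_hat = -H^-1 * dD/dx
    contrast = c + 0.5 * g.dot(offset)   # D(x_hat)
    return offset, contrast
```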

Accurate Keypoint Localization

(2/2)

• Eliminating Edge Responses

• Define a 2×2 Hessian matrix from second derivatives of the DoG image (Dxx, Dxy, Dyy)

• Determine the ratio of the maximum eigenvalue to the smaller one; keypoints whose ratio exceeds a threshold (r = 10 in the paper) lie on edges and are discarded (see the sketch below)

• Keypoint counts in the paper's example: 832 initial extrema, 729 after the contrast test, 536 after the edge test
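A sketch of the edge test on the DoG image, using the trace/determinant form so the eigenvalues never need to be computed explicitly; r = 10 follows the paper:

```python
def passes_edge_test(dog, y, x, r=10.0):
    """Reject edge-like keypoints via the 2x2 Hessian of the DoG image.

    Compares Tr(H)^2 / Det(H) against (r + 1)^2 / r, which bounds the
    ratio of principal curvatures.
    """
    dxx = dog[y, x + 1] - 2 * dog[y, x] + dog[y, x - 1]
    dyy = dog[y + 1, x] - 2 * dog[y, x] + dog[y - 1, x]
    dxy = 0.25 * (dog[y + 1, x + 1] - dog[y + 1, x - 1]
                  - dog[y - 1, x + 1] + dog[y - 1, x - 1])
    trace, det = dxx + dyy, dxx * dyy - dxy * dxy
    if det <= 0:                  # curvatures of opposite sign: reject
        return False
    return trace ** 2 / det < (r + 1) ** 2 / r
```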

Orientation Assignment

• Calculate the orientation and magnitude of the gradient at each pixel

• Histogram of orientations of sample points near keypoint.

• Each sample is weighted by its gradient magnitude and by a Gaussian-weighted circular window with a σ that is 1.5 times the scale of the keypoint.

• Stable orientation results:

– Multiple keypoints are created for multiple histogram peaks

– Interpolation of peak positions for better accuracy
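A sketch of the orientation histogram along these lines; the window `radius` is an illustrative choice, while the 36 bins, the magnitude vote, the σ = 1.5 × scale Gaussian weight, and the 80% peak rule follow the paper:

```python
import numpy as np

def orientation_histogram(L, y, x, scale, num_bins=36, radius=8):
    """Build the gradient-orientation histogram around (y, x) in `L`,
    the Gaussian-smoothed image at the keypoint's scale."""
    hist = np.zeros(num_bins)
    sigma_w = 1.5 * scale
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            yy, xx = y + dy, x + dx
            if not (0 < yy < L.shape[0] - 1 and 0 < xx < L.shape[1] - 1):
                continue
            gx = L[yy, xx + 1] - L[yy, xx - 1]
            gy = L[yy + 1, xx] - L[yy - 1, xx]
            magnitude = np.hypot(gx, gy)
            theta = np.arctan2(gy, gx) % (2 * np.pi)
            weight = np.exp(-(dx * dx + dy * dy) / (2 * sigma_w ** 2))
            hist[int(theta / (2 * np.pi) * num_bins) % num_bins] += weight * magnitude
    return hist

def dominant_orientations(hist):
    # Every local peak within 80% of the highest peak spawns its own
    # keypoint with that orientation.
    n = len(hist)
    peaks = [i for i in range(n)
             if hist[i] >= 0.8 * hist.max()
             and hist[i] > hist[i - 1] and hist[i] > hist[(i + 1) % n]]
    return [(i + 0.5) * 2 * np.pi / n for i in peaks]
```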

The Local Image Descriptor

• We can now find keypoints invariant to location, scale, and orientation.

• Now compute a descriptor for each keypoint.

• Highly distinctive yet invariant to illumination and 3D viewpoint changes.

• Biologically inspired approach.

• Divide the sample points around the keypoint into 16 (4×4) regions (4 regions shown in the picture)

• Create a histogram of orientations for each region (8 bins)

• Trilinear interpolation.

• Vector normalization
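A sketch of the final normalization of the 128-element vector (16 regions × 8 bins); the 0.2 clamping threshold is the value given in the paper:

```python
import numpy as np

def normalize_descriptor(vec, clamp=0.2):
    """Final normalization of the 128-element SIFT descriptor.

    Unit-length normalization cancels affine illumination change;
    clamping large values at 0.2 and renormalizing reduces the
    influence of non-linear illumination effects.
    """
    vec = vec / max(np.linalg.norm(vec), 1e-12)  # contrast invariance
    vec = np.minimum(vec, clamp)                 # damp large gradient magnitudes
    return vec / max(np.linalg.norm(vec), 1e-12)
```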

Descriptor Testing

This graph shows the percentage of keypoints giving the correct match against a database of 40,000 keypoints, as a function of the width n of the n×n keypoint descriptor and the number of orientations in each histogram. The graph is computed for images with an affine viewpoint change of 50 degrees and 4% added noise.

Keypoint Matching

• Look for the nearest neighbor in the database (Euclidean distance)

• Compare the distance of the closest neighbor to that of the second-closest neighbor.

• If distance(closest) / distance(second-closest) > 0.8, the match is discarded.
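A brute-force sketch of this matching rule, assuming the descriptors are rows of NumPy arrays:

```python
import numpy as np

def match_descriptors(query, database, ratio=0.8):
    """Match each query descriptor to `database` by Euclidean distance,
    keeping a match only if the closest neighbor is clearly better
    than the second-closest (the 0.8 ratio test)."""
    matches = []
    for i, d in enumerate(query):
        dists = np.linalg.norm(database - d, axis=1)
        first, second = np.argsort(dists)[:2]
        if dists[first] < ratio * dists[second]:
            matches.append((i, first))
    return matches
```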

Efficient Nearest Neighbor Indexing


• 128-dimensional feature vector

• Best-Bin-First (BBF)

• Modified k-d tree algorithm.

• Only finds an approximate answer.

• Works well because of the 0.8 distance-ratio rule.
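SciPy's k-d tree is not Best-Bin-First, but its `eps` parameter gives a comparable approximate nearest-neighbor search, so a stand-in sketch of the indexing step might look like this:

```python
import numpy as np
from scipy.spatial import cKDTree

def build_index(database):
    # Index the 128-dimensional descriptor vectors in a k-d tree.
    return cKDTree(database)

def approximate_matches(tree, query, ratio=0.8, eps=0.5):
    # eps > 0 allows an approximate answer, trading accuracy for speed
    # (BBF instead bounds the number of leaf bins examined).
    dists, idx = tree.query(query, k=2, eps=eps)  # two nearest neighbors
    keep = dists[:, 0] < ratio * dists[:, 1]      # same 0.8 ratio rule
    return [(i, idx[i, 0]) for i in np.flatnonzero(keep)]
```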

Clustering with the Hough Transform

• Select the 1% of inliers among 99% outliers

• Find clusters of features that vote for the same object pose:

– 2D location

– Scale

– Orientation

– Location relative to original training image.

• Use broad bin sizes.
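A sketch of the voting step, assuming each match carries the pose it predicts for the training image as a small dict (the keys below are illustrative); the bin sizes (30° for orientation, a factor of 2 for scale, 0.25 of the projected training-image size for location) follow the paper, which additionally votes into the two closest bins per dimension to soften boundary effects:

```python
import numpy as np
from collections import defaultdict

def hough_vote(matches):
    """Cluster feature matches by the object pose they imply.

    Each match is assumed to be a dict with the pose it predicts:
    x, y, scale, orientation, train_width, train_height.
    """
    bins = defaultdict(list)
    for m in matches:
        key = (
            int(m["x"] // (0.25 * m["train_width"])),
            int(m["y"] // (0.25 * m["train_height"])),
            int(np.floor(np.log2(m["scale"]))),        # factor-of-2 scale bins
            int(m["orientation"] // np.radians(30)),   # 30-degree orientation bins
        )
        bins[key].append(m)
    return max(bins.values(), key=len)  # largest cluster of consistent votes
```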

Solution for Affine Parameters

• An affine transformation correctly accounts for 3D rotation of a planar surface under orthographic projection, but the approximation can be poor for 3D rotation of non-planar objects.

Basically: we do not create a 3D representation of the object.

• The affine transformation of a model point [x y]ᵀ to an image point [u v]ᵀ can be written as

[u]   [m1 m2] [x]   [tx]
[v] = [m3 m4] [y] + [ty]

where m1…m4 represent rotation, scale, and stretch, and (tx, ty) is the translation.

• Outliers are discarded

• New matches can be found by top-down matching
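A sketch of solving for the six affine parameters by least squares, stacking one pair of equations per match (at least three matches are required):

```python
import numpy as np

def solve_affine(model_pts, image_pts):
    """Least-squares affine fit from model points (x, y) to image
    points (u, v): [u v]^T = [[m1 m2],[m3 m4]] [x y]^T + [tx ty]^T."""
    A, b = [], []
    for (x, y), (u, v) in zip(model_pts, image_pts):
        A.append([x, y, 0, 0, 1, 0])  # u = m1*x + m2*y + tx
        A.append([0, 0, x, y, 0, 1])  # v = m3*x + m4*y + ty
    for (x, y), (u, v) in zip(model_pts, image_pts):
        b.extend([u, v])
    p, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
    return p  # m1, m2, m3, m4, tx, ty
```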

Results

Conclusion

• Invariant to image rotation and scale and robust across a substantial range of affine distortion, addition of noise, and change in illumination.

• Realtime

• Lots of applications

Further Research

• Color

• 3D representation of the world.
