Interest Points Detection
CS485/685 Computer Vision
Dr. George Bebis
Interest Points
• Local features associated with a significant change of an
image property, or of several properties simultaneously
(e.g., intensity, color, texture).
Why Extract Interest Points?
• Corresponding points (or features) between images enable the
estimation of parameters describing geometric transforms
between the images.
What if we don’t know the correspondences?
[Figure: local patches from two images, each with an unknown feature descriptor — are the descriptors equal?]
• Need to compare feature descriptors of local patches
surrounding interest points
What if we don’t know the correspondences? (cont’d)
• Lots of possibilities (this is a popular research area)
– Simple option: match square windows around the point
– State of the art approach: SIFT
• David Lowe, UBC http://www.cs.ubc.ca/~lowe/keypoints/
Invariance
• Features should be detected despite geometric or
photometric changes in the image.
• Given two transformed versions of the same image,
features should be detected in corresponding
locations.
How to achieve invariance?
1. Detector must be invariant to geometric and
photometric transformations.
2. Descriptors must be invariant (if matching the
descriptions is required).
Applications
• Image alignment
• 3D reconstruction
• Object recognition
• Indexing and database retrieval
• Object tracking
• Robot navigation
Example: Object Recognition
[Figure: recognition despite occlusion and clutter]
Example: Panorama Stitching
• How do we combine these two images?
Panorama stitching (cont’d)
Step 1: extract features
Step 2: match features
Panorama stitching (cont’d)
Step 1: extract features
Step 2: match features
Step 3: align images
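A minimal sketch of this three-step pipeline in Python with OpenCV (requires opencv-python >= 4.4 for SIFT_create; the filenames, the 0.75 ratio test, and the RANSAC threshold are illustrative choices, not values from the slides):

```python
# Minimal sketch of the three-step panorama pipeline with OpenCV.
import cv2
import numpy as np

img1 = cv2.imread("left.jpg", cv2.IMREAD_GRAYSCALE)    # hypothetical inputs
img2 = cv2.imread("right.jpg", cv2.IMREAD_GRAYSCALE)

# Step 1: extract features (interest points + descriptors)
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Step 2: match features, keeping only unambiguous matches (ratio test)
matches = [m for m, n in cv2.BFMatcher().knnMatch(des1, des2, k=2)
           if m.distance < 0.75 * n.distance]

# Step 3: align images by estimating a homography with RANSAC
src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
```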
What features should we use?
Use features with gradients in at least two (significantly)
different orientations → e.g., corners
What features should we use? (cont’d)
[Figure: auto-correlation]
Corners
• Corners are easier to localize than lines when considering
the correspondence problem (aperture problem).
A point on a line is hard to match between frames t and t+1; a corner is easier.
Characteristics of good features
• Repeatability
– The same feature can be found in several images despite geometric
and photometric transformations
• Saliency
– Each feature has a distinctive description
• Compactness and efficiency
– Many fewer features than image pixels
• Locality
– A feature occupies a relatively small area of the image; robust to
clutter and occlusion
Main Steps in Corner Detection
1. For each pixel in the input image, the corner operator is applied
to obtain a cornerness measure for this pixel.
2. Threshold cornerness map to eliminate weak corners.
3. Apply non-maximal suppression to eliminate points whose
cornerness measure is not larger than the cornerness values of
all points within a certain distance.
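A minimal NumPy/SciPy sketch of steps 2 and 3, assuming a cornerness map has already been computed by some corner operator (threshold and suppression radius are illustrative):

```python
# Sketch of steps 2-3: threshold the cornerness map, then keep only
# local maxima (non-maximal suppression).
import numpy as np
from scipy.ndimage import maximum_filter

def select_corners(cornerness, threshold, radius=5):
    """Return (row, col) coordinates of pixels that exceed `threshold`
    and are maximal within a (2*radius+1) x (2*radius+1) neighborhood."""
    local_max = maximum_filter(cornerness, size=2 * radius + 1)
    keep = (cornerness >= threshold) & (cornerness == local_max)
    return np.argwhere(keep)
```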
Main Steps in Corner Detection (cont’d)
Corner Types
Example of L-junction, Y-junction, T-junction,
Arrow-junction, and X-junction corner types
Corner Detection Methods
• Contour based
– Extract contours and search for maximal curvature or
inflexion points along the contour.
• Intensity based
– Compute a measure that indicates the presence of an interest
point directly from gray (or color) values.
• Parametric model based
– Fit parametric intensity model to the image.
– Can provide sub-pixel accuracy but are limited to specific
types of interest points (e.g., L-corners).
A contour-based approach:
Curvature Scale Space
• Object has been segmented
• Parametric contour representation: (x(t), y(t))
• Smooth the contour with a Gaussian g(t,σ) and compute the
curvature at each scale σ.
Curvature Scale Space (cont’d)
G. Bebis, G. Papadourakis, and S. Orphanoudakis, "Curvature Scale Space Driven
Object Recognition with an Indexing Scheme based on Artificial Neural Networks",
Pattern Recognition, 32(7), pp. 1175-1201, 1999.
A parametric model approach:
Zuniga-Haralick Detector
• Approximate image function in the neighborhood of the
pixel (i,j) by a cubic polynomial.
(use SVD to find the coefficients)
measure of "cornerness":

ZH(i, j) = −2·(k2²·k6 − k2·k3·k5 + k3²·k4) / (k2² + k3²)^(3/2)
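A sketch of this measure on a single 5 x 5 neighborhood, assuming the usual facet-model coefficient ordering k1..k10 = (1, x, y, x², xy, y², x³, x²y, xy², y³); the SVD-based least-squares fit stands in for the SVD step mentioned above:

```python
# Sketch: Zuniga-Haralick cornerness for one 5x5 intensity neighborhood.
import numpy as np

def zh_cornerness(patch):
    xs, ys = np.meshgrid(np.arange(-2, 3), np.arange(-2, 3), indexing="ij")
    x, y = xs.ravel().astype(float), ys.ravel().astype(float)
    # cubic facet model: f ~ k1 + k2*x + k3*y + k4*x^2 + k5*xy + k6*y^2 + ...
    A = np.stack([np.ones_like(x), x, y, x * x, x * y, y * y,
                  x**3, x * x * y, x * y * y, y**3], axis=1)
    k = np.linalg.lstsq(A, patch.ravel().astype(float), rcond=None)[0]
    _, k2, k3, k4, k5, k6 = k[:6]
    denom = (k2**2 + k3**2) ** 1.5
    if denom < 1e-12:
        return 0.0  # no gradient at the center: flat patch, not a corner
    return -2.0 * (k2**2 * k6 - k2 * k3 * k5 + k3**2 * k4) / denom
```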
Corner Detection Using Edge Detection?
• Edge detectors are not stable at corners.
• Gradient is ambiguous at corner tip.
• Discontinuity of gradient direction near corner.
Corner Detection Using Intensity: Basic Idea
• Image gradient has two or more dominant directions near a
corner.
• Shifting a window in any direction should give a large
change in intensity.
– "flat" region: no change in all directions
– "edge": no change along the edge direction
– "corner": significant change in all directions
Moravec Detector (1977)
• Measure intensity variation at (x,y) by shifting a
small window (3x3 or 5x5) by one pixel in each of
the eight principal directions (horizontally,
vertically, and four diagonals).
Moravec Detector (1977)
• Calculate intensity variation by taking the sum of squares
of intensity differences of corresponding pixels in these
two windows.
SW(Δx, Δy) = Σ over the window of [f(x + Δx, y + Δy) − f(x, y)]²

8 directions: (Δx, Δy) ∈ {−1, 0, 1}², (Δx, Δy) ≠ (0, 0),
i.e., SW(−1,−1), SW(−1,0), ..., SW(1,1)
Moravec Detector (cont’d)
• The “cornerness” of a pixel is the minimum intensity variation
found over the eight shift directions:
Cornerness(x,y) = min{SW(-1,-1), SW(-1,0), ...SW(1,1)}
[Figure: input image and its normalized cornerness map. Note the response to isolated points!]
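A minimal NumPy/SciPy sketch of the Moravec operator (np.roll wraps around at the image borders, so a one-pixel border is zeroed out as a simplification):

```python
# Minimal sketch of the Moravec operator (3x3 window by default).
import numpy as np
from scipy.ndimage import uniform_filter

def moravec(f, window=3):
    """Cornerness(x,y) = min over the 8 one-pixel shifts (dx,dy) of
    SW(dx,dy) = sum over the window of (f(x+dx, y+dy) - f(x, y))^2."""
    f = f.astype(float)
    c = np.full(f.shape, np.inf)
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if (dx, dy) == (0, 0):
                continue
            diff2 = (np.roll(f, (-dx, -dy), axis=(0, 1)) - f) ** 2
            sw = uniform_filter(diff2, size=window) * window**2  # windowed sum
            c = np.minimum(c, sw)
    c[[0, -1], :] = 0  # borders invalid due to np.roll wrap-around
    c[:, [0, -1]] = 0
    return c
```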
Moravec Detector (cont’d)
• Non-maximal suppression will yield the final corners.
Moravec Detector (cont’d)
• Does a reasonable job in
finding the majority of true
corners.
• Edge points not in one of the
eight principal directions
will be assigned a relatively
large cornerness value.
Moravec Detector (cont’d)
• The response is anisotropic as the intensity variation
is only calculated at a discrete set of shifts (i.e., not
rotationally invariant)
Harris Detector
• Improves the Moravec operator by avoiding the
use of discrete directions and discrete shifts.
• Uses a Gaussian window instead of a square
window.
[Figure: window functions — binary (1 in window, 0 outside) vs. Gaussian]
C. Harris and M. Stephens, "A Combined Corner and Edge Detector", Proceedings
of the 4th Alvey Vision Conference, pp. 147-151, 1988.
Harris Detector (cont’d)
• Using first-order Taylor expansion:
Reminder: Taylor expansion

f(x) = f(a) + (f′(a)/1!)·(x − a) + (f″(a)/2!)·(x − a)² + ... + (f⁽ⁿ⁾(a)/n!)·(x − a)ⁿ + O((x − a)ⁿ⁺¹)
Harris Detector (cont’d)

Since, to first order,

f(xi + Δx, yi + Δy) ≈ f(xi, yi) + fx(xi, yi)·Δx + fy(xi, yi)·Δy

the intensity variation becomes:

SW(Δx, Δy) ≈ Σ [fx(xi, yi)·Δx + fy(xi, yi)·Δy]²

Harris Detector (cont’d)

In matrix form:

SW(Δx, Δy) ≈ [Δx Δy] · AW(x, y) · [Δx Δy]ᵀ

where AW(x, y) is the 2 x 2 matrix (Hessian or auto-correlation or second moment matrix):

AW(x, y) = [ Σ (∂f/∂x)²          Σ (∂f/∂x)·(∂f/∂y) ]
           [ Σ (∂f/∂x)·(∂f/∂y)   Σ (∂f/∂y)²        ]
Auto-correlation matrix
AW = Σ over (x, y) of [ fx²     fx·fy ]
                      [ fx·fy   fy²   ]

Describes the gradient distribution (i.e., local structure) inside the window!
Does not depend on Δx, Δy.
Harris Detector (cont’d)
• General case – use window function:
SW (x, y )   w( xi , yi )  f ( xi , yi )  f ( xi  x, yi  y ) 
2
xi , yi
default window function w(x,y) :
1 in window, 0 outside

2
w
(
x
,
y
)
f

x

x, y
AW  

 w( x, y ) f x f y
 x, y

w
(
x
,
y
)
f
f

x y
 f x2
x, y
   w( x, y ) 
 x, y
fx f y

2

w( x, y ) f y 

x, y

fx f y 

f y2 
Harris Detector (cont’d)
• Harris uses a Gaussian window: w(x,y)=G(x,y,σI) where σI
is called the “integration” scale
window function w(x, y): Gaussian

AW = [ Σ w(x, y)·fx²     Σ w(x, y)·fx·fy ]  =  Σ w(x, y) · [ fx²     fx·fy ]
     [ Σ w(x, y)·fx·fy   Σ w(x, y)·fy²   ]                 [ fx·fy   fy²   ]
Auto-correlation matrix (cont’d)
Harris Detector (cont’d)
Since AW is symmetric, we have:

AW = R⁻¹ [ λ1  0  ] R
         [ 0   λ2 ]

We can visualize AW as an ellipse with axis lengths determined
by the eigenvalues and orientation determined by R.

Ellipse equation:

[Δx Δy] · AW · [Δx Δy]ᵀ = const

[Figure: ellipse — the direction of the fastest change has half-axis length (λmax)^(-1/2); the direction of the slowest change has half-axis length (λmin)^(-1/2)]
Harris Detector (cont’d)
• Eigenvectors encode edge direction
• Eigenvalues encode edge strength
Distribution of fx and fy
Distribution of fx and fy (cont’d)
Harris Detector (cont’d)
2
“Edge”
2 >> 1
Classification of
image points using
eigenvalues of AW:
1 and 2 are small;
SW is almost constant
in all directions
“Corner”
1 and 2 are large,
1 ~ 2 ;
E increases in all
directions
“Flat”
region
“Edge”
1 >> 2
1
Harris Detector (cont’d)
[Figure: cornerness measure in the (λ1, λ2) plane, assuming that λ1 > λ2]
Harris Detector (cont’d)
• To avoid eigenvalue computation, the following
response function is used:
R(A) = det(A) − k·trace²(A)
• It can be shown that:
R(A) = λ1·λ2 − k·(λ1 + λ2)²
Harris Detector (cont’d)
R(A) = det(A) − k·trace²(A)

k: a constant, usually between 0.04 and 0.06

• "Corner": R > 0
• "Edge": R < 0
• "Flat" region: |R| small
Harris Detector (cont’d)
Image derivatives are computed by convolving with Gaussian derivatives:

∂f(xi, yi)/∂x = ∂G(x, y, σD)/∂x * f(xi, yi)
∂f(xi, yi)/∂y = ∂G(x, y, σD)/∂y * f(xi, yi)

σD is called the "differentiation" scale.
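Putting the pieces together, a sketch of the Harris response with Gaussian differentiation (σD) and integration (σI) scales (the default k, σD, σI values are illustrative):

```python
# Sketch of the Harris corner response following the slides' notation.
import numpy as np
from scipy.ndimage import gaussian_filter

def harris_response(f, sigma_d=1.0, sigma_i=2.0, k=0.05):
    f = f.astype(float)
    # derivatives at the differentiation scale (x = columns, y = rows)
    fx = gaussian_filter(f, sigma_d, order=(0, 1))
    fy = gaussian_filter(f, sigma_d, order=(1, 0))
    # entries of A_W, accumulated under the Gaussian integration window
    a = gaussian_filter(fx * fx, sigma_i)
    b = gaussian_filter(fx * fy, sigma_i)
    c = gaussian_filter(fy * fy, sigma_i)
    det = a * c - b * b        # = lambda1 * lambda2
    trace = a + c              # = lambda1 + lambda2
    return det - k * trace**2  # R > 0 corner, R < 0 edge, |R| small flat
```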
Harris Detector - Example
Compute corner response R
Harris Detector - Example
Find points with large corner response: R>threshold
Harris Detector - Example
Take only the points of local maxima of R
Harris Detector - Example
Map corners on the original image
Harris Detector - Example
Invariance
• Is the Harris detector invariant to geometric and
photometric changes?
• Features should be detected despite geometric or
photometric changes in the image.
Geometric/Photometric Changes
• Geometric
– Rotation
– Scale
– Affine
• Photometric
– Affine intensity change: I(x,y) → a·I(x,y) + b
Harris Detector: Rotation Invariance
• Rotation
Ellipse rotates but its shape (i.e. eigenvalues)
remains the same
Corner response R is invariant to image rotation
Harris Detector: Photometric Changes
• Affine intensity change:
– Only derivatives are used ⇒ invariant to the intensity shift I → I + b
– Intensity scaling: I(x,y) → a·I(x,y)

[Figure: R vs. x (image coordinate) with a fixed threshold, before and after intensity scaling]

• Partially invariant to affine intensity change
Harris Detector: Scale Invariance
• Scaling

[Figure: a corner, when magnified, looks like a series of edges — all points will be classified as edges]

• Not invariant to scaling
Harris Detector: Advantages
• Insensitive to:
– 2D shift and rotation
– Small illumination variations
– Small viewpoint changes
• Low computational requirements
Harris Detector: Repeatability
• In a comparative study of different interest point detectors,
it was shown to be the most repeatable and most informative.

Repeatability = (# correspondences) / min(m1, m2)

best results: σI = 1, σD = 2

C. Schmid, R. Mohr, and C. Bauckhage, "Evaluation of Interest Point Detectors",
International Journal of Computer Vision, 37(2), pp. 151-172, 2000.
Harris Detector: Disadvantages
• Sensitive to:
– Scale change
– Viewpoint change
– Significant change in contrast
How to handle scale changes?
• AW must be adapted to scale changes.
• If the scale change is known, we can adapt the Harris
detector to the scale change (i.e., set σD).
• What if the scale change is unknown?
How to handle scale changes? (cont’d)
• One solution is to extract interest points at multiple
scales (i.e., vary σD) and use them all.
• But then, there will be many points representing the
same structure, complicating matching!
• Also, point locations shift as scale increases.

[Figure: the size of the circle corresponds to the scale at which the point was detected]
How to handle scale changes? (cont’d)
• Exhaustive search
How to handle scale changes? (cont’d)
• Alternatively, use scale selection to find the
characteristic scale of each feature.
• Characteristic scale depends on the feature's spatial
extent (i.e., local neighborhood of pixels).
• Only a subset of the points computed in scale space are
selected!
Automatic Scale Selection
• Main idea: select points at which a local measure F(x,σn)
is maximal over scales.
T. Lindeberg, "Feature Detection with Automatic Scale Selection",
International Journal of Computer Vision, 30(2), pp. 77-116, 1998.
Lindeberg et al., 1996
Slide from Tinne Tuytelaars
Automatic Scale Selection (cont’d)
• Characteristic scale is relatively independent of the
image scale.
• The ratio of the scale values corresponding to the max
values is equal to the scale factor between the images.

[Figure: responses over scale for two images; scale factor: 2.5]

Scale selection allows for finding a spatial extent that is
covariant with the image transformation.
Invariant / Covariant
• A function f( ) is invariant under some transformation
T( ) if its value does not change when the transformation
is applied to its argument:
if f(x) = y then f(T(x)) = y
• A function f( ) is covariant when it commutes with
the transformation T( ):
if f(x) = y then f(T(x))=T(f(x))=T(y)
Automatic Scale Selection (cont’d)
• What local measures could be used?
– Should be rotation invariant
– Should have one stable sharp peak
Automatic Scale Selection (cont’d)
• Typically, derivative-based functions:
(s=σ)
• Laplacian yielded best results.
• DoG was second best …
Recall: Edge detection Using 1st derivative
[Figure: signal f with an edge; derivative of Gaussian (d/dx)g; response f * (d/dx)g. Edge = maximum of derivative]
Recall: Edge detection Using 2nd derivative
[Figure: signal f with an edge; second derivative of Gaussian (d²/dx²)g (Laplacian); response f * (d²/dx²)g. Edge = zero crossing of second derivative]
From edges to blobs
• Edge = ripple
• Blob = superposition of two ripples
Spatial selection: the magnitude of the Laplacian response will
achieve a maximum (absolute value) at the center of the blob,
provided the scale of the Laplacian is "matched" to the scale of
the blob (e.g., spatial extent).
Scale selection
• Find the characteristic scale of the blob by convolving it
with Laplacian filters at several scales and looking for
the maximum response.
• However, the Laplacian response decays as scale increases:

[Figure: original signal (radius = 8); Laplacian responses for increasing σ]

Why does this happen?
Scale normalization
• The response of a derivative of Gaussian filter to a
perfect step edge decreases as σ increases
(maximum response: 1/(σ·√(2π)))
Scale normalization (cont’d)
• To keep response the same (scale-invariant), must
multiply Gaussian derivative by σ
• Laplacian is the second Gaussian derivative, so it must
be multiplied by σ2
Effect of scale normalization
[Figure: original signal; unnormalized Laplacian response; scale-normalized Laplacian response, with a clear maximum at the matching scale]
Scale selection: case of circle
• At what scale does the Laplacian achieve a maximum
response for a binary circle of radius r?
[Figure: image of a binary circle (radius r) and its Laplacian]

Scale selection: case of circle (cont’d)

[Figure: image of the circle (radius r); Laplacian response vs. scale σ, peaking at σ = r/√2]

• The Laplacian achieves a maximum at σ = r/√2.
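A quick numerical check of this result, using SciPy's gaussian_laplace with the σ² normalization discussed above (image size and scale range are arbitrary choices):

```python
# Check: the scale-normalized Laplacian response at the center of a
# binary circle of radius r = 8 should peak near sigma = r / sqrt(2).
import numpy as np
from scipy.ndimage import gaussian_laplace

r = 8
yy, xx = np.mgrid[-64:64, -64:64]
img = (xx**2 + yy**2 <= r**2).astype(float)

sigmas = np.linspace(2, 12, 101)
responses = [abs((s**2 * gaussian_laplace(img, s))[64, 64]) for s in sigmas]
print(sigmas[int(np.argmax(responses))])  # ~ 8 / sqrt(2) = 5.66
```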
Spatial extent determination
• Spatial extent is taken equal to the neighborhood used
to localize the feature.
• Spatial extent r can be determined from the
characteristic scale: σ = r/√2, i.e., r = √2·σ.
Harris-Laplace Detector
(1) Compute interest points at multiple scales using the
Harris detector.
- Scales are chosen as follows: σn = kⁿ·σ0
- At each scale, choose local maxima assuming a 3 x 3 window
(sn = σn)
K. Mikolajczyk and C. Schmid, "Indexing Based on Scale Invariant Interest
Points", International Conference on Computer Vision, pp. 525-531, 2001.
Harris-Laplace Detector (cont’d)
(2) Select points at which a local measure (i.e., normalized
Laplacian) is maximal over scales.
(sn = σn)

K. Mikolajczyk and C. Schmid, "Indexing Based on Scale Invariant Interest
Points", International Conference on Computer Vision, pp. 525-531, 2001.
Example
• Points detected at each scale for two images that differ by a
scale factor 1.92
– Few points detected in the same location but on different scales.
– Many correspondences between levels for which scale ratio
corresponds to the real scale change between images.
Example
The size of the circles corresponds to the scale
at which the point was detected
Harris-Laplace Detector (cont’d)
• These points are invariant to:
– Scale
– Rotation
– Translation
• Robust to:
– Illumination changes
– Limited viewpoint changes
Repeatability
Efficient implementation using DoG
• LoG can be approximated by DoG
G(x, y, kσ) − G(x, y, σ) ≈ (k − 1)·σ²·∇²G
• DoG already incorporates the σ2 scale
normalization required for scale-invariance.
Efficient implementation using DoG (cont’d)
• Given a Gaussian-blurred image
L(x, y, σ) = G(x, y, σ) * I(x, y),
the result of convolving an image with a difference of Gaussians is:
D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) * I(x, y)
= L(x, y, kσ) − L(x, y, σ)
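A quick numerical check of the approximation, comparing a DoG image against (k − 1)·σ²·∇²(G * I) on a smoothed random test image (σ = 3 and k = 1.6 are arbitrary choices):

```python
# Numerical check of LoG ~ DoG: compare L(k*sigma) - L(sigma) against
# (k - 1) * sigma^2 * Laplacian(G * I).
import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_laplace

rng = np.random.default_rng(0)
img = gaussian_filter(rng.random((128, 128)), 2.0)  # smooth test image

sigma, k = 3.0, 1.6
dog = gaussian_filter(img, k * sigma) - gaussian_filter(img, sigma)
log = sigma**2 * gaussian_laplace(img, sigma)  # scale-normalized Laplacian

print(np.corrcoef(dog.ravel(), ((k - 1) * log).ravel())[0, 1])  # close to 1
```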
Scale Space Construction
• Convolve the image with Gaussian filters at different
scales.
Scale Space Construction (cont’d)
• Generate difference-of-Gaussian images from the
difference of adjacent blurred images.
Scale Space Construction (cont’d)
• The convolved images are grouped by octave (an
octave corresponds to doubling the value of σ).
– The value of k is selected so that we obtain a fixed number
of blurred images per octave.
– This also ensures that we obtain the same number of
difference-of-Gaussian images per octave.
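A sketch of one octave under these conventions, blurring directly from the base image for simplicity (Lowe blurs incrementally; σ0 = 1.6 and s = 3 follow his paper, the rest is illustrative):

```python
# Sketch of one octave: s + 3 blurred images so that s + 2 DoG layers
# cover the full octave.
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_octave(img, sigma0=1.6, s=3):
    k = 2 ** (1.0 / s)  # s intervals per octave double sigma
    blurred = [gaussian_filter(img, sigma0 * k**i) for i in range(s + 3)]
    dogs = [b2 - b1 for b1, b2 in zip(blurred, blurred[1:])]
    return blurred, dogs
```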
Scale Space Construction (cont’d)
[Figure: Gaussian-blurred images at scales σ, kσ, 2σ, 2kσ, 2k²σ, ... grouped into octaves, with the corresponding difference-of-Gaussian images]

David G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints",
IJCV, 60(2), pp. 91-110, 2004.
Harris-Laplace Detector (cont’d)
• Convolve the image with the scale-normalized Laplacian at
several scales.
• Find maxima of the normalized Laplacian response in
scale space.

Using 3 x 3 windows, compare the pixel at X to its 26 neighbors
(8 at the same scale, 9 at the scale above, and 9 at the scale below).
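A sketch combining the two steps, reusing harris_response() from the earlier Harris sketch (the scale ladder, thresholds, and the 1.4 ratio between σI and σD are illustrative; for brevity, scale maxima are checked only against adjacent scale levels rather than a full 3 x 3 x 3 cube):

```python
# Sketch of the Harris-Laplace idea: multi-scale Harris corners, kept
# only where the scale-normalized Laplacian peaks across scales.
import numpy as np
from scipy.ndimage import gaussian_laplace, maximum_filter

def harris_laplace(f, sigma0=1.0, k=1.4, n_scales=8, r_thresh=1e-4):
    f = f.astype(float)
    sigmas = [sigma0 * k**n for n in range(n_scales)]
    # scale-normalized Laplacian magnitude at every scale
    logs = np.stack([s**2 * np.abs(gaussian_laplace(f, s)) for s in sigmas])
    points = []
    for n, s in enumerate(sigmas):
        R = harris_response(f, sigma_d=s, sigma_i=1.4 * s)
        # spatial local maxima of the Harris response in a 3x3 window
        # (r_thresh depends on the image value range)
        corners = (R > r_thresh) & (R == maximum_filter(R, size=3))
        for y, x in np.argwhere(corners):
            m = logs[max(0, n - 1):min(n_scales, n + 2), y, x].max()
            if logs[n, y, x] == m and m > 0:  # maximal over scales
                points.append((y, x, s))
    return points
```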
Affine Invariance
• Similarly to characteristic scale selection, detect the
characteristic shape of the local feature.
Affine Adaptation
1. Detect initial region with Harris-Laplace
2. Estimate affine shape with M
3. Normalize the affine region to a circular one
4. Re-detect the new location and scale in the normalized
image
5. Go to step 2 if the eigenvalues of the M for the new
point are not equal [detector not yet adapted to the
characteristic shape]