Lecture 5: Image Alignment

776 Computer Vision
Jared Heinly
Spring 2014
(slides borrowed from Jan-Michael Frahm, Svetlana Lazebnik, and others)
Image alignment
Image from http://graphics.cs.cmu.edu/courses/15-463/2010_fall/
A look into the past
http://blog.flickr.net/en/2010/01/27/a-look-into-the-past/
A look into the past
• Leningrad during the blockade
http://komen-dant.livejournal.com/345684.html
Bing streetside images
http://www.bing.com/community/blogs/maps/archive/2010/01/12/new-bing-mapsapplication-streetside-photos.aspx
Image alignment: Applications
Panorama stitching
Recognition of object instances
Image alignment: Challenges
Small degree of overlap
Intensity changes
Occlusion, clutter
Image alignment
• Two families of approaches:
o Direct (pixel-based) alignment
• Search for alignment where most pixels agree
o Feature-based alignment
• Search for alignment where extracted features agree
• Can be verified using pixel-based alignment
2D transformation models
• Similarity (translation, scale, rotation)
• Affine
• Projective (homography)
Let’s start with affine transformations
• Simple fitting procedure (linear least squares)
• Approximates viewpoint changes for roughly planar
objects and roughly orthographic cameras
• Can be used to initialize fitting for more complex
models
Fitting an affine transformation
• Assume we know the correspondences, how do we
get the transformation?
• Given a correspondence $(x_i, y_i) \leftrightarrow (x_i', y_i')$:

$$\begin{bmatrix} x_i' \\ y_i' \end{bmatrix} = \begin{bmatrix} m_1 & m_2 \\ m_3 & m_4 \end{bmatrix} \begin{bmatrix} x_i \\ y_i \end{bmatrix} + \begin{bmatrix} t_1 \\ t_2 \end{bmatrix}$$

• Rewritten as a linear system in the unknown parameters:

$$\begin{bmatrix} x_i & y_i & 0 & 0 & 1 & 0 \\ 0 & 0 & x_i & y_i & 0 & 1 \end{bmatrix} \begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \\ t_1 \\ t_2 \end{bmatrix} = \begin{bmatrix} x_i' \\ y_i' \end{bmatrix}$$
Fitting an affine transformation

$$\begin{bmatrix} x_i & y_i & 0 & 0 & 1 & 0 \\ 0 & 0 & x_i & y_i & 0 & 1 \end{bmatrix} \begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \\ t_1 \\ t_2 \end{bmatrix} = \begin{bmatrix} x_i' \\ y_i' \end{bmatrix}$$
• Linear system with six unknowns
• Each match gives us two linearly independent
equations: need at least three to solve for the
transformation parameters
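To make the procedure concrete, here is a minimal NumPy sketch (mine, not from the slides) that stacks two rows per correspondence and solves for the six parameters with linear least squares; the function name and array layout are illustrative choices.

```python
# Sketch: fitting an affine transformation to known correspondences
# with linear least squares.
import numpy as np

def fit_affine(src, dst):
    """Estimate m1..m4, t1, t2 mapping src -> dst.

    src, dst: (n, 2) arrays of corresponding points, n >= 3.
    Returns (A, t) with A a 2x2 matrix and t a 2-vector.
    """
    n = src.shape[0]
    M = np.zeros((2 * n, 6))
    b = np.zeros(2 * n)
    for i, ((x, y), (xp, yp)) in enumerate(zip(src, dst)):
        M[2 * i]     = [x, y, 0, 0, 1, 0]   # row for x'
        M[2 * i + 1] = [0, 0, x, y, 0, 1]   # row for y'
        b[2 * i], b[2 * i + 1] = xp, yp
    p, *_ = np.linalg.lstsq(M, b, rcond=None)  # least-squares solution
    A = p[:4].reshape(2, 2)
    t = p[4:]
    return A, t
```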
Fitting a plane projective transformation
• Homography: plane projective transformation
(transformation taking a quad to another arbitrary
quad)
Homography
• The transformation between two views of a planar
surface
• The transformation between images from two
cameras that share the same center
Application: Panorama stitching
Source: Hartley & Zisserman
Fitting a homography
• Recall: homogeneous coordinates
Converting to homogeneous image coordinates: $(x, y) \Rightarrow (x, y, 1)$
Converting from homogeneous image coordinates: $(x, y, w) \Rightarrow (x/w, y/w)$
Fitting a homography
• Recall: homogeneous coordinates
Converting to homogeneous image coordinates: $(x, y) \Rightarrow (x, y, 1)$
Converting from homogeneous image coordinates: $(x, y, w) \Rightarrow (x/w, y/w)$
• Equation for homography:

$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} \cong \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
Fitting a homography
• Equation for homography (equality only up to an unknown scale $\lambda$):

$$\lambda \begin{bmatrix} x_i' \\ y_i' \\ 1 \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} \begin{bmatrix} x_i \\ y_i \\ 1 \end{bmatrix}$$

• Writing the rows of $H$ as $\mathbf{h}_1^T, \mathbf{h}_2^T, \mathbf{h}_3^T$: $\mathbf{x}_i' \cong H\mathbf{x}_i$. The two sides are scaled versions of the same vector, so their cross product vanishes: $\mathbf{x}_i' \times H\mathbf{x}_i = \mathbf{0}$

$$\mathbf{x}_i' \times H\mathbf{x}_i = \begin{bmatrix} x_i' \\ y_i' \\ 1 \end{bmatrix} \times \begin{bmatrix} \mathbf{h}_1^T \mathbf{x}_i \\ \mathbf{h}_2^T \mathbf{x}_i \\ \mathbf{h}_3^T \mathbf{x}_i \end{bmatrix} = \begin{bmatrix} y_i'\,\mathbf{h}_3^T \mathbf{x}_i - \mathbf{h}_2^T \mathbf{x}_i \\ \mathbf{h}_1^T \mathbf{x}_i - x_i'\,\mathbf{h}_3^T \mathbf{x}_i \\ x_i'\,\mathbf{h}_2^T \mathbf{x}_i - y_i'\,\mathbf{h}_1^T \mathbf{x}_i \end{bmatrix}$$

• This gives 3 equations, of which only 2 are linearly independent. In matrix form, with the rows of $H$ stacked into a 9x1 column vector (each block below is a 3-vector):

$$\begin{bmatrix} \mathbf{0}^T & -\mathbf{x}_i^T & y_i'\,\mathbf{x}_i^T \\ \mathbf{x}_i^T & \mathbf{0}^T & -x_i'\,\mathbf{x}_i^T \\ -y_i'\,\mathbf{x}_i^T & x_i'\,\mathbf{x}_i^T & \mathbf{0}^T \end{bmatrix} \begin{bmatrix} \mathbf{h}_1 \\ \mathbf{h}_2 \\ \mathbf{h}_3 \end{bmatrix} = \mathbf{0}$$
Direct linear transform
• Stacking the two independent equations from each of the $n$ correspondences gives a $2n \times 9$ system $A\mathbf{h} = \mathbf{0}$:

$$\begin{bmatrix} \mathbf{0}^T & -\mathbf{x}_1^T & y_1'\,\mathbf{x}_1^T \\ \mathbf{x}_1^T & \mathbf{0}^T & -x_1'\,\mathbf{x}_1^T \\ & \vdots & \\ \mathbf{0}^T & -\mathbf{x}_n^T & y_n'\,\mathbf{x}_n^T \\ \mathbf{x}_n^T & \mathbf{0}^T & -x_n'\,\mathbf{x}_n^T \end{bmatrix} \begin{bmatrix} \mathbf{h}_1 \\ \mathbf{h}_2 \\ \mathbf{h}_3 \end{bmatrix} = \mathbf{0}$$
• H has 8 degrees of freedom (9 parameters, but
scale is arbitrary)
• One match gives us two linearly independent
equations
• Four matches needed for a minimal solution (null
space of 8x9 matrix)
• More than four: homogeneous least squares
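As an illustration (not from the slides), a minimal DLT sketch that builds the 2n x 9 matrix A and takes the smallest right singular vector as the homogeneous least-squares solution might look like this; point normalization, usually recommended in practice, is omitted for brevity.

```python
# Minimal DLT sketch: estimate a homography from n >= 4 correspondences
# by homogeneous least squares (smallest right singular vector of A).
import numpy as np

def fit_homography(src, dst):
    """src, dst: (n, 2) arrays of matched points. Returns 3x3 H (up to scale)."""
    rows = []
    for (x, y), (xp, yp) in zip(src, dst):
        X = np.array([x, y, 1.0])
        rows.append(np.concatenate([np.zeros(3), -X, yp * X]))  # 0^T, -x^T, y' x^T
        rows.append(np.concatenate([X, np.zeros(3), -xp * X]))  # x^T, 0^T, -x' x^T
    A = np.array(rows)                       # shape (2n, 9)
    _, _, Vt = np.linalg.svd(A)
    h = Vt[-1]                               # null space / least-squares solution
    return h.reshape(3, 3)
```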
Robust feature-based alignment
• So far, we’ve assumed that we are given a set of
“ground-truth” correspondences between the two
images we want to align
• What if we don’t know the correspondences?
[Figure: corresponding points $(x_i, y_i)$ and $(x_i', y_i')$ in the two images]
Robust feature-based alignment
• So far, we’ve assumed that we are given a set of
“ground-truth” correspondences between the two
images we want to align
• What if we don’t know the correspondences?
Robust feature-based alignment
Robust feature-based alignment
• Extract features
Robust feature-based alignment
• Extract features
• Compute putative matches
Alignment as fitting
• Least Squares
• Total Least Squares
Least Squares Line Fitting
• Data: $(x_1, y_1), \dots, (x_n, y_n)$
• Line equation: $y_i = m x_i + b$
• Find $(m, b)$ to minimize
$$E = \sum_{i=1}^{n} (y_i - m x_i - b)^2$$
Problem with “Vertical” Least Squares
• Not rotation-invariant
• Fails completely for vertical lines
Total Least Squares
• Distance between point $(x_i, y_i)$ and line $ax + by = d$ (with $a^2 + b^2 = 1$): $|a x_i + b y_i - d|$
• Find $(a, b, d)$ to minimize the sum of squared perpendicular distances
$$E = \sum_{i=1}^{n} (a x_i + b y_i - d)^2$$
The line $ax + by = d$ has unit normal $N = (a, b)$.
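As a sketch (my own, not from the slides), total least squares for a line has a closed-form solution: $d = a\bar{x} + b\bar{y}$, and $(a, b)$ is the right singular vector of the centered data with the smallest singular value.

```python
# Total least squares line fit (sketch): minimizes perpendicular distances.
import numpy as np

def fit_line_tls(pts):
    """pts: (n, 2) array. Returns (a, b, d) with a*x + b*y = d and a^2 + b^2 = 1."""
    centroid = pts.mean(axis=0)
    U = pts - centroid                       # centered data
    # Unit normal = right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(U)
    a, b = Vt[-1]
    d = a * centroid[0] + b * centroid[1]    # line passes through the centroid
    return a, b, d
```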
Least Squares: Robustness to Noise
• Least squares fit to the red points:
Least Squares: Robustness to Noise
• Least squares fit with an outlier:
Problem: squared error heavily penalizes outliers
Robust Estimators
• General approach: find model parameters θ that minimize
$$\sum_i \rho\big(r_i(\mathbf{x}_i, \theta);\ \sigma\big)$$
where $r_i(\mathbf{x}_i, \theta)$ is the residual of the $i$th point w.r.t. model parameters θ, and ρ is a robust function with scale parameter σ.
The robust function
ρ behaves like
squared distance for
small values of the
residual u but
saturates for larger
values of u
Choosing the scale: Just right
The effect of the outlier is minimized
Choosing the scale: Too small
The error value is almost the same for every
point and the fit is very poor
Choosing the scale: Too large
Behaves much the same as least squares
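A small sketch of such a robust objective follows (mine, not from the slides). The particular robust function plotted in the slides is not reproduced here; this assumes a Geman-McClure-style ρ(u; σ) = u²/(σ² + u²), which matches the described behavior: quadratic for small residuals, saturating for large ones.

```python
# Sketch of a robust objective for line fitting. rho(u; sigma) is an assumed
# Geman-McClure-style robust function, not necessarily the one in the slides.
import numpy as np

def rho(u, sigma):
    return u**2 / (sigma**2 + u**2)          # ~u^2 for small u, saturates at 1

def robust_line_error(params, pts, sigma):
    """params = (a, b, d) with a^2 + b^2 = 1; pts: (n, 2) array of points."""
    a, b, d = params
    residuals = pts @ np.array([a, b]) - d   # signed perpendicular distances
    return rho(residuals, sigma).sum()
```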
RANSAC
• Robust fitting can deal with a few outliers – what
if we have very many?
• Random sample consensus (RANSAC):
Very general framework for model fitting in the
presence of outliers
• Outline
o Choose a small subset of points uniformly at random
o Fit a model to that subset
o Find all remaining points that are “close” to the model and reject the
rest as outliers
o Do this many times and choose the best model
M. A. Fischler, R. C. Bolles. Random Sample Consensus: A Paradigm for Model
Fitting with Applications to Image Analysis and Automated Cartography. Comm. of
the ACM, Vol 24, pp 381-395, 1981.
RANSAC for line fitting example
Source: R. Raguram
RANSAC for line fitting example
Least-squares fit
Source: R. Raguram
RANSAC for line fitting example
1. Randomly select
minimal subset
of points
Source: R. Raguram
RANSAC for line fitting example
1. Randomly select
minimal subset
of points
2. Hypothesize a
model
Source: R. Raguram
RANSAC for line fitting example
1. Randomly select
minimal subset
of points
2. Hypothesize a
model
3. Compute error
function
Source: R. Raguram
RANSAC for line fitting example
1. Randomly select
minimal subset
of points
2. Hypothesize a
model
3. Compute error
function
4. Select points
consistent with
model
Source: R. Raguram
RANSAC for line fitting example
1. Randomly select
minimal subset
of points
2. Hypothesize a
model
3. Compute error
function
4. Select points
consistent with
model
5. Repeat
hypothesize-and-verify loop
Source: R. Raguram
RANSAC for line fitting example
1. Randomly select
minimal subset
of points
2. Hypothesize a
model
3. Compute error
function
4. Select points
consistent with
model
5. Repeat
hypothesize-and-verify loop
Source: R. Raguram
RANSAC for line fitting example
Uncontaminated sample
1. Randomly select
minimal subset
of points
2. Hypothesize a
model
3. Compute error
function
4. Select points
consistent with
model
5. Repeat
hypothesize-and-verify loop
Source: R. Raguram
RANSAC for line fitting example
1. Randomly select
minimal subset
of points
2. Hypothesize a
model
3. Compute error
function
4. Select points
consistent with
model
5. Repeat
hypothesize-and-verify loop
Source: R. Raguram
RANSAC for line fitting
• Repeat N times:
o Draw s points uniformly at random
o Fit line to these s points
o Find inliers to this line among the remaining points (i.e., points whose distance from the line is less than t)
o If there are d or more inliers, accept the line and refit using all inliers
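A minimal sketch of this loop (mine, not from the slides), reusing the fit_line_tls helper sketched earlier; the parameter names N, s = 2, t, d follow the slides.

```python
# RANSAC line fitting sketch, following the outline above.
import numpy as np

def ransac_line(pts, N=100, t=2.0, d=10, rng=None):
    """pts: (n, 2) array. Returns (best_line, inlier_mask)."""
    rng = np.random.default_rng() if rng is None else rng
    best_line, best_inliers = None, np.zeros(len(pts), dtype=bool)
    for _ in range(N):
        sample = pts[rng.choice(len(pts), size=2, replace=False)]  # s = 2
        a, b, off = fit_line_tls(sample)                # hypothesize a model
        dist = np.abs(pts @ np.array([a, b]) - off)     # point-to-line distances
        inliers = dist < t                              # consensus set
        if inliers.sum() >= d and inliers.sum() > best_inliers.sum():
            best_line = fit_line_tls(pts[inliers])      # refit on all inliers
            best_inliers = inliers
    return best_line, best_inliers
```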
Choosing the parameters
• Initial number of points s
o Typically minimum number needed to fit the model
• Distance threshold t
o Choose t so probability for inlier is p (e.g. 0.95)
o Zero-mean Gaussian noise with std. dev. σ: t2=3.84σ2
• Number of samples N
o Choose N so that, with probability p, at least one random
sample is free from outliers (e.g. p=0.99) (outlier ratio: e)
Source: M. Pollefeys
Choosing the parameters
• Initial number of points s
• Typically minimum number needed to fit the model
• Distance threshold t
• Choose t so probability for inlier is p (e.g. 0.95)
• Zero-mean Gaussian noise with std. dev. σ: t2=3.84σ2
• Number of samples N
• Choose N so that, with probability p, at least one random
sample is free from outliers (e.g. p=0.99) (outlier ratio: e)
1  1  e 
s N
proportion of outliers e
 1 p

N  log 1  p  / log 1  1  e 
s

s
2
3
4
5
6
7
8
5%
2
3
3
4
4
4
5
10%
3
4
5
6
7
8
9
20% 25% 30% 40% 50%
5
6
7
11
17
7
9
11
19
35
9
13
17
34
72
12
17
26
57
146
16
24
37
97
293
20
33
54
163 588
26
44
78
272 1177
Source: M. Pollefeys
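The formula above can be evaluated directly; a small sketch (mine) that rounds up and reproduces entries of the table, assuming 0 < e < 1:

```python
# Sketch: number of RANSAC samples N needed so that, with probability p,
# at least one sample of size s is outlier-free, given outlier ratio e.
import math

def num_samples(s, e, p=0.99):
    # assumes 0 < e < 1
    return math.ceil(math.log(1 - p) / math.log(1 - (1 - e) ** s))

# e.g. num_samples(s=4, e=0.5) -> 72, matching the table above
```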
Adaptively determining the number of samples
• The outlier ratio e is often unknown a priori, so pick the worst case, e.g. 50%, and adapt if more inliers are found; e.g. 80% inliers would yield e=0.2
• Adaptive procedure:
o N=∞, sample_count =0
o While N >sample_count
• Choose a sample and count the number of inliers
• If inlier ratio is highest of any found so far, set
e = 1 – (number of inliers)/(total number of points)
• Recompute N from e:
$$N = \log(1 - p) \,/\, \log\!\big(1 - (1 - e)^s\big)$$
• Increment the sample_count by 1
Source: M. Pollefeys
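A sketch of this adaptive procedure (mine, not from the slides); sample_and_count is a hypothetical helper that draws a minimal sample, fits a model, and returns its inlier count.

```python
# Sketch of the adaptive stopping criterion described above.
import math

def adaptive_ransac_iterations(sample_and_count, n_points, s, p=0.99, max_iters=10000):
    """Runs hypothesize-and-verify rounds until N (recomputed from the best
    inlier ratio found so far) is reached. Returns the number of rounds used."""
    N, sample_count, best_inliers = float("inf"), 0, 0
    while sample_count < min(N, max_iters):
        inliers = sample_and_count()
        if inliers > best_inliers:                      # best ratio so far
            best_inliers = inliers
            e = 1 - inliers / n_points                  # updated outlier ratio
            denom = 1 - (1 - e) ** s
            N = 1 if denom <= 0 else math.log(1 - p) / math.log(denom)
        sample_count += 1
    return sample_count
```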
RANSAC pros and cons
• Pros
o Simple and general
o Applicable to many different problems
o Often works well in practice
• Cons
o Lots of parameters to tune
o Doesn’t work well for low inlier ratios (too many iterations,
or can fail completely)
o Can’t always get a good initialization
of the model based on the minimum
number of samples
Alignment as fitting
• Previous lectures: fitting a model to features in one
image
Find model $M$ that minimizes $\sum_i \mathrm{residual}(\mathbf{x}_i, M)$
Alignment as fitting
• Previous lectures: fitting a model to features in one
image
Find model $M$ that minimizes $\sum_i \mathrm{residual}(\mathbf{x}_i, M)$
• Alignment: fitting a model to a transformation
between pairs of features (matches) in two images
Find transformation $T$ that minimizes $\sum_i \mathrm{residual}(T(\mathbf{x}_i), \mathbf{x}_i')$
Robust feature-based alignment
• Extract features
• Compute putative matches
• Loop:
o Hypothesize transformation T
Robust feature-based alignment
• Extract features
• Compute putative matches
• Loop:
o Hypothesize transformation T
o Verify transformation (search for other matches consistent with T)
Robust feature-based alignment
• Extract features
• Compute putative matches
• Loop:
o Hypothesize transformation T
o Verify transformation (search for other matches consistent with T)
RANSAC
• The set of putative matches contains a very high percentage of outliers
• RANSAC loop:
1. Randomly select a seed group of matches
2. Compute transformation from seed group
3. Find inliers to this transformation
4. If the number of inliers is sufficiently large, recompute least-squares estimate of transformation on all of the inliers
• Keep the transformation with the largest number of inliers
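A sketch of this loop for image alignment (mine, not from the slides). fit_model could be the fit_affine (seed size 3) or fit_homography (seed size 4) helpers sketched earlier, and residual is a hypothetical user-supplied function returning per-match errors under the hypothesized transformation.

```python
# RANSAC sketch for estimating a transformation between two sets of matches.
import numpy as np

def ransac_alignment(src, dst, fit_model, residual, seed_size, t=3.0,
                     num_iters=1000, rng=None):
    """src, dst: (n, 2) arrays of putative matches. Returns (model, inlier_count)."""
    rng = np.random.default_rng() if rng is None else rng
    best_model, best_count = None, 0
    for _ in range(num_iters):
        idx = rng.choice(len(src), size=seed_size, replace=False)   # seed group
        model = fit_model(src[idx], dst[idx])          # hypothesize transformation
        inliers = residual(model, src, dst) < t        # matches consistent with it
        if inliers.sum() > best_count:                 # keep the best consensus
            best_count = int(inliers.sum())
            best_model = fit_model(src[inliers], dst[inliers])  # refit on inliers
    return best_model, best_count
```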
RANSAC example: Translation
Putative matches
RANSAC example: Translation
Select one match, count inliers
RANSAC example: Translation
Select one match, count inliers
RANSAC example: Translation
Select translation with the most inliers
Summary
• Detect feature points (to be covered later)
• Describe feature points (to be covered later)
• Match feature points (to be covered later)
• RANSAC
o Transformation model (affine, homography, etc.)
Advice
• Use the minimum model necessary to define the
transformation on your data
o Prefer affine over homography
o Prefer homography over a 3D transform (not discussed yet)
o Make selection based on your data
• Fewer RANSAC iterations
• Fewer degrees of freedom → less ambiguity or sensitivity to noise
Generating putative correspondences
Generating putative correspondences
[Figure: is the feature descriptor of a patch in one image equal to the feature descriptor of a patch in the other image?]
• Need to compare feature descriptors of local
patches surrounding interest points
Feature descriptors
• Simplest descriptor: vector of raw intensity values
• How to compare two such vectors?
o Sum of squared differences (SSD)
$$\mathrm{SSD}(u, v) = \sum_i (u_i - v_i)^2$$
• Not invariant to intensity change
o Normalized correlation
$$\rho(u, v) = \frac{\sum_i (u_i - \bar{u})(v_i - \bar{v})}{\sqrt{\sum_j (u_j - \bar{u})^2}\;\sqrt{\sum_j (v_j - \bar{v})^2}}$$
• Invariant to affine intensity change
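For concreteness, a small sketch (mine, not from the slides) of both comparisons on raw intensity vectors:

```python
# Sketch: comparing two raw-intensity descriptor vectors with SSD and
# normalized correlation, as defined above.
import numpy as np

def ssd(u, v):
    return np.sum((u - v) ** 2)

def normalized_correlation(u, v):
    uc, vc = u - u.mean(), v - v.mean()      # remove mean (bias invariance)
    denom = np.linalg.norm(uc) * np.linalg.norm(vc) + 1e-12  # avoid divide-by-zero
    return np.dot(uc, vc) / denom
```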
Disadvantage of intensity vectors as descriptors
• Small deformations can affect the
matching score a lot
Feature descriptors: SIFT
• Descriptor computation:
o Divide patch into 4x4 sub-patches
o Compute histogram of gradient orientations (8 reference angles) inside
each sub-patch
o Resulting descriptor: 4x4x8 = 128 dimensions
David G. Lowe. "Distinctive image features from scale-invariant keypoints.” IJCV 60
(2), pp. 91-110, 2004.
Feature descriptors: SIFT
• Descriptor computation:
o Divide patch into 4x4 sub-patches
o Compute histogram of gradient orientations (8 reference angles) inside
each sub-patch
o Resulting descriptor: 4x4x8 = 128 dimensions
• Advantage over raw vectors of pixel values
o Gradients less sensitive to illumination change
o Pooling of gradients over the sub-patches achieves robustness to small
shifts, but still preserves some spatial information
David G. Lowe. "Distinctive image features from scale-invariant keypoints.” IJCV 60
(2), pp. 91-110, 2004.
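As a rough illustration only (not Lowe's actual implementation): a heavily simplified SIFT-like descriptor for a 16x16 patch. The real SIFT descriptor additionally uses Gaussian weighting, soft (trilinear) binning, rotation to a dominant orientation, and clamping, all omitted here.

```python
# Simplified sketch of a SIFT-like descriptor: 4x4 sub-patches, 8-bin gradient
# orientation histogram per sub-patch -> 4*4*8 = 128-dimensional vector.
import numpy as np

def sift_like_descriptor(patch):
    """patch: (16, 16) grayscale array. Returns a normalized 128-D descriptor."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)                               # gradient magnitude
    ori = np.arctan2(gy, gx) % (2 * np.pi)               # orientation in [0, 2*pi)
    desc = []
    for i in range(0, 16, 4):
        for j in range(0, 16, 4):                        # 4x4 grid of sub-patches
            o = ori[i:i+4, j:j+4].ravel()
            m = mag[i:i+4, j:j+4].ravel()
            hist, _ = np.histogram(o, bins=8, range=(0, 2 * np.pi), weights=m)
            desc.append(hist)                            # 8 orientation bins
    desc = np.concatenate(desc)                          # 128 dimensions
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc
```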
Feature matching
• Generating putative matches: for each patch in
one image, find a short list of patches in the other
image that could match it based solely on
appearance
Feature space outlier rejection
• How can we tell which putative matches are more
reliable?
• Heuristic: compare distance of nearest neighbor to
that of second nearest neighbor
o Ratio of closest distance to second-closest distance will be high
for features that are not distinctive
Threshold of 0.8 provides
good separation
David G. Lowe. "Distinctive image features from scale-invariant keypoints.” IJCV 60
(2), pp. 91-110, 2004.
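A small sketch (mine, not from the slides) of this ratio test over two sets of descriptors; the 0.8 default matches the threshold quoted above.

```python
# Sketch of the nearest-neighbor distance ratio test for rejecting
# ambiguous putative matches.
import numpy as np

def ratio_test_matches(desc1, desc2, ratio=0.8):
    """desc1: (n1, d), desc2: (n2, d) descriptor arrays. Returns (i, j) pairs."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)    # distances to all candidates
        j1, j2 = np.argsort(dists)[:2]               # nearest and second nearest
        if dists[j1] < ratio * dists[j2]:            # distinctive enough?
            matches.append((i, j1))
    return matches
```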