Stereo Matching Motion and Optic Flow

CS 4495 Computer Vision – A. Bobick
Stereo Matching
Motion and Optic Flow
Fundamental matrix
Let p be a point in the left image and p’ the corresponding point in the right image; l and l’ are the epipolar lines in the left and right images
Epipolar relation
• p maps to epipolar line l’
• p’ maps to epipolar line l
Epipolar mapping described by a 3x3 matrix F: $l' = F p$ and $l = F^T p'$
It follows that $p'^T F p = 0$
Fundamental matrix
This matrix F is called
• the “Essential Matrix”
– when image intrinsic parameters are known
• the “Fundamental Matrix”
– more generally (uncalibrated case)
Can solve for F from point correspondences
• Each (p, p’) pair gives one linear equation in entries of F
• F has 9 entries, but only 8 degrees of freedom up to scale, and 7 once the rank-2 constraint is imposed
• With 8 points it is simple to solve for F, but it is also possible with 7. See Marc Pollefeys’s notes for a nice tutorial
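The linear solve described above can be sketched as follows. This is a minimal normalized eight-point sketch in NumPy, assuming noise-free correspondences and the convention $p'^T F p = 0$; the function names, the synthetic cameras, and the normalization step are illustrative choices, not taken from the lecture.

```python
import numpy as np

def normalize(pts):
    """Shift points to zero mean and scale them to mean distance sqrt(2)."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2) / np.linalg.norm(pts - mean, axis=1).mean()
    T = np.array([[scale, 0.0, -scale * mean[0]],
                  [0.0, scale, -scale * mean[1]],
                  [0.0, 0.0, 1.0]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return (T @ homog.T).T, T

def eight_point(p, p_prime):
    """Estimate F (with p'^T F p = 0) from N >= 8 correspondences (N x 2 arrays)."""
    x, T = normalize(p)
    xp, Tp = normalize(p_prime)
    # Each correspondence gives one linear equation in the 9 entries of F (row-major).
    A = np.column_stack([xp[:, [0]] * x, xp[:, [1]] * x, x])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
    # Undo the normalization and fix the arbitrary scale.
    F = Tp.T @ F @ T
    return F / np.linalg.norm(F)

# Synthetic check with two hypothetical cameras (all values are made up).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (20, 3)) + np.array([0.0, 0.0, 5.0])
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])

def project(X, R, t):
    x = (K @ (R @ X.T + t[:, None])).T
    return x[:, :2] / x[:, 2:]

theta = 0.1
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
p = project(X, np.eye(3), np.zeros(3))
p_prime = project(X, R, np.array([1.0, 0.1, 0.0]))
F = eight_point(p, p_prime)

def to_homog(q):
    return np.column_stack([q, np.ones(len(q))])

residuals = np.abs(np.einsum("ni,ij,nj->n", to_homog(p_prime), F, to_homog(p)))
print("max |p'^T F p| =", residuals.max())
```

With real, noisy matches a robust wrapper (for example RANSAC) would normally sit around a solver like this.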
Stereo image rectification
• Reproject image planes
onto a common plane
parallel to the line
between camera centers
• Pixel motion is horizontal
after this transformation
• Two homographies (3x3
transform), one for each
input image reprojection
C. Loop and Z. Zhang. Computing Rectifying Homographies for Stereo Vision. IEEE Conf. Computer Vision and Pattern Recognition, 1999.
Rectification example
The correspondence problem
• Epipolar geometry constrains our search, but
we still have a difficult correspondence
problem.
Correspondence problem
Multiple match hypotheses satisfy the epipolar constraint, but which is correct?
Figure from Gee & Cipolla 1999
Correspondence problem
• Beyond the hard constraint of epipolar geometry, there
are “soft” constraints to help identify corresponding points
• Similarity
• Uniqueness
• Ordering
• Disparity gradient
• To find matches in the image pair, we will assume
• Most scene points visible from both views
• Image regions for the matches are similar in appearance
Dense correspondence search
For each epipolar line
For each pixel / window in the left image
• compare with every pixel / window on same epipolar line
in right image
• pick position with minimum match cost (e.g., SSD,
normalized correlation)
Adapted from Li Zhang
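The search loop above can be sketched as a winner-take-all block matcher. This is a minimal sketch assuming rectified grayscale images stored as 2D float arrays, and assuming a point at column c in the left image appears at column c - d in the right image; `ssd_disparity`, `half`, and `max_disp` are hypothetical names.

```python
import numpy as np

def ssd_disparity(left, right, half=2, max_disp=16):
    """Winner-take-all disparity map using SSD over (2*half+1)^2 windows."""
    h, w = left.shape
    disp = np.zeros((h, w), dtype=int)
    for r in range(half, h - half):
        for c in range(half, w - half):
            ref = left[r - half:r + half + 1, c - half:c + half + 1]
            best_cost, best_d = np.inf, 0
            # Candidate windows lie on the same scanline, shifted left by d.
            for d in range(0, min(max_disp, c - half) + 1):
                cand = right[r - half:r + half + 1,
                             c - d - half:c - d + half + 1]
                cost = np.sum((ref - cand) ** 2)
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[r, c] = best_d
    return disp
```

The triple loop is written for readability; it is far too slow for full-size images.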
Correspondence search with similarity constraint
[Figure: left and right images with a scanline; matching cost plotted against disparity]
• Slide a window along the right scanline and compare
contents of that window with the reference window in
the left image
• Matching cost: SSD or normalized correlation
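The two matching costs named above can be compared on a single pair of windows. This sketch assumes the zero-mean form of normalized correlation, which is one common variant and may differ from the one used in the lecture; note that SSD is a cost to minimize while normalized correlation is a similarity to maximize.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized windows."""
    return float(np.sum((a - b) ** 2))

def normalized_correlation(a, b, eps=1e-12):
    """Zero-mean normalized correlation in [-1, 1]; eps guards flat windows."""
    a0 = a - a.mean()
    b0 = b - b.mean()
    return float(np.sum(a0 * b0) / (np.linalg.norm(a0) * np.linalg.norm(b0) + eps))

# A window and a copy with a gain and offset change (hypothetical exposure difference).
rng = np.random.default_rng(2)
window = rng.uniform(0, 1, (5, 5))
brighter = 2.0 * window + 10.0
print("SSD:", ssd(window, brighter))
print("normalized correlation:", normalized_correlation(window, brighter))
```

The example illustrates why normalized correlation tolerates gain and offset changes between the two cameras while SSD does not.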
[Figures: matching cost along the right scanline for the same reference window, computed with SSD and with normalized correlation]
Correspondence problem
[Figure: intensity profiles along corresponding scanlines]
Neighborhoods of corresponding points are similar in intensity patterns.
Source: Andrew Zisserman
Correlation-based window matching
Textureless regions are non-distinct; high ambiguity for matches.
Effect of window size
Source: Andrew Zisserman
[Figure: disparity maps computed with window sizes W = 3 and W = 20]
Want window large enough to have sufficient intensity
variation, yet small enough to contain only pixels with
about the same disparity.
Figures from Li Zhang
Results with window search
[Figure: window-based matching (best window size) compared with ground truth]
Better solutions
• Beyond individual correspondences to estimate
disparities:
• Optimize correspondence assignments jointly
• Scanline at a time (DP)
• Full 2D grid (graph cuts)
Scanline stereo
• Try to coherently match pixels on the entire scanline
• Different scanlines are still optimized independently
[Figure: intensity profiles of the same scanline in the left image and the right image]
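Coherent matching along one scanline can be sketched as a shortest-path (Viterbi-style) dynamic program. This is a simplified sketch, not necessarily the formulation used in the lecture: it assumes a per-pixel squared-difference data cost, a linear penalty `lam * |d - d'|` between neighboring pixels, and no explicit occlusion handling. All names and parameter values are hypothetical.

```python
import numpy as np

def scanline_dp(left_row, right_row, max_disp=8, lam=0.1):
    """Jointly pick one disparity per pixel of a scanline by dynamic programming."""
    n = len(left_row)
    D = max_disp + 1
    # Data cost: left pixel c is compared with right pixel c - d.
    # Disparities that fall off the image keep an infinite cost.
    cost = np.full((n, D), np.inf)
    for d in range(D):
        cost[d:, d] = (left_row[d:] - right_row[:n - d]) ** 2
    dd = np.arange(D)
    pair = lam * np.abs(dd[:, None] - dd[None, :])  # pair[d_prev, d]
    total = cost[0].copy()
    back = np.zeros((n, D), dtype=int)
    for i in range(1, n):
        cand = total[:, None] + pair          # cost of arriving at d from d_prev
        back[i] = np.argmin(cand, axis=0)     # best predecessor for each d
        total = cand[back[i], dd] + cost[i]
    # Backtrack from the cheapest final state.
    disp = np.zeros(n, dtype=int)
    disp[-1] = int(np.argmin(total))
    for i in range(n - 1, 0, -1):
        disp[i - 1] = back[i, disp[i]]
    return disp
```

Because each scanline is solved independently, nothing ties a row's solution to its neighbors above and below.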
Coherent stereo on 2D grid
• Scanline stereo generates streaking artifacts
• Can’t use dynamic programming to find spatially coherent disparities/correspondences on a 2D grid
Stereo as energy minimization
• What defines a good stereo correspondence?
1. Match quality
• Want each pixel to find a good match in the other image
2. Smoothness
• If two pixels are adjacent, they should (usually) move about the same amount
Stereo matching as energy minimization
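[Figure: window $W_1(i)$ in image $I_1$ matched against window $W_2(i + D(i))$ in image $I_2$, with disparity map $D$]
$$E = \alpha\, E_{data}(I_1, I_2, D) + \beta\, E_{smooth}(D)$$
$$E_{data} = \sum_i \left( W_1(i) - W_2(i + D(i)) \right)^2$$
$$E_{smooth} = \sum_{\text{neighbors } i,j} \rho\left( D(i) - D(j) \right)$$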
• Energy functions of this form can be minimized using
graph cuts
Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 2001
Source: Steve Seitz
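To make the energy concrete, here is a small sketch that only evaluates it for a given disparity map; it does not minimize it, and it is not a graph-cut implementation. It assumes 1x1 windows for the data term, an absolute-difference penalty on 4-connected neighbors for the smoothness term, and hypothetical weights `alpha` and `beta`.

```python
import numpy as np

def stereo_energy(I1, I2, D, alpha=1.0, beta=1.0):
    """Evaluate data + smoothness energy of an integer disparity map D."""
    h, w = I1.shape
    rows, cols = np.indices((h, w))
    # Pixel i in I1 is compared with pixel i + D(i) on the same row of I2;
    # columns are clipped at the image border (an assumption of this sketch).
    cols2 = np.clip(cols + D, 0, w - 1)
    e_data = np.sum((I1 - I2[rows, cols2]) ** 2)
    # Smoothness: absolute disparity differences between vertical and
    # horizontal neighbors.
    e_smooth = np.sum(np.abs(np.diff(D, axis=0))) + np.sum(np.abs(np.diff(D, axis=1)))
    return alpha * e_data + beta * e_smooth
```

An optimizer such as graph cuts searches over D for a low value of this quantity; comparing the energy of two candidate maps is a quick sanity check on such a search.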
Better results…
[Figure: state-of-the-art method compared with ground truth]
Boykov et al., Fast Approximate Energy Minimization via Graph Cuts,
International Conference on Computer Vision, September 1999.
For the latest and greatest: http://www.middlebury.edu/stereo/
Challenges
• Low-contrast, textureless image regions
• Occlusions
• Violations of brightness constancy (e.g., specular
reflections)
• Really large baselines (foreshortening and appearance
change)
• Camera calibration errors
Active stereo with structured light
• Project “structured” light patterns onto the object
• Simplifies the correspondence problem
• Allows us to use only one camera
[Figure: camera and projector setup]
L. Zhang, B. Curless, and S. M. Seitz. Rapid Shape Acquisition Using Color Structured
Light and Multi-pass Dynamic Programming. 3DPVT 2002
Kinect: Structured infrared light
http://bbzippo.wordpress.com/2010/11/28/kinect-in-infrared/
Summary
• Epipolar geometry
– Epipoles are intersection of baseline with image planes
– Matching point in second image is on a line passing
through its epipole
– Fundamental matrix maps from a point in one image to a
line (its epipolar line) in the other
– Can solve for F given corresponding points (e.g., interest
points)
• Stereo depth estimation
– Estimate disparity by finding corresponding points along
scanlines
– Depth is inversely proportional to disparity
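The inverse relation between depth and disparity can be shown with the standard rectified-stereo formula $Z = f B / d$. The focal length `f` (in pixels) and baseline `B` (in meters) below are made-up values for illustration only.

```python
import numpy as np

def depth_from_disparity(disparity, focal_px, baseline_m):
    """Depth Z = f * B / d for a rectified stereo pair; d must be nonzero."""
    return focal_px * baseline_m / np.asarray(disparity, dtype=float)

# Hypothetical camera: f = 700 pixels, B = 0.1 m.
disparities = np.array([70.0, 35.0, 7.0])
depths = depth_from_disparity(disparities, focal_px=700.0, baseline_m=0.1)
print(depths)  # halving the disparity doubles the depth
```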
Computer Vision
Motion and Optic Flow
Many slides adapted from S. Seitz, R. Szeliski, M. Pollefeys, K. Grauman and others…
Video
• A video is a sequence of frames captured over time
• Now our image data is a function of space
(x, y) and time (t)
Motion Applications: Segmentation of video
• Background subtraction
• A static camera is observing a scene
• Goal: separate the static background from the moving foreground
• Shot boundary detection
• Commercial video is usually composed of shots or sequences
showing the same objects or scene
• Goal: segment video into shots for summarization and browsing
(each shot can be represented by a single keyframe in a user
interface)
• Difference from background subtraction: the camera is not
necessarily stationary
• Motion segmentation
• Segment the video into multiple coherently moving objects
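Background subtraction, the first application in the list above, can be sketched in a few lines. This sketch assumes a static camera, grayscale frames stacked in a (T, H, W) float array, a per-pixel temporal median as the background model, and a hypothetical fixed threshold; practical systems usually use adaptive models.

```python
import numpy as np

def foreground_masks(frames, threshold=0.1):
    """Mark pixels that differ from the per-pixel median background."""
    background = np.median(frames, axis=0)
    return np.abs(frames - background) > threshold
```

The median works here only if each pixel shows the background in more than half of the frames.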
Motion and perceptual organization
Gestalt psychology (Max Wertheimer, 1880-1943)
• Sometimes, motion is the only cue
• Even “impoverished” motion data can evoke a strong
percept
Experimental study of apparent behavior.
Fritz Heider & Marianne Simmel. 1944
More applications of motion
• Segmentation of objects in space or time
• Estimating 3D structure
• Learning dynamical models – how things move
• Recognizing events and activities
• Improving video quality (motion stabilization)
Motion estimation techniques
• Feature-based methods
• Extract visual features (corners, textured areas) and track them
over multiple frames
• Sparse motion fields, but more robust tracking
• Suitable when image motion is large (10s of pixels)
• Direct, dense methods
• Directly recover image motion at each pixel from spatio-temporal
image brightness variations
• Dense motion fields, but sensitive to appearance variations
• Suitable for video and when image motion is small
Motion estimation: Optical flow
Optic flow is the apparent motion of objects or surfaces
Will start by estimating motion of each pixel separately
Then will consider motion of entire image
Problem definition: optical flow
[Figure: consecutive frames $I(x, y, t)$ and $I(x, y, t+1)$]
How to estimate pixel motion from image I(x,y,t) to I(x,y,t+1) ?
• Solve pixel correspondence problem
– given a pixel in I(x,y,t), look for nearby pixels of the same color in I(x,y,t+1)
Key assumptions
• color constancy: a point in I(x,y,t) looks the same in I(x,y,t+1)
– For grayscale images, this is brightness constancy
• small motion: points do not move very far
This is called the optical flow problem
Optical flow constraints (grayscale images)
• Let’s look at these constraints more closely
• brightness constancy constraint (equation)
$$I(x, y, t) = I(x + u, y + v, t + 1)$$
• small motion: (u and v are less than 1 pixel, or smooth)
Taylor series expansion of I:
$$I(x + u, y + v) = I(x, y) + \frac{\partial I}{\partial x} u + \frac{\partial I}{\partial y} v + [\text{higher order terms}]$$
$$\approx I(x, y) + \frac{\partial I}{\partial x} u + \frac{\partial I}{\partial y} v$$
Optical flow equation
• Combining these two equations
$$\begin{aligned}
0 &= I(x + u, y + v, t + 1) - I(x, y, t) \\
  &\approx I(x, y, t + 1) + I_x u + I_y v - I(x, y, t) \\
  &= [I(x, y, t + 1) - I(x, y, t)] + I_x u + I_y v \\
  &= I_t + I_x u + I_y v \\
  &= I_t + \nabla I \cdot \langle u, v \rangle
\end{aligned}$$
(Shorthand: $I_x = \frac{\partial I}{\partial x}$ for t or t+1)
In the limit as u and v go to zero, this becomes exact
$$0 = I_t + \nabla I \cdot \langle u, v \rangle$$
Brightness constancy constraint equation
$$I_x u + I_y v + I_t = 0$$
How does this make sense?
Brightness constancy constraint equation
$$I_x u + I_y v + I_t = 0$$
• What do the static image gradients have to do with motion
estimation?
The brightness constancy constraint
Can we use this equation to recover image motion (u,v) at
each pixel?
$$0 = I_t + \nabla I \cdot \langle u, v \rangle \quad \text{or} \quad I_x u + I_y v + I_t = 0$$
• How many equations and unknowns per pixel?
• One equation (this is a scalar equation!), two unknowns (u,v)
The component of the motion perpendicular to the gradient (i.e., parallel to the edge) cannot be measured
If (u, v) satisfies the equation, so does (u+u’, v+v’) if
$$\nabla I \cdot [u' \;\; v']^T = 0$$
[Figure: an edge with its gradient direction, showing (u,v), (u’,v’) along the edge, and their sum (u+u’,v+v’)]
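The ambiguity above can be checked numerically at a single pixel. In this sketch the gradient values are made up (a vertical edge with $I_x = 2$, $I_y = 0$), and `normal_flow` is a hypothetical helper that returns the only component the constraint determines, the one along the gradient.

```python
import numpy as np

def constraint_residual(Ix, Iy, It, u, v):
    """Left-hand side of the brightness constancy constraint."""
    return Ix * u + Iy * v + It

def normal_flow(Ix, Iy, It):
    """Flow component along the gradient: -It * grad / |grad|^2."""
    grad = np.array([Ix, Iy], dtype=float)
    return -It * grad / np.dot(grad, grad)

# Vertical edge moving with true flow (u, v) = (1, 3).
Ix, Iy = 2.0, 0.0
u_true, v_true = 1.0, 3.0
It = -(Ix * u_true + Iy * v_true)

print(constraint_residual(Ix, Iy, It, 1.0, 3.0))   # true flow satisfies it
print(constraint_residual(Ix, Iy, It, 1.0, -5.0))  # so does any other v
print(normal_flow(Ix, Iy, It))                     # only the normal part is recovered
```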
Aperture problem
Apparently an aperture problem
The barber pole illusion
http://en.wikipedia.org/wiki/Barberpole_illusion
Solving the ambiguity…
B. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In
Proceedings of the International Joint Conference on Artificial Intelligence, pp. 674–679, 1981.
• How to get more equations for a pixel?
• Spatial coherence constraint
• Assume the pixel’s neighbors have the same (u,v)
• If we use a 5x5 window, that gives us 25 equations per pixel
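• Least squares problem:
$$\begin{bmatrix} I_x(p_1) & I_y(p_1) \\ I_x(p_2) & I_y(p_2) \\ \vdots & \vdots \\ I_x(p_{25}) & I_y(p_{25}) \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = -\begin{bmatrix} I_t(p_1) \\ I_t(p_2) \\ \vdots \\ I_t(p_{25}) \end{bmatrix}$$
Matching patches across images
• Overconstrained linear system $A\,d = b$ ($A$ is $25 \times 2$, $d = [u \;\; v]^T$ is $2 \times 1$, $b$ is $25 \times 1$)
Least squares solution for d given by $(A^T A)\,d = A^T b$:
$$\begin{bmatrix} \sum I_x I_x & \sum I_x I_y \\ \sum I_x I_y & \sum I_y I_y \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = -\begin{bmatrix} \sum I_x I_t \\ \sum I_y I_t \end{bmatrix}$$
The summations are over all pixels in the K x K window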
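The least squares solve for one window can be sketched as follows. This is a single-step sketch, assuming grayscale float images, spatial gradients from `np.gradient` on the first frame, and a simple frame difference for $I_t$; the function name and the synthetic Gaussian-blob test images are illustrative, not from the lecture.

```python
import numpy as np

def lucas_kanade_window(I1, I2, row, col, half=2):
    """Estimate (u, v) for the (2*half+1)^2 window centered at (row, col)."""
    # np.gradient returns the derivative along axis 0 (y) first, then axis 1 (x).
    Iy, Ix = np.gradient(I1)
    It = I2 - I1
    win = (slice(row - half, row + half + 1), slice(col - half, col + half + 1))
    # One equation Ix*u + Iy*v = -It per pixel in the window.
    A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)
    b = -It[win].ravel()
    d, *_ = np.linalg.lstsq(A, b, rcond=None)
    return d

# Synthetic pair: a smooth blob shifted by a sub-pixel amount.
ys, xs = np.mgrid[0:32, 0:32].astype(float)

def blob(cx, cy, sigma=4.0):
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

I1 = blob(16.0, 16.0)
I2 = blob(16.3, 15.8)  # true flow (u, v) = (0.3, -0.2)
flow = lucas_kanade_window(I1, I2, row=16, col=16, half=4)
print("estimated (u, v):", flow)
```

The estimate is only approximate because the derivation drops higher-order terms and the gradients are finite differences.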
Conditions for solvability
Optimal (u, v) satisfies the Lucas-Kanade equation $(A^T A)\,d = A^T b$
When is this solvable? I.e., what are good points to
track?
• $A^T A$ should be invertible
• $A^T A$ should not be too small due to noise
– eigenvalues $\lambda_1$ and $\lambda_2$ of $A^T A$ should not be too small
• $A^T A$ should be well-conditioned
– $\lambda_1 / \lambda_2$ should not be too large ($\lambda_1$ = larger eigenvalue)
Does this remind you of anything?
Criteria for Harris corner detector
Low texture region
– gradients have small magnitude
– small $\lambda_1$, small $\lambda_2$
Edge
– large gradients, all the same
– large $\lambda_1$, small $\lambda_2$
High textured region
– gradients are different, large magnitudes
– large $\lambda_1$, large $\lambda_2$
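The three cases above can be told apart from the eigenvalues of $A^T A$ for a window. This is a sketch with hypothetical thresholds `min_eig` and `max_ratio`; real trackers tune such thresholds to the image scale and noise level.

```python
import numpy as np

def structure_eigenvalues(Ix, Iy):
    """Return (lambda_1, lambda_2) of A^T A with lambda_1 >= lambda_2."""
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    lam = np.linalg.eigvalsh(A.T @ A)  # ascending order
    return lam[1], lam[0]

def classify_window(Ix, Iy, min_eig=1.0, max_ratio=10.0):
    """Label a window as low texture, edge, or high texture."""
    lam1, lam2 = structure_eigenvalues(Ix, Iy)
    if lam1 < min_eig:
        return "low texture"
    if lam2 < min_eig or lam1 / lam2 > max_ratio:
        return "edge"
    return "high texture"

# Hand-made gradient windows (5x5) for the three cases.
flat_Ix, flat_Iy = np.zeros((5, 5)), np.zeros((5, 5))
edge_Ix, edge_Iy = np.ones((5, 5)), np.zeros((5, 5))
angles = np.linspace(0, 2 * np.pi, 25, endpoint=False).reshape(5, 5)
tex_Ix, tex_Iy = 2 * np.cos(angles), 2 * np.sin(angles)

for name, gx, gy in [("flat", flat_Ix, flat_Iy),
                     ("edge", edge_Ix, edge_Iy),
                     ("textured", tex_Ix, tex_Iy)]:
    print(name, "->", classify_window(gx, gy))
```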
The aperture problem resolved
[Figure: actual motion compared with perceived motion]
Errors in Lucas-Kanade
• A point does not move like its neighbors
• Motion segmentation
• Brightness constancy does not hold
• Do exhaustive neighborhood search with normalized correlation
• Track features (maybe SIFT); more later
• The motion is large (larger than a pixel)
1. Non-linear: iterative refinement
2. Local minima: coarse-to-fine estimation
Revisiting the small motion assumption
• Is this motion small enough?
• Probably not—it’s much larger than one pixel
• How might we solve this problem?
Optical Flow: Aliasing
Temporal aliasing causes ambiguities in optical flow because
images can have many pixels with the same intensity.
I.e., how do we know which ‘correspondence’ is correct?
[Figure: actual shift vs. estimated shift; without aliasing the nearest match is correct, with aliasing the nearest match is incorrect]
To overcome aliasing: coarse-to-fine estimation.
Reduce the resolution!
Coarse-to-fine optical flow estimation
[Figure: Gaussian pyramids of image 1 and image 2; a displacement of u = 10 pixels at full resolution becomes u = 5, 2.5, and 1.25 pixels at successively coarser levels]
• Build Gaussian pyramids of image 1 and image 2
• Run iterative L-K at the coarsest level
• Warp & upsample
• Run iterative L-K at the next finer level
• Repeat down to full resolution
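The pyramid used for coarse-to-fine estimation can be sketched with a small blur-and-subsample loop. This sketch covers only the pyramid construction and the way displacements shrink per level, not the warping or the iterative L-K step; the 5-tap binomial kernel and edge replication are assumptions.

```python
import numpy as np

KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # 5-tap binomial filter

def blur(img):
    """Separable binomial blur with edge replication."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="edge")
    # Filter along x, then along y.
    rows = sum(k * padded[:, i:i + w] for i, k in enumerate(KERNEL))
    return sum(k * rows[i:i + h, :] for i, k in enumerate(KERNEL))

def gaussian_pyramid(img, levels):
    """Level 0 is the input; each further level is blurred and halved."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(blur(pyramid[-1])[::2, ::2])
    return pyramid

image = np.random.default_rng(6).uniform(0, 1, (64, 64))
pyramid = gaussian_pyramid(image, levels=4)
# A 10-pixel motion at full resolution shrinks by 2 per level.
displacements = [10.0 / 2 ** level for level in range(4)]
for level, (im, u) in enumerate(zip(pyramid, displacements)):
    print(f"level {level}: shape {im.shape}, u = {u} pixels")
```

At the coarsest level the motion is close to one pixel, which is where the small-motion assumption becomes reasonable.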
Optical Flow Results
* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003
State-of-the-art optical flow
Start with something similar to Lucas-Kanade
+ gradient constancy
+ energy minimization with smoothing term
+ region matching
+ keypoint matching (long-range)
[Figure: results for region-based, + pixel-based, and + keypoint-based matching]
Large displacement optical flow, Brox et al., CVPR 2009
Optical flow
• Definition: optical flow is the apparent motion
of brightness patterns in the image
• Ideally, optical flow would be the same as the
motion field
• Have to be careful: apparent motion can be
caused by lighting changes without any actual
motion
– Think of a uniform rotating sphere under fixed
lighting vs. a stationary sphere under moving
illumination