776 Computer Vision Jan-Michael Frahm, Enrique Dunn Spring 2012 From Previous Lecture • • • • Homographies Fundamental matrix Normalized 8-point Algorithm Essential Matrix Plane Homography for Calibrated Cameras • In the calibrated case o Two cameras P=K[I |0] and P’ = K’[R | t] o A plane π=(nT,d) T • The homography is given by x’=Hx H = K’(R – tnT/d)K-1 • For the plane at infinity H = K’RK-1 The Fundamental Matrix F I1 Fm0 m I 0 T 1 1 m1T Fm0 0 F = [e]xH = Fundamental Matrix P0 L m0 M M Epipole l1 e1 m1 e1T F 0 P1 Hm0 The eight-point algorithm x = (u, v, 1)T, x’ = (u’, v’, 1)T Minimize: N T 2 ( x F x ) i i i 1 under the constraint F33 = 1 Epipolar constraint: Calibrated case X x’ x x [t ( R x)] 0 xT E x 0 with E [t ]R Essential Matrix (Longuet-Higgins, 1981) The vectors x, t, and Rx’ are coplanar slide: S. Lazebnik Epipolar constraint: Calibrated case X x’ x x [t ( R x)] 0 xT E x 0 with E [t ]R Cubic constraint det(E ) 0, EE T E 1 trace(EE T )E 0 2 The vectors x, t, and Rx’ are coplanar Essential Matrix E K FK ' T slide: S. Lazebnik Today: Binocular stereo • Given a calibrated binocular stereo pair, fuse it to produce a depth image Where does the depth information come from? Binocular stereo • Given a calibrated binocular stereo pair, fuse it to produce a depth image o Humans can do it Stereograms: Invented by Sir Charles Wheatstone, 1838 Binocular stereo • Given a calibrated binocular stereo pair, fuse it to produce a depth image o Humans can do it Autostereograms: www.magiceye.com Binocular stereo • Given a calibrated binocular stereo pair, fuse it to produce a depth image o Humans can do it Autostereograms: www.magiceye.com Real-time stereo Nomad robot searches for meteorites in Antartica http://www.frc.ri.cmu.edu/projects/meteorobot/index.html • Used for robot navigation (and other tasks) o Software-based real-time stereo techniques slide: R. Szeliski Stereo image pair slide: R. Szeliski Anaglyphs http://www.rainbowsymph ony.com/freestuff.html (Wikipedia for images) Public Library, Stereoscopic Looking Room, Chicago, by Phillips, 1923 slide: R. Szeliski Stereo: epipolar geometry • Match features along epipolar lines epipolar line epipolar plane viewing ray slide: R. Szeliski Simplest Case: Parallel images • Image planes of cameras are parallel to each other and to the baseline • Camera centers are at same height • Focal lengths are the same slide: S. Lazebnik Simplest Case: Parallel images • Image planes of cameras are parallel to each other and to the baseline • Camera centers are at same height • Focal lengths are the same • Then, epipolar lines fall along the horizontal scan lines of the images slide: S. Lazebnik Essential matrix for parallel images Epipolar constraint: x E x 0, E [t ]R T R=I t = (T, 0, 0) x x’ t 0 [a ] a z a y 0 0 E [t ]R 0 0 0 T az 0 ax ay ax 0 0 T 0 Essential matrix for parallel images Epipolar constraint: x E x 0, E [t ]R T R=I t = (T, 0, 0) x x’ t 0 0 u v 10 0 0 T 0 u T v 0 0 1 0 0 E [t ]R 0 0 0 T 0 u v 1 T 0 Tv The y-coordinates of corresponding points are the same! 0 T 0 Tv Tv Depth from disparity X z x x’ f O f Baseline B O’ B f disparity x x z Disparity is inversely proportional to depth! Depth Sampling Depth sampling for integer pixel disparity Quadratic precision loss with depth! Depth Sampling Depth sampling for wider baseline Depth Sampling Depth sampling is in O(resolution6) Stereo: epipolar geometry • for two images (or images with collinear camera centers), can find epipolar lines • epipolar lines are the projection of the pencil of planes passing through the centers • Rectification: warping the input images (perspective transformation) so that epipolar lines are horizontal slide: R. Szeliski Rectification • Project each image onto same plane, which is parallel to the epipole • Resample lines (and shear/stretch) to place lines in correspondence, and minimize distortion • [Loop and Zhang, CVPR’99] slide: R. Szeliski Rectification BAD! slide: R. Szeliski Rectification GOOD! slide: R. Szeliski Problem: Rectification for forward moving cameras • Required image can become very large (infinitely large) when the epipole is in the image • Alternative rectifications are available using epipolar lines directly in the images o Pollefeys et al. 1999, “A simple and efficient method for general motion”, ICCV Your basic stereo algorithm For each epipolar line For each pixel in the left image • compare with every pixel on same epipolar line in right image • pick pixel with minimum match cost Improvement: match windows • This should look familar... slide: R. Szeliski Finding correspondences • apply feature matching criterion (e.g., correlation or Lucas-Kanade) at all pixels simultaneously • search only over epipolar lines (many fewer candidate positions) slide: R. Szeliski Correspondence search Left Right scanline Matching cost disparity • Slide a window along the right scanline and compare contents of that window with the reference window in the left image • Matching cost: SSD or normalized correlation slide: S. Lazebnik Correspondence search Left Right scanline SSD slide: S. Lazebnik Correspondence search Left Right scanline Norm. corr slide: S. Lazebnik Neighborhood size • Smaller neighborhood: more details • Larger neighborhood: fewer isolated mistakes • w=3 w = 20 slide: R. Szeliski Matching criteria • • • • • • • • Raw pixel values (correlation) Band-pass filtered images [Jones & Malik 92] “Corner” like features [Zhang, …] Edges [many people…] Gradients [Seitz 89; Scharstein 94] Rank statistics [Zabih & Woodfill 94] Intervals [Birchfield and Tomasi 96] Overview of matching metrics and their performance: o H. Hirschmüller and D. Scharstein, “Evaluation of Stereo Matching Costs on Images with Radiometric Differences”, PAMI 2008 slide: R. Szeliski Adaptive Weighting • Boundary Preserving • More Costly Failures of correspondence search Textureless surfaces Occlusions, repetition Non-Lambertian surfaces, specularities slide: S. Lazebnik Stereo: certainty modeling • Compute certainty map from correlations • input depth map certainty map slide: R. Szeliski Results with window search Data Window-based matching Ground truth slide: S. Lazebnik Better methods exist... Graph cuts Ground truth Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 2001 For the latest and greatest: http://www.middlebury.edu/stereo/ slide: S. Lazebnik How can we improve window-based matching? • The similarity constraint is local (each reference window is matched independently) • Need to enforce non-local correspondence constraints slide: S. Lazebnik Non-local constraints • Uniqueness o For any point in one image, there should be at most one matching point in the other image slide: S. Lazebnik Non-local constraints • Uniqueness o For any point in one image, there should be at most one matching point in the other image • Ordering o Corresponding points should be in the same order in both views slide: S. Lazebnik Non-local constraints • Uniqueness o For any point in one image, there should be at most one matching point in the other image • Ordering o Corresponding points should be in the same order in both views Ordering constraint doesn’t hold slide: S. Lazebnik Non-local constraints • Uniqueness o For any point in one image, there should be at most one matching point in the other image • Ordering o Corresponding points should be in the same order in both views • Smoothness o We expect disparity values to change slowly (for the most part) slide: S. Lazebnik Scanline stereo • Try to coherently match pixels on the entire scanline • Different scanlines are still optimized independently Left image Right image slide: S. Lazebnik “Shortest paths” for scan-line stereo Left image I Right image I S left Right occlusion Coccl Left occlusion q t s p Sright Coccl Ccorr Can be implemented with dynamic programming Ohta & Kanade ’85, Cox et al. ‘96 Slide credit: Y. Boykov Coherent stereo on 2D grid • Scanline stereo generates streaking artifacts • Can’t use dynamic programming to find spatially coherent disparities/ correspondences on a 2D grid slide: S. Lazebnik Stereo matching as energy minimization I2 I1 W1(i ) D W2(i+D(i )) E ( D) W1 (i ) W2 (i D(i )) 2 i D(i ) D(i) D( j ) neighbors i , j data term smoothness term • Energy functions of this form can be minimized using graph cuts Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 2001 slide: S. Lazebnik Active stereo with structured light • Project “structured” light patterns onto the object o Simplifies the correspondence problem o Allows us to use only one camera camera projector L. Zhang, B. Curless, and S. M. Seitz. Rapid Shape Acquisition Using Color Structured Light and Multi-pass Dynamic Programming. 3DPVT 2002 slide: S. Lazebnik Active stereo with structured light L. Zhang, B. Curless, and S. M. Seitz. Rapid Shape Acquisition Using Color Structured Light and Multi-pass Dynamic Programming. 3DPVT 2002 slide: S. Lazebnik Active stereo with structured light http://en.wikipedia.org/wiki/Structured-light_3D_scanner slide: S. Lazebnik Kinect: Structured infrared light http://bbzippo.wordpress.com/2010/11/28/kinect-in-infrared/ slide: S. Lazebnik Laser scanning Digital Michelangelo Project Levoy et al. http://graphics.stanford.edu/projects/mich/ • Optical triangulation o Project a single stripe of laser light o Scan it across the surface of the object o This is a very precise version of structured light scanning Source: S. Seitz Laser scanned models The Digital Michelangelo Project, Levoy et al. Source: S. Seitz Laser scanned models The Digital Michelangelo Project, Levoy et al. Source: S. Seitz Laser scanned models The Digital Michelangelo Project, Levoy et al. Source: S. Seitz Laser scanned models The Digital Michelangelo Project, Levoy et al. Source: S. Seitz Laser scanned models 1.0 mm resolution (56 million triangles) The Digital Michelangelo Project, Levoy et al. Source: S. Seitz Aligning range images • A single range scan is not sufficient to describe a complex surface • Need techniques to register multiple range images B. Curless and M. Levoy, A Volumetric Method for Building Complex Models from Range Images, SIGGRAPH 1996 Aligning range images • A single range scan is not sufficient to describe a complex surface • Need techniques to register multiple range images • … which brings us to multi-view stereo