3-D Computer Vision CSc 83029
Stereo
CSc83029 3-D Computer Vision / Ioannis Stamos

Stereopsis
Recovering 3D information (depth) from two images.
• The correspondence problem.
• The reconstruction problem.
• Epipolar constraint.
• The 8-point algorithm.

The 2 problems of Stereo
The setting: Simultaneous acquisition of 2 images (left, right) of a static scene.
Correspondence: Which parts of the left and right images are projections of the same scene element?
Reconstruction:
Given: A number of corresponding points between the left and right image, and information on the geometry of the stereo system.
Find: 3-D structure of observed objects.

Stereo Vision
[Figure: depth map.]

A simple stereo system
Fixation point: infinity. Parallel optical axes.
[Figure: left and right cameras with centers Ol and Or a baseline T apart along the X-axis, image centers cl and cr, focal length f, and a scene point P at depth Z along the Z-axis.]

Triangulation
Calibrated cameras. Fixation point: infinity. Parallel optical axes.
[Figure: as above, with P projecting to pl and pr at image coordinates xl and xr.]
Similar triangles:
$$\frac{T - x_l + x_r}{Z - f} = \frac{T}{Z} \quad\Rightarrow\quad Z = f\,\frac{T}{d}, \qquad d = x_l - x_r$$
d: disparity (difference in retinal positions). T: baseline.
Depth (Z) is inversely proportional to d (fixation at infinity).
The baseline T affects the accuracy and robustness of the depth calculation.
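As a minimal sketch of the triangulation relation Z = f T / d, assuming the sign convention d = x_l - x_r with both image coordinates measured from the image centers and expressed in the same units as f, the Python snippet below computes depth from a pair of image coordinates and compares how a fixed disparity error affects depth for two baselines. The function names and all numeric values are illustrative assumptions, not part of the course material.

```python
def depth_from_disparity(x_l, x_r, f, T):
    """Depth Z = f*T/d for a parallel-axis stereo pair, with d = x_l - x_r."""
    d = x_l - x_r
    if d <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return f * T / d


def depth_error(Z, f, T, disparity_error):
    """Absolute depth change when the true disparity is perturbed by disparity_error."""
    d = f * T / Z
    return abs(f * T / (d + disparity_error) - Z)


# Illustrative numbers (assumptions): f in pixels, T and Z in metres.
f = 500.0
Z_example = depth_from_disparity(x_l=30.0, x_r=10.0, f=f, T=0.1)  # d = 20 px

# Same scene depth and same 1-pixel disparity error, two different baselines.
err_small_T = depth_error(Z=5.0, f=f, T=0.1, disparity_error=1.0)
err_large_T = depth_error(Z=5.0, f=f, T=0.5, disparity_error=1.0)

print(Z_example, err_small_T, err_large_T)
```

With these numbers the depth error for the larger baseline is smaller, which is one way to read the statement that the baseline governs the accuracy of the depth calculation.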
Triangulation: effect of the baseline T
• Small baselines: less accurate measurements.
• Large baselines: occlusions/foreshortening.

Parameters of Stereo System
1) Intrinsic parameters (i.e. f, cl, cr).
2) Extrinsic parameters: relative position and orientation of the 2 cameras.
STEREO CALIBRATION PROBLEM

Stereo – Photometric Constraint
• Same world point has same intensity in both images.
• Assumes Lambertian, fronto-parallel surfaces.
• Issues (noise, specularities, foreshortening).
• Difficulties: ambiguities, large changes of appearance due to change of viewpoint, non-uniqueness.
From Jana Kosecka

Correspondence Is Difficult
• Ambiguity: there may be many possible 3D reconstructions.
• No texture: difficult to find a unique match.
• Foreshortening: the projection in each image is different.
• Occlusions: there may not be a correspondence.
• Curved surfaces: triangulation produces incorrect position.
Assumptions:
1) Most scene points are visible from both views.
2) Corresponding image regions are similar.

Correspondence is difficult: The Ordering Constraint
Points appear in the same order.
But it is not always the case ...
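A minimal sketch of the ordering constraint, assuming the parallel-axis system described earlier with the left camera at the origin, so that a scene point (X, Z) projects to x_l = f X / Z in the left image and x_r = f (X - T) / Z in the right image. The helper names `project` and `same_order` and the scene coordinates are hypothetical. The second pair of points (a near point in front of a far one) is an example where the order flips between the views.

```python
def project(X, Z, f=1.0, T=0.5):
    """Project a scene point (X, Z) into the left and right images (x coordinate only)."""
    return f * X / Z, f * (X - T) / Z


def same_order(points, f=1.0, T=0.5):
    """True if the points appear in the same left-to-right order in both images."""
    proj = [project(X, Z, f, T) for X, Z in points]
    order_left = sorted(range(len(points)), key=lambda i: proj[i][0])
    order_right = sorted(range(len(points)), key=lambda i: proj[i][1])
    return order_left == order_right


# Two points on the same fronto-parallel surface: the order is preserved.
surface = [(0.1, 5.0), (0.3, 5.0)]
# A near point in front of a far one: the order flips between the two views.
near_far = [(0.1, 1.0), (0.3, 5.0)]

print(same_order(surface))   # True
print(same_order(near_far))  # False
```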
More Correspondence Problems
• Regions without texture
• Highly specular surfaces
• Translucent objects

Methods For Correspondence
• Correlation based (dense correspondences).
• Feature based (such as edges/lines/corners).

Correlation-Based Methods
[Figure: pixel pl in the left image and its search region R(pl) in the right image.]
1) For each pixel pl in the left image, search in a region R(pl) in the right image for the corresponding pixel pr.
2) Use image windows of size (2W+1)x(2W+1).
3) Select the pixel pr that maximizes a correlation function.
HAVE TO SPECIFY: Region R, size W, and correlation function ψ.

Correlation-Based Methods (algorithm)
For each pixel pl = [i, j] in the left image:
  For each displacement d = [d1, d2] in R(pl):
    Compute $c(d) = \sum_{k=-W}^{W} \sum_{l=-W}^{W} \psi\big(I_l(i+k,\, j+l),\; I_r(i+k-d_1,\, j+l-d_2)\big)$
  The disparity of pl is the d that maximizes c(d).
HAVE TO SPECIFY: Region R, size W, and correlation function ψ.

Correlation-Based Methods (choice of ψ)
CROSS-CORRELATION: $\psi(u, v) = uv$
SUM OF SQUARED DIFFERENCES (SSD): $\psi(u, v) = -(u - v)^2$
SSD is usually preferred: it handles different intensity scales.
Normalized cross-correlation is better (but is more expensive).

Correspondence Is Difficult
Intensities in window may differ. Normalized cross-correlation may help.

Image Normalization
Even when the cameras are identical models, there can be differences in gain and sensitivity.
The cameras do not see exactly the same surfaces, so their overall light levels can differ.
For these reasons and more, it is a good idea to normalize the pixels in each window:
Average pixel: $\bar{I} = \frac{1}{|W_m(x,y)|} \sum_{(u,v) \in W_m(x,y)} I(u,v)$
Window magnitude: $\|I\|_{W_m(x,y)} = \sqrt{\sum_{(u,v) \in W_m(x,y)} [I(u,v)]^2}$
Normalized pixel: $\hat{I}(x,y) = \dfrac{I(x,y) - \bar{I}}{\|I - \bar{I}\|_{W_m(x,y)}}$
From Sebastian Thrun/Jana Kosecka

Comparing Windows
• Minimize: sum of squared differences.
• Maximize: cross-correlation. It is closely related to the SSD.
From Jana Kosecka

Region based Similarity Metrics
• Sum of squared differences
• Normalized cross-correlation
• Sum of absolute differences
From Jana Kosecka

NCC score for two widely separated views
[Figure: NCC score.]
From Jana Kosecka

Window size
[Figure: results for W = 3 and W = 20.]
Effect of window size. Better results with adaptive window:
• T. Kanade and M. Okutomi, A Stereo Matching Algorithm with an Adaptive Window: Theory and Experiment, Proc. International Conference on Robotics and Automation, 1991.
• D. Scharstein and R. Szeliski, Stereo matching with nonlinear diffusion, International Journal of Computer Vision, 28(2):155-174, July 1998.
(S. Seitz)

Stereo results
Data from University of Tsukuba.
[Figure: scene and ground truth.]
(Seitz)

Results with window correlation
[Figure: window-based matching (best window size) and ground truth.]
(Seitz)

Results with better method
[Figure: state of the art and ground truth.]
Boykov et al., Fast Approximate Energy Minimization via Graph Cuts, International Conference on Computer Vision, September 1999.
(Seitz)

Feature-Based Methods
[Figure: left image and right image.]
Match sparse sets of extracted features.
A feature descriptor for a line could contain: length l, orientation θ, midpoint m = (x, y), average contrast c.
An example similarity measure (w's are weights):
$$S = \frac{1}{w_0 (l_l - l_r)^2 + w_1 (\theta_l - \theta_r)^2 + w_2 (m_l - m_r)^2 + w_3 (c_l - c_r)^2}$$

Correspondence Using Correlation
[Figure: left image and disparity map.]
Images courtesy of Point Grey Research

Correspondence By Features
[Figure: LEFT IMAGE with corner, line, and structure features.]
[Figure: RIGHT IMAGE with corner, line, and structure features.]
Search in the right image… the disparity (dx, dy) is the displacement when the similarity measure is maximum.
From Sebastian Thrun/Jana Kosecka

Comparison of Matching Methods
Correlation-Based                                    | Feature-Based
Dense depth maps.                                    | Sparse depth maps.
Need textured images.                                | Insensitive to illumination changes.
Sensitive to foreshortening/illumination changes.    | A-priori info used.
Need close views.                                    | Faster.
Problems: occlusions/spurious matches => Introduce constraints in matching (e.g. left-right consistency constraint).

Epipolar Constraint (Geometry)
[Figure: scene point P (Pl, Pr in the left and right camera frames), centers of projection Ol and Or, image planes πl and πr, image points pl and pr, epipoles el and er, the epipolar plane, and the two epipolar lines.]

Epipolar Constraint
Extrinsic parameters (left/right camera frames): $P_r = R(P_l - T)$, $T = O_r - O_l$   (1)
Ol, Or, and pl define the epipolar plane, which is enough to define the right E.L.
Given pl, pr is constrained to lie on the Epipolar Line (E.L.).
For each left pixel pl, find the corresponding right E.L.
Searching for pr reduces to a 1-D problem.

Epipolar Constraint
[Figure: scene points, image planes, centers of projection, and epipoles el, er.]
• All E.L.s go through epipoles.
• Parallel image planes => epipoles at infinity.

Essential Matrix
Estimate the epipolar geometry: correspondence between points and E.L.s.
$P_l$, $P_l - T$ and $T$ are coplanar:
$$(P_l - T)^\top (T \times P_l) = 0$$
Using (1), $P_l - T = R^\top P_r$:
$$(R^\top P_r)^\top (T \times P_l) = 0$$
Writing the cross product as $T \times P_l = S P_l$, with
$$S = \begin{bmatrix} 0 & -T_z & T_y \\ T_z & 0 & -T_x \\ -T_y & T_x & 0 \end{bmatrix},$$
$$P_r^\top (R S) P_l = 0 \quad\Rightarrow\quad P_r^\top E\, P_l = 0, \qquad E = RS$$
Link between the epipolar constraint and the extrinsic parameters of the stereo system.

Essential Matrix
Perspective projection: $p_l = [x_l, y_l, f_l]^\top$, $p_r = [x_r, y_r, f_r]^\top$, with $p_l = \frac{f_l}{Z_l} P_l$, $p_r = \frac{f_r}{Z_r} P_r$.
$$P_r^\top E\, P_l = 0 \quad\Rightarrow\quad p_r^\top E\, p_l = 0$$
pl and pr are points in camera coordinates.
$$p_r^\top \underbrace{R \begin{bmatrix} 0 & -T_z & T_y \\ T_z & 0 & -T_x \\ -T_y & T_x & 0 \end{bmatrix}}_{E}\, p_l = 0$$
Epipolar lines are found by $u_r = E\, p_l$, $u_l = E^\top p_r$.
Essential matrix: rank 2.

Camera Models (linear versions)
Elegant decomposition. No distortion!
Homogeneous coordinates relate the world point (Xw, Yw, Zw) to the measured pixel (xim, yim).

Fundamental Matrix
Ml (Mr): matrix of intrinsic parameters for left (right) camera.
Camera to pixel coordinates: $\bar{p}_l = M_l\, p_l$, $\bar{p}_r = M_r\, p_r$
Essential matrix equation becomes:
$$\bar{p}_r^\top \underbrace{M_r^{-\top} E\, M_l^{-1}}_{F}\, \bar{p}_l = 0$$
Epipolar lines: $\bar{u}_r = F\, \bar{p}_l$, $\bar{u}_l = F^\top \bar{p}_r$
Fundamental matrix F: pixel coordinates! E: camera coordinates!

Conclusions
Essential Matrix                                    | Fundamental Matrix
Encodes information on extrinsic parameters.        | Encodes information on both the extrinsic and intrinsic parameters.
Has rank 2.                                         | Has rank 2.
Its 2 non-zero singular values are equal.           |
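The relations above can be checked numerically. The following is a minimal NumPy sketch, using the slides' convention P_r = R(P_l - T), E = RS and F = M_r^{-T} E M_l^{-1}. The rotation, translation, intrinsic matrix and scene point are made-up illustrative values, the helper name `skew` is an assumption, and camera coordinates are normalised to f = 1 with the focal length folded into the intrinsic matrix.

```python
import numpy as np


def skew(t):
    """Matrix S such that S @ p equals the cross product t x p."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])


# Illustrative extrinsics (assumptions): small rotation about Y, baseline mostly along X.
angle = np.deg2rad(5.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
T = np.array([0.5, 0.02, 0.01])

E = R @ skew(T)  # essential matrix, E = R S

# Illustrative intrinsics (assumption): same matrix for the left and right camera.
M = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
M_inv = np.linalg.inv(M)
F = M_inv.T @ E @ M_inv  # fundamental matrix, F = M_r^{-T} E M_l^{-1}

# A scene point in the left camera frame and the same point in the right frame.
P_l = np.array([0.3, -0.2, 4.0])
P_r = R @ (P_l - T)

# Camera coordinates (normalised, f = 1) and pixel coordinates.
p_l, p_r = P_l / P_l[2], P_r / P_r[2]
pix_l, pix_r = M @ p_l, M @ p_r

residual_E = p_r @ E @ p_l      # epipolar constraint in camera coordinates, ~0
residual_F = pix_r @ F @ pix_l  # epipolar constraint in pixel coordinates, ~0
singular_values = np.linalg.svd(E, compute_uv=False)  # two equal values, third ~0

print(residual_E, residual_F, singular_values)
```

Both residuals vanish up to rounding, and the singular values of E show the rank-2 structure with two equal non-zero values listed in the conclusions.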
Estimating the epipolar geometry
[Figure: image pair with epipoles el and er.]
Problem: Find the fundamental matrix F from a set of image correspondences $(\bar{p}_l^{\,i}, \bar{p}_r^{\,i})$ in pixel coordinates, each satisfying
$$(\bar{p}_r^{\,i})^\top F\, \bar{p}_l^{\,i} = 0$$
Solve
$$\min_F \sum_i \big( (\bar{p}_r^{\,i})^\top F\, \bar{p}_l^{\,i} \big)^2$$
subject to the constraint: Rank(F) = 2.

The 8-point algorithm
n >= 8 correspondences: $(\bar{p}_r^{\,i})^\top F\, \bar{p}_l^{\,i} = 0$ for each i, which gives the linear system $A v = 0$.
v: the 9 elements of F. A: n x 9 measurement matrix.
Solve using SVD (solution up to a scale factor).
Enforce rank(F) = 2 => SVD on the computed F.
Be careful: numerical instabilities.

Epipolar Lines – Example
[Figure: epipolar lines.]

Example
[Figure: two views; point feature matching.]
From Jana Kosecka

Example
[Figure: epipolar geometry; camera pose and sparse structure recovery.]
From Jana Kosecka

Locating the Epipoles from E & F
Accurate epipole localization is useful for:
1) Refining epipolar lines.
2) Checking for consistency.
3) Uncalibrated stereo.
F => el, er in pixel coordinates.
E => el, er in camera coordinates.
Fact: All epipolar lines pass through epipoles.

Image rectification
Given a general displacement, how to warp the views such that epipolar lines are parallel to each other?
How to warp it back to canonical configuration? (more details later)
(Seitz)

Epipolar rectification
• Rectified Image Pair
• Corresponding epipolar lines are aligned with the scan-lines
• Search for dense correspondence is a 1D search
[Figure: rectified image pair.]

Rectification (Trucco, Ch. 7)
1) Rotate left camera so that epipole goes to infinity (known R, known epipoles).
2) Apply same rotation to right camera.
3) Rotate right camera by R.
4) Adjust scale in both camera reference frames.

Rectification
Problem: Epipolar lines not parallel to scan lines.
[Figure: scene point P (Pl, Pr), epipolar plane, epipolar lines, image points pl and pr, centers Ol and Or, epipoles el and er.]
[Figure: rectified images; epipoles at infinity.]
From Sebastian Thrun/Jana Kosecka

3-D Reconstruction
[Figure.] Reprinted from "Stereo by Intra- and Inter-Scanline Search," Y. Ohta and T. Kanade, IEEE Trans. on Pattern Analysis and Machine Intelligence, 7(2):139-154 (1985). © 1985 IEEE.

3-D Reconstruction
A Priori Knowledge        | 3-D Reconstruction from two views
Intrinsic and extrinsic   | Unambiguous (triangulation)
Intrinsic only            | Up to unknown scaling factor
No information            | Up to unknown projective transformation

Projective Reconstruction
[Figure: Euclidean reconstruction and projective reconstruction.]
From Sebastian Thrun/Jana Kosecka

Euclidean vs Projective reconstruction
• Euclidean reconstruction: true metric properties of objects; lengths (distances), angles, parallelism are preserved. Unchanged under rigid body transformations.
  => Euclidean Geometry: properties of rigid bodies under rigid body transformations, similarity transformation.
• Projective reconstruction: lengths, angles, parallelism are NOT preserved. We get distorted images of objects; their distorted 3D counterparts --> 3D projective reconstruction.
  => Projective Geometry
From Sebastian Thrun/Jana Kosecka

Check this out!
http://www.well.com/user/jimg/stereo/stereo_list.html

How can We Improve Stereo?
Space-time stereo scanner uses unstructured light to aid in correspondence.
Result: Dense 3D mesh (noisy)
From Sebastian Thrun/Jana Kosecka

Active Stereo: Adding Texture to Scene
By James Davis, Honda Research, Now UCSC
From Sebastian Thrun/Jana Kosecka

Active Stereo (Structured Light)
[Figure label: rectified.]
From Sebastian Thrun/Jana Kosecka

Range Images (depth images, depth maps, surface profiles, 2.5-D images)
• Sensors that produce depth directly.
• Pixel of a range image is the distance between a known reference frame and a visible point in the scene.
• Representations:
  • Cloud of Points (x,y,z)
  • Rij form (spatial information is explicit)

Active Range Sensors
• Project energy or control sensor's parameters.
• Laser, Radars (accurate)/Sonars (inaccurate).
• Active Focusing/Defocusing.

Triangulation
Light Stripe System.
[Figure: camera frame Xc, Yc, Zc; light plane; image.]
Light Plane: $AX + BY + CZ + D = 0$ (in camera frame)
Image Point: $x = f X/Z$, $y = f Y/Z$ (perspective)
Triangulation: $Z = -D f/(A x + B y + C f)$
Move light stripe or object.

Time of Flight