CS 558 COMPUTER VISION
Lecture IX: Dimensionality Reduction
Supplementary Lecture: Single View and Epipolar Geometry
Slides adapted from S. Lazebnik

OUTLINE
• Single view geometry
• Epipolar geometry

SINGLE-VIEW GEOMETRY
[Figure: Odilon Redon, Cyclops, 1914]

OUR GOAL: RECOVERY OF 3D STRUCTURE
• Recovery of structure from one image is inherently ambiguous
[Figure: an image point x with several candidate 3D points X?]

AMES ROOM
http://en.wikipedia.org/wiki/Ames_room

OUR GOAL: RECOVERY OF 3D STRUCTURE
• We will need multi-view geometry

RECALL: PINHOLE CAMERA MODEL
• Principal axis: line from the camera center perpendicular to the image plane
• Normalized (camera) coordinate system: camera center is at the origin and the principal axis is the z-axis

RECALL: PINHOLE CAMERA MODEL
\[ (X, Y, Z) \mapsto (fX/Z,\; fY/Z) \]
\[ \begin{pmatrix} fX \\ fY \\ Z \end{pmatrix} = \begin{bmatrix} f & & & 0 \\ & f & & 0 \\ & & 1 & 0 \end{bmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}, \qquad x = PX \]

PRINCIPAL POINT
• Principal point (p): point where principal axis intersects the image plane (origin of normalized coordinate system)
• Normalized coordinate system: origin is at the principal point
• Image coordinate system: origin is in the corner
• How to go from normalized coordinate system to image coordinate system?

PRINCIPAL POINT OFFSET
principal point: $(p_x, p_y)$
\[ (X, Y, Z) \mapsto (fX/Z + p_x,\; fY/Z + p_y) \]
\[ \begin{pmatrix} fX + Z p_x \\ fY + Z p_y \\ Z \end{pmatrix} = \begin{bmatrix} f & & p_x & 0 \\ & f & p_y & 0 \\ & & 1 & 0 \end{bmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix} \]

PRINCIPAL POINT OFFSET
\[ \begin{pmatrix} fX + Z p_x \\ fY + Z p_y \\ Z \end{pmatrix} = \begin{bmatrix} f & & p_x \\ & f & p_y \\ & & 1 \end{bmatrix} \begin{bmatrix} 1 & & & 0 \\ & 1 & & 0 \\ & & 1 & 0 \end{bmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix} \]
\[ K = \begin{bmatrix} f & & p_x \\ & f & p_y \\ & & 1 \end{bmatrix} \quad \text{(calibration matrix)}, \qquad P = K[I \mid 0] \]

PIXEL COORDINATES
Pixel size: $\frac{1}{m_x} \times \frac{1}{m_y}$
$m_x$ pixels per meter in horizontal direction, $m_y$ pixels per meter in vertical direction
\[ K = \begin{bmatrix} m_x & & \\ & m_y & \\ & & 1 \end{bmatrix} \begin{bmatrix} f & & p_x \\ & f & p_y \\ & & 1 \end{bmatrix} = \begin{bmatrix} \alpha_x & & \beta_x \\ & \alpha_y & \beta_y \\ & & 1 \end{bmatrix} \]
Units: (pixels/m) × (m) = pixels, with $\alpha_x = m_x f$, $\alpha_y = m_y f$, $\beta_x = m_x p_x$, $\beta_y = m_y p_y$

CAMERA ROTATION AND TRANSLATION
• In general, the camera coordinate frame will be related to the world coordinate frame by a rotation and a translation
\[ \tilde{X}_{cam} = R(\tilde{X} - \tilde{C}) \]
  - $\tilde{X}_{cam}$: coords. of point in camera frame
  - $\tilde{C}$: coords. of camera center in world frame
  - $\tilde{X}$: coords. of a point in world frame (nonhomogeneous)

CAMERA ROTATION AND TRANSLATION
In non-homogeneous coordinates:
\[ \tilde{X}_{cam} = R(\tilde{X} - \tilde{C}) \]
In homogeneous coordinates:
\[ X_{cam} = \begin{bmatrix} R & -R\tilde{C} \\ 0 & 1 \end{bmatrix} \begin{pmatrix} \tilde{X} \\ 1 \end{pmatrix} = \begin{bmatrix} R & -R\tilde{C} \\ 0 & 1 \end{bmatrix} X \]
\[ x = K[I \mid 0]\, X_{cam} = K[R \mid -R\tilde{C}]\, X \]
\[ P = K[R \mid t], \qquad t = -R\tilde{C} \]
Note: C is the null space of the camera projection matrix (PC = 0)

CAMERA PARAMETERS
• Intrinsic parameters
  - Principal point coordinates
  - Focal length
  - Pixel magnification factors
  - Skew (non-rectangular pixels)
  - Radial distortion
\[ K = \begin{bmatrix} m_x & & \\ & m_y & \\ & & 1 \end{bmatrix} \begin{bmatrix} f & & p_x \\ & f & p_y \\ & & 1 \end{bmatrix} = \begin{bmatrix} \alpha_x & & \beta_x \\ & \alpha_y & \beta_y \\ & & 1 \end{bmatrix} \]
• Extrinsic parameters
  - Rotation and translation relative to world coordinate system

CAMERA CALIBRATION
\[ x = K[R \mid t]\, X \]
\[ \lambda \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = \begin{bmatrix} * & * & * & * \\ * & * & * & * \\ * & * & * & * \end{bmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix} \]
Source: D. Hoiem

CAMERA CALIBRATION
• Given n points with known 3D coordinates $X_i$ and known image projections $x_i$, estimate the camera parameters
[Figure: 3D points $X_i$, their projections $x_i$, and the unknown camera P]
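As a minimal sketch of the camera model summarized above, the following NumPy code builds P = K[R | t] with t = -RC and projects one 3D point. The function names and all numeric values (focal length, pixel densities, rotation, camera center) are hypothetical and chosen only for illustration; skew and radial distortion are not modeled.

```python
import numpy as np

def calibration_matrix(f, px, py, mx=1.0, my=1.0):
    """K = diag(mx, my, 1) @ [[f, 0, px], [0, f, py], [0, 0, 1]] (no skew)."""
    return np.diag([mx, my, 1.0]) @ np.array([[f, 0.0, px],
                                               [0.0, f, py],
                                               [0.0, 0.0, 1.0]])

def projection_matrix(K, R, C):
    """P = K [R | t] with t = -R C, where C is the camera center in world coordinates."""
    t = -R @ C
    return K @ np.hstack([R, t.reshape(3, 1)])

def project(P, X):
    """Project a non-homogeneous 3D point X to non-homogeneous pixel coordinates."""
    x = P @ np.append(X, 1.0)   # homogeneous image point
    return x[:2] / x[2]         # divide by the third coordinate

# Hypothetical camera: f = 50 mm, 20000 pixels per meter, principal point in meters.
K = calibration_matrix(f=0.05, px=0.01, py=0.008, mx=20000.0, my=20000.0)

# Hypothetical pose: 10 degree rotation about the y-axis, center C in world coordinates.
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
C = np.array([0.2, -0.1, -1.0])

P = projection_matrix(K, R, C)
X = np.array([0.3, 0.1, 4.0])
x = project(P, X)

# The camera center is the null space of P: P @ (C, 1) should vanish.
residual = P @ np.append(C, 1.0)
```

Here `residual` is expected to be numerically zero, which is the PC = 0 property noted above, and `x` agrees with projecting the camera-frame point R(X - C) through K.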
CAMERA CALIBRATION: LINEAR METHOD
\[ \lambda x_i = P X_i \;\Rightarrow\; x_i \times P X_i = 0 \]
\[ \begin{pmatrix} x_i \\ y_i \\ 1 \end{pmatrix} \times \begin{pmatrix} P_1^T X_i \\ P_2^T X_i \\ P_3^T X_i \end{pmatrix} = 0 \]
\[ \begin{bmatrix} 0^T & -X_i^T & y_i X_i^T \\ X_i^T & 0^T & -x_i X_i^T \\ -y_i X_i^T & x_i X_i^T & 0^T \end{bmatrix} \begin{pmatrix} P_1 \\ P_2 \\ P_3 \end{pmatrix} = 0 \]
Two linearly independent equations

CAMERA CALIBRATION: LINEAR METHOD
\[ \begin{bmatrix} 0^T & -X_1^T & y_1 X_1^T \\ X_1^T & 0^T & -x_1 X_1^T \\ \vdots & \vdots & \vdots \\ 0^T & -X_n^T & y_n X_n^T \\ X_n^T & 0^T & -x_n X_n^T \end{bmatrix} \begin{pmatrix} P_1 \\ P_2 \\ P_3 \end{pmatrix} = 0, \qquad Ap = 0 \]
• P has 11 degrees of freedom (12 parameters, but scale is arbitrary)
• One 2D/3D correspondence gives us two linearly independent equations
• Homogeneous least squares
• 6 correspondences needed for a minimal solution
• Note: for coplanar points that satisfy $\Pi^T X = 0$, we will get degenerate solutions $(\Pi, 0, 0)$, $(0, \Pi, 0)$, or $(0, 0, \Pi)$

CAMERA CALIBRATION: LINEAR METHOD
• Advantages: easy to formulate and solve
• Disadvantages
  - Doesn't directly tell you camera parameters
  - Doesn't model radial distortion
  - Can't impose constraints, such as known focal length and orthogonality
• Non-linear methods are preferred
  - Define error as difference between projected points and measured points
  - Minimize error using Newton's method or other nonlinear optimization
Source: D. Hoiem

MULTI-VIEW GEOMETRY PROBLEMS
• Structure: Given projections of the same 3D point in two or more images, compute the 3D coordinates of that point
• Stereo correspondence: Given a point in one of the images, where could its corresponding points be in the other images?
• Motion: Given a set of corresponding points in two or more images, compute the camera parameters
[Figure: Camera 1 (R1, t1), Camera 2 (R2, t2), Camera 3 (R3, t3); the unknown quantity in each problem is marked "?"]
Slide credit: Noah Snavely

TRIANGULATION
• Given projections of a 3D point in two or more images (with known camera matrices), find the coordinates of the point
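One way to make this problem concrete is the linear approach that the following slides derive from x × PX = 0. The sketch below stacks the two cross-product constraints and solves for X by homogeneous least squares (SVD). The names `skew` and `triangulate_linear`, and the two example cameras, are hypothetical; the example uses noise-free projections, so the rays meet exactly.

```python
import numpy as np

def skew(a):
    """Matrix [a]_x such that skew(a) @ b equals np.cross(a, b)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def triangulate_linear(x1, x2, P1, P2):
    """Linear triangulation from two homogeneous image points and 3x4 cameras.

    Stacks [x1]_x P1 X = 0 and [x2]_x P2 X = 0 into A X = 0 and takes the
    right singular vector with the smallest singular value.
    """
    A = np.vstack([skew(x1) @ P1, skew(x2) @ P2])  # 6 x 4, rank 3 without noise
    _, _, Vt = np.linalg.svd(A)
    X_h = Vt[-1]                                   # homogeneous solution, up to scale
    return X_h[:3] / X_h[3]

# Hypothetical setup: first camera [I | 0], second camera rotated and shifted.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
theta = np.deg2rad(-8.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([-1.0, 0.0, 0.1])
P2 = np.hstack([R, t.reshape(3, 1)])

X_true = np.array([0.5, -0.2, 5.0])
x1 = P1 @ np.append(X_true, 1.0)   # homogeneous projections (scale is irrelevant)
x2 = P2 @ np.append(X_true, 1.0)

X_est = triangulate_linear(x1, x2, P1, P2)
```

With noisy measurements the same code returns an algebraic least-squares estimate, which could serve as a starting point for a reprojection-error refinement.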
TRIANGULATION
• We want to intersect the two visual rays corresponding to $x_1$ and $x_2$, but because of noise and numerical errors, they don't meet exactly
[Figure: rays R1 and R2 from camera centers O1 and O2 through x1 and x2, passing near the unknown point X]

TRIANGULATION: GEOMETRIC APPROACH
• Find shortest segment connecting the two viewing rays and let X be the midpoint of that segment

TRIANGULATION: LINEAR APPROACH
\[ \lambda_1 x_1 = P_1 X \;\Rightarrow\; x_1 \times P_1 X = 0 \;\Rightarrow\; [x_{1\times}] P_1 X = 0 \]
\[ \lambda_2 x_2 = P_2 X \;\Rightarrow\; x_2 \times P_2 X = 0 \;\Rightarrow\; [x_{2\times}] P_2 X = 0 \]
Cross product as matrix multiplication:
\[ a \times b = \begin{bmatrix} 0 & -a_z & a_y \\ a_z & 0 & -a_x \\ -a_y & a_x & 0 \end{bmatrix} \begin{pmatrix} b_x \\ b_y \\ b_z \end{pmatrix} = [a_\times]\, b \]
Two independent equations each in terms of three unknown entries of X

TRIANGULATION: NONLINEAR APPROACH
Find X that minimizes
\[ d^2(x_1, P_1 X) + d^2(x_2, P_2 X) \]
[Figure: measured points x1, x2 and reprojected points x'1, x'2 of the candidate X]

TWO-VIEW GEOMETRY

EPIPOLAR GEOMETRY
• Baseline: line connecting the two camera centers
• Epipolar Plane: plane containing baseline (1D family)
• Epipoles = intersections of baseline with image planes = projections of the other camera center
• Epipolar Lines: intersections of epipolar plane with image planes (always come in corresponding pairs)
[Figure: 3D point X with projections x and x']

THE EPIPOLE
[Photo by Frank Dellaert]

EXAMPLE: CONVERGING CAMERAS

EXAMPLE: MOTION PARALLEL TO IMAGE PLANE

EXAMPLE: MOTION PERPENDICULAR TO IMAGE PLANE
• Epipole has same coordinates in both images (e and e').
• Points move along lines radiating from e: "Focus of expansion"

EPIPOLAR CONSTRAINT
• If we observe a point x in one image, where can the corresponding point x' be in the other image?

EPIPOLAR CONSTRAINT
• Potential matches for x have to lie on the corresponding epipolar line l'.
• Potential matches for x' have to lie on the corresponding epipolar line l.

EPIPOLAR CONSTRAINT EXAMPLE

EPIPOLAR CONSTRAINT: CALIBRATED CASE
• Assume that the intrinsic and extrinsic parameters of the cameras are known
• We can multiply the projection matrix of each camera (and the image points) by the inverse of the calibration matrix to get normalized image coordinates
• We can also set the global coordinate system to the coordinate system of the first camera. Then the projection matrix of the first camera is [I | 0].

EPIPOLAR CONSTRAINT: CALIBRATED CASE
\[ X = R X' + t \]
The vectors x, t, and Rx' are coplanar

EPIPOLAR CONSTRAINT: CALIBRATED CASE
\[ x \cdot [t \times (R x')] = 0 \;\Rightarrow\; x^T E x' = 0 \quad \text{with} \quad E = [t_\times] R \]
Essential Matrix (Longuet-Higgins, 1981)
• E x' is the epipolar line associated with x' (l = E x')
• $E^T x$ is the epipolar line associated with x (l' = $E^T x$)
• E e' = 0 and $E^T e = 0$
• E is singular (rank two)
• E has five degrees of freedom

EPIPOLAR CONSTRAINT: UNCALIBRATED CASE
• The calibration matrices K and K' of the two cameras are unknown
• We can write the epipolar constraint in terms of unknown normalized coordinates:
\[ \hat{x}^T E \hat{x}' = 0, \qquad x = K \hat{x}, \quad x' = K' \hat{x}' \]

EPIPOLAR CONSTRAINT: UNCALIBRATED CASE
\[ \hat{x} = K^{-1} x, \qquad \hat{x}' = K'^{-1} x' \]
\[ \hat{x}^T E \hat{x}' = 0 \;\Rightarrow\; x^T F x' = 0 \quad \text{with} \quad F = K^{-T} E K'^{-1} \]
Fundamental Matrix (Faugeras and Luong, 1992)
• F x' is the epipolar line associated with x' (l = F x')
• $F^T x$ is the epipolar line associated with x (l' = $F^T x$)
• F e' = 0 and $F^T e = 0$
• F is singular (rank two)
• F has seven degrees of freedom

THE EIGHT-POINT ALGORITHM
$x = (u, v, 1)^T$, $x' = (u', v', 1)^T$
Minimize:
\[ \sum_{i=1}^{N} (x_i^T F x'_i)^2 \]
under the constraint $F_{33} = 1$

THE EIGHT-POINT ALGORITHM
• Meaning of error $\sum_{i=1}^{N} (x_i^T F x'_i)^2$: sum of squared Euclidean distances between points $x_i$ and epipolar lines $F x'_i$ (or points $x'_i$ and epipolar lines $F^T x_i$), each multiplied by a scale factor
• Nonlinear approach: minimize
\[ \sum_{i=1}^{N} \left[ d^2(x_i, F x'_i) + d^2(x'_i, F^T x_i) \right] \]

PROBLEM WITH EIGHT-POINT ALGORITHM
• Poor numerical conditioning
• Can be fixed by rescaling the data

THE NORMALIZED EIGHT-POINT ALGORITHM
(Hartley, 1995)
• Center the image data at the origin, and scale it so the mean squared distance between the origin and the data points is 2 pixels
• Use the eight-point algorithm to compute F from the normalized points
• Enforce the rank-2 constraint (for example, take SVD of F and throw out the smallest singular value)
• Transform fundamental matrix back to original units: if T and T' are the normalizing transformations in the two images, then the fundamental matrix in original coordinates is $T^T F T'$

COMPARISON OF ESTIMATION ALGORITHMS
               8-point        Normalized 8-point    Nonlinear least squares
Av. Dist. 1    2.33 pixels    0.92 pixel            0.86 pixel
Av. Dist. 2    2.18 pixels    0.85 pixel            0.80 pixel

FROM EPIPOLAR GEOMETRY TO CAMERA CALIBRATION
• Estimating the fundamental matrix is known as "weak calibration"
• If we know the calibration matrices of the two cameras, we can estimate the essential matrix: $E = K^T F K'$
• The essential matrix gives us the relative rotation and translation between the cameras, or their extrinsic parameters
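The four steps of the normalized eight-point algorithm can be sketched in NumPy as follows, using the convention of these slides, $x^T F x' = 0$. This is an illustrative sketch, not a reference implementation: the function names and the synthetic two-camera setup are hypothetical, and the linear step uses the unit-norm constraint on F solved by SVD in place of $F_{33} = 1$.

```python
import numpy as np

def normalizing_transform(pts):
    """3x3 transform that centers (N, 2) points and scales them so that the
    mean squared distance from the origin is 2."""
    c = pts.mean(axis=0)
    msd = np.mean(np.sum((pts - c) ** 2, axis=1))
    s = np.sqrt(2.0 / msd)
    return np.array([[s, 0.0, -s * c[0]],
                     [0.0, s, -s * c[1]],
                     [0.0, 0.0, 1.0]])

def eight_point_normalized(pts, pts_prime):
    """Estimate F with x^T F x' = 0 from N >= 8 correspondences (pixels)."""
    n = len(pts)
    x = np.hstack([pts, np.ones((n, 1))])
    xp = np.hstack([pts_prime, np.ones((n, 1))])
    T, Tp = normalizing_transform(pts), normalizing_transform(pts_prime)
    xn, xpn = x @ T.T, xp @ Tp.T                  # normalized homogeneous points
    # Each row holds the products x_i * x'_j, matching F flattened row by row.
    A = np.einsum('ni,nj->nij', xn, xpn).reshape(n, 9)
    _, _, Vt = np.linalg.svd(A)
    F_hat = Vt[-1].reshape(3, 3)                  # minimizes |A f| with |f| = 1
    U, S, Vt2 = np.linalg.svd(F_hat)
    S[2] = 0.0                                    # enforce rank 2
    F_hat = U @ np.diag(S) @ Vt2
    F = T.T @ F_hat @ Tp                          # back to pixel coordinates
    return F / np.linalg.norm(F)

# Hypothetical synthetic data: points X' in the second camera frame, X = R X' + t.
rng = np.random.default_rng(0)
K = np.array([[1000.0, 0.0, 320.0], [0.0, 1000.0, 240.0], [0.0, 0.0, 1.0]])
Kp = np.array([[900.0, 0.0, 300.0], [0.0, 900.0, 250.0], [0.0, 0.0, 1.0]])
a = np.deg2rad(5.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)], [0.0, 1.0, 0.0], [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([1.0, 0.0, 0.2])
Xp = np.column_stack([rng.uniform(-1, 1, 20), rng.uniform(-1, 1, 20), rng.uniform(4, 8, 20)])
X = Xp @ R.T + t
pts = (X @ K.T)[:, :2] / (X @ K.T)[:, 2:]
pts_prime = (Xp @ Kp.T)[:, :2] / (Xp @ Kp.T)[:, 2:]

F = eight_point_normalized(pts, pts_prime)

# Distance (in pixels) from each x_i to its epipolar line l_i = F x'_i.
x_h = np.hstack([pts, np.ones((20, 1))])
lines = np.hstack([pts_prime, np.ones((20, 1))]) @ F.T
dist = np.abs(np.sum(x_h * lines, axis=1)) / np.hypot(lines[:, 0], lines[:, 1])
```

On this noise-free data the point-to-line distances in `dist` should be close to zero; with real, noisy correspondences the result would typically be refined with the nonlinear criterion described earlier.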