3D Geometry and Camera Calibration 3D Coordinate Systems • Right-handed vs. left-handed x y x z z y 2D Coordinate Systems • y axis up vs. y axis down • Origin at center vs. corner • Will often write (u, v) for image coordinates u v v u v u 3D Geometry Basics • 3D points = column vectors x p y z • Transformations = pre-multiplied matrices a b Tp d e g h c x f y i z Rotation • Rotation about the z axis cos R z sin 0 sin cos 0 0 0 1 • Rotation about x, y axes similar (cyclically permute x, y, z) Arbitrary Rotation • Any rotation is a composition of rotations about x, y, and z • Composition of transformations = matrix multiplication (watch the order!) • Result: orthonormal matrix – Each row, column has unit length – Dot product of rows or columns = 0 – Inverse of matrix = transpose Arbitrary Rotation • Rotate around x, y, then z: sin y cos z R sin y sin z cos y cos x sin z sin x cos y cos z cos x cos z sin x cos y sin z sin x sin y sin x sin z cos x cos y cos z sin x cos z cos x cos y sin z cos x sin y • Don’t do this! Compute simple matrices and multiply them! Scale • Scale in x, y, z: sx S 0 0 0 sy 0 0 0 s z Shear • Shear parallel to xy plane: 1 0 x σ xy 0 1 y 0 0 1 Translation • Can translation be represented by multiplying by a 33 matrix? • No. • Proof: A : A0 0 Homogeneous Coordinates • Add a fourth dimension to each point: x x y y z z w • To get “real” (3D) coordinates, divide by w: x x w y y z w z w w Translation in Homogeneous Coordinates 1 0 0 0 0 0 t x x x t x w 1 0 t y y y t y w 0 1 tz z z tz w 0 0 1 w w • After divide by w, this is just a translation by (tx , ty , tz) Perspective Projection • What does 4th row of matrix do? 1 0 0 0 • After divide, 0 0 0 x x 1 0 0 y y 0 1 0 z z 0 1 0 w z x x z y y z z 1 z Perspective Projection • This is projection onto the z=1 plane (x,y,z) (x/z,y/z,1) (0,0,0) z=1 • Add scaling, etc. pinhole camera model Putting It All Together: A Camera Model Scale to pixel size Translate to image center Camera orientation Perspective projection 3D point TimgS pixPcamR camTcam x Then perform homogeneous divide, and get (u,v) coords Camera location (homogeneous coords) Putting It All Together: A Camera Model Intrinsics Extrinsics TimgS pixPcamR camTcam x Putting It All Together: A Camera Model Camera coordinates Normalized device coordinates Eye coordinates Image coordinates Pixel coordinates World coordinates TimgS pixPcamR camTcam x More General Camera Model • Multiply all these matrices together • Don’t care about “z” after transformation a e i b f j c g k ax by cz d d x homogeneous ix jy kz l h y ex fy gz h ix jy kz l z divide l 1 • Scale ambiguity 11 free parameters Radial Distortion • Radial distortion can not be represented by matrix uimg cu u vimg cv v * img * img 1 k (u 1 k (u * 2 img * 2 img * 2 img v * 2 img v ) ) • (cu, cv) is image center, u*img= uimg– cu, v*img= vimg– cv, k is first-order radial distortion coefficient Camera Calibration • Determining values for camera parameters • Necessary for any algorithm that requires 3D 2D mapping • Method used depends on: – What data is available – Intrinsics only vs. extrinsics only vs. both – Form of camera model Camera Calibration – Example 1 • Given: – 3D 2D correspondences – General perspective camera model (11-parameter, no radial distortion) • Write equations: ax1 by1 cz1 d u1 ix1 jy1 kz1 l ex1 fy1 gz1 h v1 ix1 jy1 kz1 l Camera Calibration – Example 1 x1 0 x 2 0 y1 0 y2 z1 0 z2 1 0 1 0 x1 0 0 y1 0 0 z1 0 0 u1 x1 1 u1 x1 0 u 2 x2 u1 y1 u1 y1 u2 y2 u1 z1 u1 z1 u2 z2 0 0 0 x2 y2 z2 1 u 2 x2 u2 y2 u2 z2 u1 a u1 b u2 c 0 u 2 l • Linear equation • Overconstrained (more equations than unknowns) • Underconstrained (rank deficient matrix – any multiple of a solution, including 0, is Camera Calibration – Example 1 • Standard linear least squares methods for Ax=0 will give the solution x=0 • Instead, look for a solution with |x|= 1 • That is, minimize |Ax|2 subject to |x|2=1 Camera Calibration – Example 1 • Minimize |Ax|2 subject to |x|2=1 • |Ax|2 = (Ax)T(Ax) = (xTAT)(Ax) = xT(ATA)x • Expand x in terms of eigenvectors of ATA: x = m1e1+ m2e2+… xT(ATA)x = l1m12+l2m22+… |x|2 = m12+m22+… Camera Calibration – Example 1 • To minimize l1m12+l2m22+… subject to m12+m22+… = 1 set mmin= 1 and all other mi=0 • Thus, least squares solution is minimum (non-zero) eigenvalue of ATA Camera Calibration – Example 2 • Incorporating additional constraints into camera model – No shear, no scale (rigid-body motion) – Square pixels – etc. • These impose nonlinear constraints on camera parameters Camera Calibration – Example 2 • Option 1: solve for general perspective model, then find closest solution that satisfies constraints • Option 2: nonlinear least squares – Usually “gradient descent” techniques – Common implementations available (e.g. Matlab optimization toolbox) Camera Calibration – Example 3 • Incorporating radial distortion • Option 1: – Find distortion first (straight lines in calibration target) – Warp image to eliminate distortion – Run (simpler) perspective calibration • Option 2: nonlinear least squares Camera Calibration – Example 4 • What if 3D points are not known? • Structure from motion problem! • As we saw last time, can often be solved since # of knowns > # of unknowns Multi-Camera Geometry • Epipolar geometry – relationship between observed positions of points in multiple cameras • Assume: – 2 cameras – Known intrinsics and extrinsics Epipolar Geometry P p1 C1 p2 C2 Epipolar Geometry P l2 p1 C1 p2 C2 Epipolar Geometry P Epipolar line l2 p1 p2 C1 C2 Epipoles Epipolar Geometry • Goal: derive equation for l2 • Observation: P, C1, C2 determine a plane P l2 p1 C1 p2 C2 Epipolar Geometry • Work in coordinate frame of C1 • Normal of plane is T Rp2, where T is relative translation, R is relative rotation P l2 p1 C1 p2 C2 Epipolar Geometry • p1 is perpendicular to this normal: p1 (T Rp2) = 0 P l2 p1 C1 p2 C2 Epipolar Geometry • Write cross product as matrix multiplication T x T* x , 0 * T Tz T y P Tz 0 Tx l2 p1 C1 p2 C2 Ty Tx 0 Epipolar Geometry • p1 T* R p2 = 0 p1T E p2 = 0 • E is the essential matrix P l2 p1 C1 p2 C2 Essential Matrix • E depends only on camera geometry • Given E, can derive equation for line l2 P l2 p1 C1 p2 C2 Fundamental Matrix • Can define fundamental matrix F analogously, operating on pixel coordinates instead of camera coordinates u1T F u2 = 0 • Advantage: can sometimes estimate F without knowing camera calibration