Single-view geometry Odilon Redon, Cyclops, 1914 Geometric vision • Goal: Recovery of 3D structure • What cues in the image allow us to do this? Visual cues Shading Merle Norman Cosmetics, Los Angeles Slide credit: S. Seitz Visual cues Focus From The Art of Photography, Canon Slide credit: S. Seitz Visual cues Perspective Slide credit: S. Seitz Visual cues Motion Slide credit: S. Seitz Our goal: Recovery of 3D structure • We will focus on perspective and motion • We need multi-view geometry because recovery of structure from one image is inherently ambiguous X? x X? X? Our goal: Recovery of 3D structure • We will focus on perspective and motion • We need multi-view geometry because recovery of structure from one image is inherently ambiguous Our goal: Recovery of 3D structure • We will focus on perspective and motion • We need multi-view geometry because recovery of structure from one image is inherently ambiguous Recall: Pinhole camera model ( X ,Y , Z ) ( f X / Z , f Y / Z ) X f X f Y Z fY Z 1 f X 0 Y 0 Z 1 0 1 x PX Pinhole camera model f X f fY Z x PX f X 0 1 Y 1 0 Z 1 1 0 1 P diag ( f , f ,1)I | 0 Camera coordinate system • Principal axis: line from the camera center perpendicular to the image plane • Normalized (camera) coordinate system: camera center is at the origin and the principal axis is the z-axis • Principal point (p): point where principal axis intersects the image plane (origin of normalized coordinate system) Principal point offset principal point: ( px , p y ) • Camera coordinate system: origin is at the prinicipal point • Image coordinate system: origin is in the corner Principal point offset principal point: ( px , p y ) ( X , Y , Z ) ( f X / Z px , f Y / Z p y ) X f X Z px f Y Z f Y Z py Z 1 f px py 1 X 0 Y 0 Z 0 1 Principal point offset principal point: f X Zp x f f Y Zp y Z f K f f ( px , p y ) X p x 1 0 Y py 1 0 Z 1 1 0 1 px p y calibration matrix 1 P KI | 0 Pixel coordinates 1 1 Pixel size: mx m y mx pixels per meter in horizontal direction, my pixels per meter in vertical direction mx K my pixels/m f 1 f m px x p y y 1 pixels x y 1 Camera rotation and translation • In general, the camera coordinate frame will be related to the world coordinate frame by a rotation and a translation ~ ~ ~ Xcam R X - C coords. of point in camera frame coords. of camera center in world frame coords. of a point in world frame (nonhomogeneous) Camera rotation and translation In non-homogeneous coordinates: ~ ~ ~ Xcam R X - C X cam x KI | 0Xcam R 0 ~ ~ RC X R 1 1 0 ~ K R | RC X ~ RC X 1 P KR | t , ~ t RC Note: C is the null space of the camera projection matrix (PC=0) Camera parameters • Intrinsic parameters • • • • • Principal point coordinates mx f Focal length K my Pixel magnification factors 1 Skew (non-rectangular pixels) Radial distortion f px x p y y 1 x y 1 Camera parameters • Intrinsic parameters • • • • • Principal point coordinates Focal length Pixel magnification factors Skew (non-rectangular pixels) Radial distortion • Extrinsic parameters • Rotation and translation relative to world coordinate system Camera calibration • Given n points with known 3D coordinates Xi and known image projections xi, estimate the camera parameters Xi xi P? Camera calibration T xi P1 yi P2T X i 1 P3T x i PXi 0 T X i yi XTi XTi 0 T xi X i x i PXi 0 yi XTi P1 T xi X i P2 0 0 P3 Two linearly independent equations Camera calibration 0T T X1 T 0 XT n T 1 T X 0 XTn 0T y1X x1X P1 P2 0 T yn X n P3 xn XTn T 1 T 1 Ap 0 • P has 11 degrees of freedom (12 parameters, but scale is arbitrary) • One 2D/3D correspondence gives us two linearly independent equations • Homogeneous least squares • 6 correspondences needed for a minimal solution Camera calibration 0T T X1 T 0 XT n T 1 T X 0 XTn 0T y1X x1X P1 P2 0 T yn X n P3 xn XTn T 1 T 1 Ap 0 • Note: for coplanar points that satisfy ΠTX=0, we will get degenerate solutions (Π,0,0), (0,Π,0), or (0,0,Π) Camera calibration • Once we’ve recovered the numerical form of the camera matrix, we still have to figure out the intrinsic and extrinsic parameters • This is a matrix decomposition problem, not an estimation problem (see F&P sec. 3.2, 3.3) Two-view geometry • Scene geometry (structure): Given projections of the same 3D point in two or more images, how do we compute the 3D coordinates of that point? • Correspondence (stereo matching): Given a point in just one image, how does it constrain the position of the corresponding point in a second image? • Camera geometry (motion): Given a set of corresponding points in two images, what are the cameras for the two views? Triangulation • Given projections of a 3D point in two or more images (with known camera matrices), find the coordinates of the point X? x1 O1 x2 O2 Triangulation • We want to intersect the two visual rays corresponding to x1 and x2, but because of noise and numerical errors, they don’t meet exactly R1 R2 X? x1 O1 x2 O2 Triangulation: Geometric approach • Find shortest segment connecting the two viewing rays and let X be the midpoint of that segment X x1 O1 x2 O2 Triangulation: Linear approach 1 x1 P1X 2 x 2 P2 X x1 P1X 0 [x 1 ]P1X 0 x 2 P2 X 0 [x 2 ]P2 X 0 Cross product as matrix multiplication: 0 a b az a y az 0 ax a y bx a x by [a ]b 0 bz Triangulation: Linear approach 1 x1 P1X 2 x 2 P2 X x1 P1X 0 [x 1 ]P1X 0 x 2 P2 X 0 [x 2 ]P2 X 0 Two independent equations each in terms of three unknown entries of X Triangulation: Nonlinear approach Find X that minimizes d ( x1 , P1 X ) d ( x2 , P2 X ) 2 2 X? x’1 x1 O1 x’2 x2 O2