Cameras, lenses, and calibration
• Camera models
• Projection equations

Images are projections of the 3-D world onto a 2-D plane…

Without a pinhole, light rays from many different parts of the scene strike the same point on the paper. A pinhole camera only allows rays from one point in the scene to strike each point of the paper. (Forsyth & Ponce)

Pinhole camera geometry: distant objects are smaller

Perspective projection
[Figure: camera and world, showing the focal length f, a scene point at height y and depth z, and its image height y'.]
Cartesian coordinates: we have, by similar triangles, that
$$(x, y, z) \rightarrow (f x/z,\; f y/z,\; -f)$$
Ignore the third coordinate, and get
$$(x, y, z) \rightarrow \left( f\frac{x}{z},\; f\frac{y}{z} \right)$$

Geometric properties of projection
• Points go to points
• Lines go to lines
• Planes go to the whole image or to half-planes
• Polygons go to polygons
• Degenerate cases
– a line through the focal point projects to a point
– a plane through the focal point projects to a line

Line in 3-space
$$x(t) = x_0 + at, \qquad y(t) = y_0 + bt, \qquad z(t) = z_0 + ct$$
Perspective projection of that line
$$x'(t) = \frac{f x}{z} = \frac{f (x_0 + at)}{z_0 + ct}, \qquad y'(t) = \frac{f y}{z} = \frac{f (y_0 + bt)}{z_0 + ct}$$
In the limit as $t \rightarrow \pm\infty$ we have (for $c \neq 0$):
$$x'(t) \rightarrow \frac{f a}{c}, \qquad y'(t) \rightarrow \frac{f b}{c}$$
This tells us that any set of parallel lines (same a, b, c parameters) projects to the same point, called the vanishing point.
http://www.ider.herts.ac.uk/school/courseware/graphics/two_point_perspective.html

Vanishing points
• Each set of parallel lines (= direction) meets at a different point
– The vanishing point for this direction
• Sets of parallel lines on the same plane lead to collinear vanishing points
– The line is called the horizon for that plane

What if you photograph a brick wall head-on?
[Figure: brick wall with image axes x and y.]
Brick wall line in 3-space
$$x(t) = x_0 + at, \qquad y(t) = y_0, \qquad z(t) = z_0$$
Perspective projection of that line
$$x'(t) = \frac{f (x_0 + at)}{z_0}, \qquad y'(t) = \frac{f y_0}{z_0}$$
All bricks have the same $z_0$. Those in the same row have the same $y_0$. Thus a brick wall, photographed head-on, is rendered as a set of parallel lines in the image plane.
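The vanishing-point limit above can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names, the choice f = 1, and the sample lines are all illustrative assumptions. It projects points far along two parallel 3-D lines and compares them with (fa/c, fb/c).

```python
def project(point, f=1.0):
    """Perspective projection (x, y, z) -> (f*x/z, f*y/z)."""
    x, y, z = point
    if z == 0:
        raise ValueError("a point with z = 0 has no finite image")
    return (f * x / z, f * y / z)


def point_on_line(origin, direction, t):
    """Point (x0 + a*t, y0 + b*t, z0 + c*t) on a 3-D line."""
    return tuple(o + d * t for o, d in zip(origin, direction))


def vanishing_point(direction, f=1.0):
    """Limit of the projected line as t grows: (f*a/c, f*b/c), valid for c != 0."""
    a, b, c = direction
    if c == 0:
        raise ValueError("lines parallel to the image plane have no finite vanishing point")
    return (f * a / c, f * b / c)


# Two parallel lines: same direction (a, b, c), different starting points.
# All numbers here are made up for illustration.
direction = (1.0, 2.0, 4.0)
far_a = project(point_on_line((0.0, 0.0, 1.0), direction, 1e9))
far_b = project(point_on_line((5.0, -3.0, 2.0), direction, 1e9))
vp = vanishing_point(direction)
```

Both far_a and far_b land very close to vp, regardless of where each line starts. Setting c = 0 (the brick-wall case) raises an error instead, since those lines stay parallel in the image.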
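Other projection models: Orthographic projection
$$(x, y, z) \rightarrow (x, y)$$

Other projection models: Weak perspective
• Issue
– perspective effects, but not over the scale of individual objects
– collect points into a group at about the same depth $z_0$, then divide each point by the depth of its group
– Adv: easy
– Disadv: only approximate
$$(x, y, z) \rightarrow \left( \frac{f x}{z_0},\; \frac{f y}{z_0} \right)$$

Three camera projections (3-d point → 2-d image position)
(1) Perspective: $(x, y, z) \rightarrow \left( \frac{f x}{z},\; \frac{f y}{z} \right)$
(2) Weak perspective: $(x, y, z) \rightarrow \left( \frac{f x}{z_0},\; \frac{f y}{z_0} \right)$
(3) Orthographic: $(x, y, z) \rightarrow (x, y)$

Homogeneous coordinates
• Is perspective projection a linear transformation?
• No: division by z is nonlinear
Trick: add one more coordinate:
$$(x, y) \Rightarrow \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \text{ (homogeneous image coordinates)}, \qquad (x, y, z) \Rightarrow \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \text{ (homogeneous scene coordinates)}$$
Converting from homogeneous coordinates:
$$\begin{bmatrix} x \\ y \\ w \end{bmatrix} \Rightarrow (x/w,\; y/w), \qquad \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} \Rightarrow (x/w,\; y/w,\; z/w)$$
Slide by Steve Seitz

Perspective Projection
• Projection is a matrix multiply using homogeneous coordinates:
$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1/f & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = \begin{bmatrix} x \\ y \\ z/f \end{bmatrix} \Rightarrow \left( f\frac{x}{z},\; f\frac{y}{z} \right)$$
This is known as perspective projection
• The matrix is the projection matrix
Slide by Steve Seitz

Perspective Projection
How does scaling the projection matrix change the transformation?
$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1/f & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = \begin{bmatrix} x \\ y \\ z/f \end{bmatrix} \Rightarrow \left( f\frac{x}{z},\; f\frac{y}{z} \right)$$
$$\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = \begin{bmatrix} f x \\ f y \\ z \end{bmatrix} \Rightarrow \left( f\frac{x}{z},\; f\frac{y}{z} \right)$$
Both matrices give the same image point: multiplying the projection matrix by a nonzero constant scales all homogeneous coordinates equally, and the scale cancels in the division.
Slide by Steve Seitz

Orthographic Projection
Special case of perspective projection
• Orthography is an approximate model for long focal length (telephoto) lenses and objects whose depth is shallow relative to their distance to the camera
[Figure: image plane and world.]
• Also called “parallel projection”: $(x, y, z) \rightarrow (x, y)$
• What’s the projection matrix? It drops z and keeps the homogeneous coordinate:
$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \Rightarrow (x, y)$$
Slide by Steve Seitz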
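The three projections can be compared as 3×4 matrices acting on homogeneous scene coordinates. The sketch below is illustrative only: the helper names are hypothetical, the orthographic matrix is the standard form that drops z, and the weak-perspective matrix is one way (an assumption of this sketch, not taken from the slides) to encode division by a fixed group depth z0.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def from_homogeneous(vector):
    """Divide by the last coordinate and drop it."""
    w = vector[-1]
    if w == 0:
        raise ValueError("point at infinity has no Cartesian coordinates")
    return tuple(c / w for c in vector[:-1])


def perspective(f):
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1 / f, 0]]


def perspective_scaled(f):
    # The same matrix multiplied through by f.
    return [[f, 0, 0, 0], [0, f, 0, 0], [0, 0, 1, 0]]


def weak_perspective(f, z0):
    # Assumed encoding: the last row ignores z and always yields z0.
    return [[f, 0, 0, 0], [0, f, 0, 0], [0, 0, 0, z0]]


ORTHOGRAPHIC = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

# Illustrative values: focal length 2, scene point (3, 4, 8), group depth 10.
f = 2.0
scene_point = [3.0, 4.0, 8.0, 1.0]
persp = from_homogeneous(matvec(perspective(f), scene_point))
persp_scaled = from_homogeneous(matvec(perspective_scaled(f), scene_point))
weak = from_homogeneous(matvec(weak_perspective(f, 10.0), scene_point))
ortho = from_homogeneous(matvec(ORTHOGRAPHIC, scene_point))
```

Here persp and persp_scaled agree, which is the scaling question from the slide answered numerically. weak differs from persp only because it divides by z0 = 10 rather than the point's own depth z = 8.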
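Homogeneous coordinates
2D points:
$$p = \begin{bmatrix} x \\ y \end{bmatrix} \rightarrow p' = \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \qquad p' = \begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} \rightarrow p = \begin{bmatrix} x'/w' \\ y'/w' \end{bmatrix}$$
2D lines:
$$ax + by + c = 0, \qquad \begin{bmatrix} a & b & c \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = 0$$
$$l = \begin{bmatrix} a & b & c \end{bmatrix} \Rightarrow \begin{bmatrix} n_x & n_y & d \end{bmatrix}$$
[Figure: a line with normal $(n_x, n_y)$ and offset d.]

Homogeneous coordinates
Intersection between two lines:
$$a_1 x + b_1 y + c_1 = 0, \qquad a_2 x + b_2 y + c_2 = 0$$
$$l_1 = \begin{bmatrix} a_1 & b_1 & c_1 \end{bmatrix}, \qquad l_2 = \begin{bmatrix} a_2 & b_2 & c_2 \end{bmatrix}, \qquad x_{12} = l_1 \times l_2$$

Homogeneous coordinates
Line joining two points:
$$p_1 = \begin{bmatrix} x_1 & y_1 & 1 \end{bmatrix}, \qquad p_2 = \begin{bmatrix} x_2 & y_2 & 1 \end{bmatrix}, \qquad l = p_1 \times p_2$$
where l holds the coefficients of $ax + by + c = 0$.

2D Transformations
Example: translation
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
Now we can chain transformations.

Translation and rotation, written in each set of coordinates
Non-homogeneous coordinates:
$${}^{B}\vec{p} = {}^{B}_{A}R\; {}^{A}\vec{p} + {}^{B}_{A}\vec{t}$$
Homogeneous coordinates:
$${}^{B}\vec{p} = {}^{B}_{A}C\; {}^{A}\vec{p}, \qquad \text{where } {}^{B}_{A}C = \begin{bmatrix} {}^{B}_{A}R & {}^{B}_{A}\vec{t} \\ 0\;\;0\;\;0 & 1 \end{bmatrix}$$

Translation and rotation “as described in the coordinates of frame B”
[Figure: frame A with axes $\hat{i}_A$, $\hat{j}_A$, $\hat{k}_A$, the point $\vec{p}$, and the translation ${}^{B}_{A}\vec{t}$.]
Let’s write ${}^{B}\vec{p} = {}^{B}_{A}R\; {}^{A}\vec{p} + {}^{B}_{A}\vec{t}$ as a single matrix equation:
$$\begin{bmatrix} {}^{B}p_x \\ {}^{B}p_y \\ {}^{B}p_z \\ 1 \end{bmatrix} = \begin{bmatrix} {}^{B}_{A}R & {}^{B}_{A}\vec{t} \\ 0\;\;0\;\;0 & 1 \end{bmatrix} \begin{bmatrix} {}^{A}p_x \\ {}^{A}p_y \\ {}^{A}p_z \\ 1 \end{bmatrix}$$

Camera calibration
Use the camera to tell you things about the world:
– Relationship between coordinates in the world and coordinates in the image: geometric camera calibration, see Szeliski, sections 5.2 and 5.3, for references
– (Relationship between intensities in the world and intensities in the image: photometric image formation, see Szeliski, sect. 2.2.)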
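The homogeneous-coordinate operations from the slides above (line through two points, intersection of two lines, and chaining 2D transformations) can be sketched in a few lines. Everything here is an illustrative assumption rather than code from the slides: the helper names, the sample points, and the choice to return None for parallel lines.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def line_through(p1, p2):
    """Homogeneous line l = p1 x p2 joining two 2-D points."""
    return cross((p1[0], p1[1], 1.0), (p2[0], p2[1], 1.0))


def intersection(l1, l2):
    """Intersection x12 = l1 x l2, or None if the lines are parallel."""
    x, y, w = cross(l1, l2)
    if w == 0:
        return None  # the lines meet at a point at infinity
    return (x / w, y / w)


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def transform(m, p):
    """Apply a 3x3 homogeneous transform to a 2-D point."""
    h = (p[0], p[1], 1.0)
    x, y, w = (sum(m[i][k] * h[k] for k in range(3)) for i in range(3))
    return (x / w, y / w)


diagonal = line_through((0.0, 0.0), (1.0, 1.0))       # y = x
anti_diagonal = line_through((0.0, 2.0), (2.0, 0.0))  # x + y = 2
meeting_point = intersection(diagonal, anti_diagonal)
shifted = line_through((0.0, 1.0), (1.0, 2.0))        # y = x + 1
no_meeting = intersection(diagonal, shifted)

# Chain: rotate by 90 degrees first, then translate by (1, 0).
rotate_then_translate = matmul(translation(1.0, 0.0), rotation(math.pi / 2))
moved = transform(rotate_then_translate, (1.0, 0.0))
```

The chained matrix is written right to left: the rotation sits next to the point and is applied first. Swapping the two factors would translate first and give a different result.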
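Intrinsic parameters: from idealized world coordinates to pixel values (Forsyth & Ponce)

Perspective projection
$$u = f\frac{x}{z}, \qquad v = f\frac{y}{z}$$

Intrinsic parameters
But “pixels” are in some arbitrary spatial units
$$u = \alpha\frac{x}{z}, \qquad v = \alpha\frac{y}{z}$$

Intrinsic parameters
Maybe pixels are not square
$$u = \alpha\frac{x}{z}, \qquad v = \beta\frac{y}{z}$$

Intrinsic parameters
We don’t know the origin of our camera pixel coordinates
$$u = \alpha\frac{x}{z} + u_0, \qquad v = \beta\frac{y}{z} + v_0$$

Intrinsic parameters
There may be skew between the camera pixel axes. With $\theta$ the angle between the axes and $(u', v')$ the coordinates along the skewed axes:
$$v' \sin(\theta) = v, \qquad u' = u - \cos(\theta)\, v' = u - \cot(\theta)\, v$$
$$u = \alpha\frac{x}{z} - \alpha\cot(\theta)\frac{y}{z} + u_0, \qquad v = \frac{\beta}{\sin(\theta)}\frac{y}{z} + v_0$$

Intrinsic parameters, homogeneous coordinates
Using homogeneous coordinates, we can write this as:
$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} \simeq \begin{bmatrix} \alpha & -\alpha\cot(\theta) & u_0 & 0 \\ 0 & \dfrac{\beta}{\sin(\theta)} & v_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$
(equality up to scale: the right-hand side is $(zu, zv, z)^T$, the same homogeneous point as $(u, v, 1)^T$), or:
$$\vec{p} = K\; {}^{C}\vec{p}$$
with $\vec{p}$ in pixels and ${}^{C}\vec{p}$ in camera-based coordinates.

Extrinsic parameters: translation and rotation of camera frame
Non-homogeneous coordinates:
$${}^{C}\vec{p} = {}^{C}_{W}R\; {}^{W}\vec{p} + {}^{C}_{W}\vec{t}$$
Homogeneous coordinates:
$${}^{C}\vec{p} = \begin{bmatrix} {}^{C}_{W}R & {}^{C}_{W}\vec{t} \\ 0\;\;0\;\;0 & 1 \end{bmatrix} {}^{W}\vec{p}$$

Combining extrinsic and intrinsic calibration parameters, in homogeneous coordinates (Forsyth & Ponce)
Intrinsic (camera coordinates to pixels):
$$\vec{p} = K\; {}^{C}\vec{p}$$
Extrinsic (world coordinates to camera coordinates):
$${}^{C}\vec{p} = \begin{bmatrix} {}^{C}_{W}R & {}^{C}_{W}\vec{t} \\ 0\;\;0\;\;0 & 1 \end{bmatrix} {}^{W}\vec{p}$$
Combined:
$$\vec{p} = K \begin{bmatrix} {}^{C}_{W}R & {}^{C}_{W}\vec{t} \\ 0\;\;0\;\;0 & 1 \end{bmatrix} {}^{W}\vec{p} = M\; {}^{W}\vec{p}$$

Other ways to write the same equation
With $\vec{p}$ in pixel coordinates and ${}^{W}\vec{p}$ in world coordinates, $\vec{p} = M\; {}^{W}\vec{p}$ reads, row by row:
$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} \simeq \begin{bmatrix} m_1^T \\ m_2^T \\ m_3^T \end{bmatrix} \begin{bmatrix} {}^{W}p_x \\ {}^{W}p_y \\ {}^{W}p_z \\ 1 \end{bmatrix}$$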
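Conversion back from homogeneous coordinates, with $P = ({}^{W}p_x,\ {}^{W}p_y,\ {}^{W}p_z,\ 1)^T$ the homogeneous world point and $m_i^T$ the rows of M, leads to:
$$u = \frac{m_1 \cdot P}{m_3 \cdot P}, \qquad v = \frac{m_2 \cdot P}{m_3 \cdot P}$$

Camera parameters
A camera is described by several parameters
• Translation T of the optical center from the origin of world coords
• Rotation R of the image plane
• Focal length f, principal point $(x'_c, y'_c)$, pixel size $(s_x, s_y)$
• T and R (blue on the original slide) are called “extrinsics”; f, $(x'_c, y'_c)$ and $(s_x, s_y)$ (red) are “intrinsics”

Projection equation
$$\mathbf{x} = \begin{bmatrix} s\,x \\ s\,y \\ s \end{bmatrix} = \begin{bmatrix} * & * & * & * \\ * & * & * & * \\ * & * & * & * \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = \Pi \mathbf{X}$$
Here s is the homogeneous scale factor, not the pixel size.
• The projection matrix models the cumulative effect of all parameters
• Useful to decompose into a series of operations:
$$\Pi = \underbrace{\begin{bmatrix} f s_x & 0 & x'_c \\ 0 & f s_y & y'_c \\ 0 & 0 & 1 \end{bmatrix}}_{\text{intrinsics}} \underbrace{\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}}_{\text{projection}} \underbrace{\begin{bmatrix} R_{3\times 3} & 0_{3\times 1} \\ 0_{1\times 3} & 1 \end{bmatrix}}_{\text{rotation}} \underbrace{\begin{bmatrix} I_{3\times 3} & T_{3\times 1} \\ 0_{1\times 3} & 1 \end{bmatrix}}_{\text{translation}}$$
where $I_{3\times 3}$ is the identity matrix.
• The definitions of these parameters are not completely standardized
– especially the intrinsics, which vary from one book to another

Projection matrices for orthographic and scaled orthographic projections
Orthographic projection (5 dof):
$$P = \begin{bmatrix} r_1^T & t_1 \\ r_2^T & t_2 \\ 0^T & 1 \end{bmatrix}$$
Scaled orthographic projection (6 dof):
$$P = \begin{bmatrix} r_1^T & t_1 \\ r_2^T & t_2 \\ 0^T & 1/k \end{bmatrix}$$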
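The full chain from world coordinates to pixels can be sketched end to end: build a 3×4 intrinsic matrix K, a 4×4 extrinsic matrix, multiply them into M, and convert back from homogeneous coordinates. This is a hedged illustration, not a calibration routine. The names alpha and beta (pixel magnifications along the two axes) and theta (skew angle) follow Forsyth & Ponce-style notation and are assumptions of this sketch, as are all numeric values.

```python
import math


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def intrinsic_matrix(alpha, beta, theta, u0, v0):
    """3x4 matrix K mapping camera coordinates to homogeneous pixels."""
    return [[alpha, -alpha / math.tan(theta), u0, 0.0],
            [0.0, beta / math.sin(theta), v0, 0.0],
            [0.0, 0.0, 1.0, 0.0]]


def extrinsic_matrix(rotation, translation):
    """4x4 matrix [R t; 0 0 0 1] mapping world to camera coordinates."""
    rows = [list(rotation[i]) + [translation[i]] for i in range(3)]
    return rows + [[0.0, 0.0, 0.0, 1.0]]


def project(m, world_point):
    """u = m1.P / m3.P and v = m2.P / m3.P for P = (X, Y, Z, 1)."""
    p = list(world_point) + [1.0]
    m1p, m2p, m3p = (sum(a * b for a, b in zip(row, p)) for row in m)
    if m3p == 0:
        raise ValueError("point lies in the plane through the optical center")
    return (m1p / m3p, m2p / m3p)


# Illustrative camera: square pixels, no skew (theta = 90 degrees),
# principal point (320, 240), camera axes aligned with the world axes,
# world origin 5 units in front of the camera.
K = intrinsic_matrix(alpha=800.0, beta=800.0, theta=math.pi / 2,
                     u0=320.0, v0=240.0)
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
E = extrinsic_matrix(identity, [0.0, 0.0, 5.0])
M = matmul(K, E)
u, v = project(M, (1.0, 0.5, 5.0))
```

The world point (1, 0.5, 5) sits at depth 10 in the camera frame, so the sketch gives u close to 800·1/10 + 320 = 400 and v close to 800·0.5/10 + 240 = 280. Because only the ratios m1·P/m3·P and m2·P/m3·P matter, multiplying M by any nonzero constant leaves (u, v) unchanged.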