M I P Visual-Geometric 3-D Scene Reconstruction from Uncalibrated Image Sequences Multimedia Information Processing Reinhard Koch and Jan-Michael Frahm Tutorial at DAGM 2001, München Multimedia Information Processing Group Christian-Albrechts-University of Kiel Germany {rk | jmf}@mip.informatik.uni-kiel.de www.mip.informatik.uni-kiel.de M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 1 Scene Reconstruction Method 1 • use ruler to measure scene geometry 1 3 ez 2 2 3 ey 1 1 + precize, absolute geometry 2 - tedious manual task - difficult to incorporate visual appearance M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 3 ex 2 Scene Reconstruction Method 2 • measure scene with camera, using image projections! ez ey + better to automate + incorporate images for appearance - how can it be done ? M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction ex 3 Goal of this tutorial Computer vision enables us to reconstruct highly naturalistic computer models of 3D enviroments from camera images We may need to extract the camera geometry (calibration), scene structure (surface geometry) as well as the visual appearance (color and texture) of the scene This tutorial will – – – – M I P CAU Multimedia Information Processing Kiel introduce the basic mathematical tools (projective geometry) derive models for cameras, image mappings, 3D structure give examples for image-based panoramic modeling explain geometric and visual models of 3D scene reconstruction DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 4 Outline of Tutorial 1. Basics on affine and projective geometry 2. Image mosaicing and panoramic reconstruction Coffee break 3. 3-D scene reconstruction from multiple views 4. Plenoptic modeling Demonstrations on mosaicing and 3-D modeling M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 6 Part 1: Basics on affine and projective geometry • Affine geometry – affine points and homogeneous coordinates – affine transformations • Projective geometry – projective points and projective coordinates – projective transformations • Pinhole camera model – Projection and sensor model – camera pose and calibration matrix • Image mapping with planar homographies M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 7 Affine coordinates M ei: affine basis vectors o: coordinate origin e3 G M a3 Vector relative to o: G G G G M = a1e1 + a2e2 + a3e3 e2 a2 o e1 a1 Point in affine coordinates: G G G G G G M = M + o = a1e1 + a2e2 + a3e3 + o Vector: relative to some origin Point: absolute coordinates M I P CAU Multimedia Information Processing DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction Kiel 8 Homogeneous coordinates M Unified notation: include origin in affine basis e3 a3 e2 a2 o a1 Affine basis matrix M I P CAU Multimedia Information Processing Kiel e1 G G G e1 e2 e3 M= 0 0 0 Homogeneous Coordinates of M a1 G o a2 1 a3 1 DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 9 Euclidean coordinates Euclidean coordinates: affine basis is orthonormal 0 G G e3 = ez = 0 1 0 G o = 0 0 0 G G e2 = ey = 1 0 1 G G e1 = ex = 0 0 G G G e1 e2 e3 Euclidean basis: 0 0 0 M I P CAU Multimedia Information Processing Kiel 1 for i = k eiT ek = 0 for i ≠ k 1 G o 0 = 1 0 0 0 0 0 1 0 0 0 1 0 0 0 1 DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 10 Metric coordinates Metric coordinates: affine basis is orthogonal 0 G G e3 = ez = 0 s 0 G o = 0 0 0 G G e2 = ey = s 0 s G G e1 = ex = 0 0 G G G e1 e2 e3 Metric basis: 0 0 0 M I P CAU Multimedia Information Processing Kiel s for i = k , s ≠ 0 eTi ek = 0 for i ≠ k 1 G 0 o = s 0 1 0 DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 0 0 0 1 0 0 0 1 0 0 0 1 11 Affine Transformation Transformation Taffine combines linear mapping and coordinate shift in homogeneous coordinates – Linear mapping with A3x3 matrix – coordinate shift with t3 translation vector M Taffine M’ t3 A3 x 3 M ′ = TaffineM = M 0 0 0 1 Linear mapping with A3x3: o t3 M I P CAU Multimedia Information Processing Kiel o o’ A3x3 - Euclidean rigid rotation - metric isotropic scaling - affine skew and anisotropic scaling - preserves parallelism and length ratio DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 12 Properties of affine transformation t3 A3 x 3 ′ = = M TaffineM M 0 0 0 1 Taffine a11 a12 a a = 21 22 a31 a32 0 0 a13 a23 a33 0 tx ty tz 1 • Parallelism is preserved • ratios of length, area, and volume are preserved • Transformations can be concatenated: if M1 = T1M and M 2 = T2M1 ⇒ M 2 = T2T1M = T21M M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 13 Special transformation: Rotation TRotation R3 x 3 = 0 0 0 0 r11 r12 0 r21 r22 = 0 r31 r32 1 0 0 r13 r23 r33 0 0 0 0 1 • Rigid transformation: Angles and length preserved • R is orthonormal matrix defined by three angles around three coordinate axes ez ey α ex M I P CAU Multimedia Information Processing cos α Rz = sinα 0 − sinα 0 cosα 0 0 1 Rotation with angle α around ez DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction Kiel 15 Special transformation: Rotation • Rotation around the coordinate axes can be concatenated: ez ey α β γ ex R = Rz R y R x Inverse of rotation matrix: −1 R =R M I P CAU Multimedia Information Processing Kiel T 0 1 Rx = 0 cos γ 0 sin γ 0 − sin γ cos γ cos β Ry = 0 − sin β 0 sin β 1 0 0 cos β cos α Rz = sinα 0 − sinα 0 cosα 0 0 1 DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 16 Projective geometry • Projective space (3 is space of rays emerging from O – view point O forms projection center for all rays – rays v emerge from viewpoint into scene – ray g is called projective point, defined as scaled v: g=λv w (3 x′ v = y ′ 1 v y O M I P CAU Multimedia Information Processing x g (λ ) = y = λv , λ ∈ ℜ ≠ 0 w x DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction Kiel 17 Projective and homogeneous points • Given: Plane Π in *2 embedded in (3 at coordinates w=1 – viewing ray g intersects plane at v (homogeneous coordinates) – all points on ray g project onto the same homogeneous point v – projection of g onto Π is defined by scaling v=g/λ = g/w w Π(*2) w=1 x′ x g (λ ) = λv = λ y ′ = y 1 w (3 v y O M I P CAU Multimedia Information Processing Kiel x DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction x w x′ y 1 v = y ′ = g = w w 1 1 18 Finite and infinite points • All rays g that are not parallel to Π intersect at an affine point v on Π. • The ray g(w=0) does not intersect Π. Hence v∞ is not an affine point but a direction. Directions have the coordinates (x,y,z,0)T • Projective space combines affine space with infinite points (directions). g (3 Π(*2) w=1 g1 g2 v v1 v2 O M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction v∞ x g∞ = y 0 19 Relation between affine and projective points • Affine space is embedded into projective space (x,y,z,w)T as hyperplane Π =(x,y,z,1)T. • Homogeneous coordinates map a projective point onto the affine hyperplane by scaling to w=1. • Projective points with w ≠0 are projected onto Π by scaling with 1/w. • Projective points with w=0 form the infinite hyperplane Π∞ = (x,y,z,0)T. They are not part of Π. M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 20 Affine and projective transformations • Affine transformation leaves infinite points at infinity X ′ a11 a12 ′ Y a a ⇒ = 21 22 Z ′ a31 a32 0 0 0 M∞' = TaffineM∞ a13 a23 a33 0 tx X t y Y tz Z 1 0 • Projective transformations move infinite points into finite affine space M ' = Tprojective M∞ tx X a11 a12 a13 xp X ′ ′ a a a t y Y 21 22 23 y p Y ⇒ =λ =λ a31 a32 a33 zp Z′ tz Z 1 w ′ w 41 w 42 w 43 w 44 0 Example: Parallel lines intersect at the horizon (line of infinite points). We can see this intersection due to perspective projection! M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 21 Pinhole Camera (Camera obscura) Interior of camera obscura (Sunday Magazine, 1838) Camera obscura (France, 1830) M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 22 Pinhole camera model aperture View direction object image M I P CAU Multimedia Information Processing DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction Kiel 23 Pinhole camera model Im es g a r so n e aperture r nte ) e C , cy (c x View direction object Focal length f image M I P CAU Multimedia Information Processing Kiel lens DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 24 Perspective projection • Perspective projection in (3 models pinhole camera: – scene geometry is affine *3 space with coordinates M=(X,Y,Z,1)T – camera focal point in O=(0,0,0,1)T, camera viewing direction along Z – image plane (x,y) in Π(*2) aligned with (X,Y) at Z= Z0 – Scene point M projects onto point Mp on plane surface X Y M= Z 1 Z (Optical axis) (3 Z0 Π(*2) y x Image plane Mp X x p Z0ZX Y y Z0Y Z p 0 M p = = ZZ0Z = Z0 Z Z Z Z 1 1 Z0 Y O Camera center M I P CAU Multimedia Information Processing X DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction Kiel 25 Projective Transformation • Projective Transformation maps M onto Mp in *3 space ρ Mp Y O X M xp X 1 0 y Y ρ M p = TpM ⇒ ρ p = = Z0 Z 0 0 1 W ρ= 0 X 0 Y 1 0 Z 1 0 z0 1 0 0 1 0 0 0 Z = projective scale factor Z0 • Projective Transformation linearizes projection M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 26 Perspective Projection Dimension reduction from *3 into *2 by projection onto Π(*2) M xp 1 1 x p 0 y y p = 0 Z0p 0 1 0 1 0 mp Π (*2) Y O 0 0 1 1 0 0 0 0 0 0 0 1 0 0 0 xp 0 0 y p 0 ⇒ mp = DpM p 0 Z0 1 1 1 X Perspective projection P0 from *3 onto *2: x 1 0 0 ρ mp = DpTpM = P0M ⇒ ρ y = 0 1 0 1 0 0 z10 M I P CAU Multimedia Information Processing Kiel X 0 Y Z 0 , ρ = Z Z0 0 1 DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 27 Projection in general pose P Ro ta tio n mp [R Projection center C R Tcam = T 0 ] Projection: ρ mp = PM M C 1 −1 cam Tscene = T P0 RT = T 0 −RT C 1 World coordinates M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 28 Image plane and image sensor • A sensor with picture elements (Pixel) is added onto the image plane Pixel coordinates m = (y,x)T y x Pixel scale f= (fx,fy)T Z (Optical axis) Image center c= (cx, cy)T Image sensor Focal length Z0 Y X Image-sensor mapping: m = Kmp Projection center • Pixel coordinates are related to image coordinates by affine transformation K with five parameters: fx s c x K = 0 fy c y – Image center c at optical axis 0 0 1 – distance Zp (focal length) and Pixel size scales pixel resolution f – image skew s to model angle between pixel rows and columns M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 29 Projection matrix P • Camera projection matrix P combines: – inverse affine transformation Ta-1 from general pose to origin – Perspective projection P0 to image plane at Z0 =1 – affine mapping K from image to sensor coordinates scene pose transformation: Tscene 1 0 0 0 projection: P0 = 0 1 0 0 = [I 0] 0 0 1 0 R T = T 0 −RT C 1 fx s c x sensor calibration: K = 0 fy c y 0 0 1 ⇒ ρ m = PM, P = KP0Tscene = K RT M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction −RT C 30 The planar homography H • The 2D projective transformation Hik is a planar homography – maps any point on plane i to corresponding point on plane k – defined up to scale (8 independent parameters) – defined by 4 corresponding points on the planes with not more than any 2 points collinear plane k mk Hik mi mk H ik plane i mi M I P CAU Multimedia Information Processing h1 Hik = h4 h7 h2 h5 h8 DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction Kiel h3 h6 1 31 Estimation of H from image correspondences • Hik can be estimated linearily from corresponding point pairs: – select 4 corresponding point pairs, if known noise-free – select N>4 corresponding point pairs, if correspondences are noisy – compute H such that correspondence error f is minimized Projective mapping: xk h1xi + h2 y i + h3 mk = ρk y k = Hmi = h4 xi + h5 y i + h6 w h7 xi + h8 y i + 1 Error functional f: N f = ∑ ( mk ,n − Hik mi ,n )2 ⇒ min! n =0 M I P CAU Multimedia Information Processing Kiel Image coordinate mapping: xk = h1xi + h2 y i + h3 h7 xi + h8 y i + 1 yk = h4 xi + h5 y i + h6 h7 xi + h8 y i + 1 h1 Hik = h4 h7 DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction h2 h5 h8 h3 h6 1 32 Image mapping with homographies • Homographies are 2D projective transformations H3x3 • Homographies map points between planes • 2D homographies can be used to map images between different camera views for three equivalent cases: – (a) all cameras share the same view point Ci = C, or – (b) all scene points are at (or near to) infinity, or – (c) the observed scene is planar. • The 2D homography is independent of 3D scene structure! M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 33 (a) Image mapping for single view point • Camera with fixed projection center: Ci =C • Camera rotates freely with Ri and changing calibration Ki G MC = [I X − Cx −C ]M = Y − CY Z − CZ ρ i mi = Pi M = K i RiT M ρk mk mi ρi −RiT Ci M G T T = K i Ri [I −C ]M = K i Ri MC G T T ρk mk = K k Rk [I −C ]M = K k Rk MC G ⇒ MC = Ri K i−1ρ i mi = Rk K k−1ρ k mk ρ k mk = K k Rk−1Ri K i−1ρ i mi = ρ i Hik mi Y C X • Hik is a planar projective 3x3 transformation that maps points mi on plane i to points mk on plane k M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 34 (b) Image mapping for infinite scene with M∞ • All scene points are at or near infinity: M∞ are points on Π∞ • Camera rotates freely with Ri and changing calibration Ki M∞ X − 0 −Ci ]M∞ = Y − 0 Z − 0 G M∞ = [I M∞ mk mi Y Ci X Ck G M ∞ = Ri K i−1ρ i mi = Rk K k−1ρ k mk M I P CAU Multimedia Information Processing Kiel ⇒ ρ k mk = K k Rk−1Ri K i−1ρ i mi = ρ i Hik mi DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 35 (c) Image mapping of planar scene ΠM • All scene points are at on plane ΠM • Camera is completely free in K,R,C M HiM ΠM H kM H ik mk mi Y Ci X Transfer between images i,k over ΠM: M I P CAU Multimedia Information Processing Kiel Ck −1 Hik = HiM H kM DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 36 Part 2: Mosaicing and Panoramic Images • why mosaicing? • geometries constraints for mosaicing • mosaicing • projective mainfolds • stereo mosaicing M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 37 Why mosaicing? • cameras field of view is always smaller than human field of view • large objects can‘t be captured in a single picture Solutions • devices with wide field of view - fish-eye lenses (distortions, decreases quality) - hyper- and parabolic optical devices (lower resolution) • image mosaicing M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 38 What is mosaicing? definition of mosaicing : Matching multiple images by aligning and pasting images to a wilder field of view image. Features that continue over the images must be "zipped" together, and the frame edges dissolved. mosaicing Kang et. al. (ICPR 2000) M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 39 Geometries of mosaic aquisition arbitrary scene planar scene rotating camera M I P CAU Multimedia Information Processing Kiel translating and rotating camera DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 40 Steps of mosaicing M • Image alignment: estimation of homography H for each image pair • Image cut and paste: selection of colorvalue for each mosaic pixel • Image blending: overcome of intensity differences between images I P CAU Multimedia Information Processing DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction Kiel 41 Foreward mapping to plane M pixel to direction (P0Tscene ) −1 mM −1 scene −1 m =T K N mp = M∞ mp I −1 0T K m direction to mosaic ρ mM = K mosaic P0Tmosaic M∞ I T 0 usually Tmosaic = C 0 , K mosaic = K 1 T −1 K ⇒ ρ mM = KR m H M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 42 Forward mapping to plane M I P CAU Multimedia Information Processing DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction Kiel 43 Forward mapping to sphere pixel to direction M A0 (P0Tscene ) −1 mM = M∞ mp I −1 −1 −1 m = K T N scene T K m mp 0 direction to polar X ϕ = tan Z −1 ϕ Y ϑ = tan 2 2 X +Z −1 C M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 44 Planar scene mapping pixel to direction M (P0Tscene ) −1 mM −1 scene K m =T N −1 mp = M∞ I −1 0T K m direction to mosaic mp ρ mM = K mosaic P0Tmosaic M∞ usually Tmosaic C M I P CAU Multimedia Information Processing Kiel I = T 0 0 , K mosaic = K 1 ⇒ ρ mM = KP0 RT − RT C P0−1K −1m DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 45 Backward mapping from plane M mosaic to direction −1 M∞ = (P0Tmosaic ) K mM mosaic −1 mM M,p m −1 mosaic =T I −1 0T K mM mM , p direction to pixel ρ m = KP0TsceneM∞ C usually Tmosaic I = T 0 0 , K mosaic = K 1 −1 ⇒ ρ m = KRK mM H −1 M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 46 Backward mapping from plane M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 47 mosaics image acquisition • pure rotation around optical center • planar scene translating and rotating camera problem • ensure camera motion constaints mosaic demo Live-Demonstration of Quicktime-VR Rotation-Panorama Neo-Gothic Building (Leuven, Belgium) Rudy Knoops, AVD, K.U. Leuven M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 48 Manifold projection • simulating sampling of scene with a 1-dim. sensor array • sensor can arbitrary translated and rotated in arbitrary scenes M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 49 Manifold projection planar scene rotating camera pure translating camera rotating and translating camera M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 50 Mosaic from strips In In-1 In+1 • stripes F(x,y)=0 perpendicular to optical flow ∂F ∂F ⇒ normal , ∂x ∂y T of stripe is parallel to optical flow • select stripe as close as possible to image center M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 51 Manifold projection • • • • • overcomes some of the difficulties of mosaicing defined for any camera motion and scene structure resolution is the same as image resolution simplified computation (real time) visually good results Problem • No metric in mosaic • Life-Demonstration VideoBrush M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 53 Stereo mosaics • multiple viewpoint panorama • for each eye a seperate multiple viewpoint panorama • constructed from one rotating camera M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 54 Stereo perception right eye rotating head circular projection left eye image plane M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 55 The camera C mosaicing C slit image plane C rotation about A slit C A A M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 56 Using video camera • rotate videocamera in the same way as split camera • use two vertical image strips inplace of splits C A M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 57 Stereo panorama S. Peleg (CVPR `99) S. Peleg (CVPR `99) Ommnistereo demo © OmniStereo Cooporation, 2000 M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 58 Part 3: 3-D scene reconstruction from multiple views • • • • • M I Projective and metric reconstruction 2-view epipolar constraint camera tracking from multiple views stereoscopic depth estimation 3-D surface modeling from multiple views P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 59 Scene reconstruction with projective cameras • Calibrated Camera: K known, Pose (R,C) unknown (metric camera ) P = K [RT | −RT C ] • Uncalibrated camera: K,R,C unknown (projective camera ) P = [KRT | −KRT C ] = [B | a] Reconstruction from multiple projective cameras: • unknown: Scene points M, projection matrices P • known: image projections mi of scene points Mi • problem: reconstruction is ambiguous in projective space! -1 m i PM i P (T T )M i (PT -1 )(TM i ) PM i • scene is defined only up to a projective Transformation T • camera is skewed by inverse Transformation T-1 M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 60 Ambiguity in projective reconstruction Euclidean scene T Projective reconstruction Valid projective reconstructions M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 61 Self-Calibration • Recover metric structure from projective reconstruction • Use constraints on the calibration matrix K • Utilize invariants in the scene to estimate K Apply self-calibration to recover T-1 T-1 Projective reconstruction M I P CAU Multimedia Information Processing Kiel Metric reconstruction DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 62 Camera Self-Calibration from H • Estimation of H between image pairs gives complete projective mapping (8 parameter). • Problem: How to compute camera projection matrix from H – since K is unknown, we can not compute R – H does not use constraints on the camera (constancy of K or some parameters of K) • Solution: self-calibration of camera calibration matrix K from image correspondences with H • imposing constraints on K may improve calibration Interpretation of H for metric camera: M I P CAU Multimedia Information Processing Kiel h1 H = h4 h7 DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction h2 h5 h8 h3 h6 = K k Rk−1Ri K i−1 1 63 Self-calibration of K from H • Imposing structure on H can give a complete calibration from an image pair for constant calibration matrix K homography Hik = K k Rk−1Ri K i−1 relative rotation: Rk−1Ri = Rik , constant camera: K i = K k = K Hik = KRik K −1 ⇒ Rik = K −1Hik K since Rik−1 = RikT ⇒ Rik = Rik−T ⇒ Rik = K −1Hik K = K T Hik−T K −T ⇒ KK T = Hik (KK T )HikT • Solve for elements of (KKT) from this linear equation, independent of R • decompose (KKT) to find K with Choleski factorisation • 1 additional constraint needed (e.g. s=0) (Hartley, 94) M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 64 Self-calibration for varying K • Solution for varying calibration matrix K possible, if – at least 1 constraint from K is known (s= 0) – a sequence of n image homographies H0i exist ⇒ Rik = K 0−1H0 i K i = K 0T H0−iT K i−T homography H0 i = K 0R0−1Ri K i−1 ⇒ K i K iT = H0 i (K 0K 0T )H0Ti n −1 solve by minimizing constraint ⇒ ∑ K i K iT − H0 i (K 0K 0T )H0Ti 2 ⇒ min! i =1 • Solve for varying K (e.g. Zoom) from this equation, independent of R • 1 additional constraint needed (e.g. s=0) • different constraints on Ki can be incorporated (Agapito et. al., 01) M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 65 Multiple view geometry Projection onto two views: P0 = K 0R0−1 [I 0] P1 = K1R1−1 [I ρ0 m0 = P0M = K 0R0−1 [I 0]M ρ1m1 = P1M = K1R1−1 [I −C1 ]M ⇒ ρ0 m0 = K 0R0−1 [I 0]M∞ = K1R1−1 [I 0]M∞ + K1R1−1 [I ⇒ ρ1m1 = ρ0H∞ m0 + e1 ρ0M∞ M∞ H m ∞ 0 Z m0 m1 Y O P1 e1 X C1 M I P CAU Multimedia Information Processing Kiel −C1 ]O ρ1m1 = K1R1−1R0 K 0−1ρ0 m0 − K1R1−1C1 M P0 −C1 ] Epipolar line X X 0 0 Y Y M = = + = M∞ + O Z Z 0 1 0 1 DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 66 The Fundamental Matrix F • The projective points e1 and (H∞m0) define a plane in camera 1 (epipolar plane !e) • the epipolar plane intersect the image plane 1 in a line (epipolar line le) • the corresponding point m1 lies on that line: m1Tle= 0 • If the points (e1),(m1),(H∞m0) are all collinear, then the collinearity theorem applies: collinearity of m1, e1, H∞ m0 ⇒ m1T ([e1 ]x H∞ m0 ) = 0 [ ]x = cross operator Fundamental Matrix F F = [e1 ]x H ∞ M I P CAU Multimedia Information Processing Kiel F3 x 3 Epipolar constraint m1T Fm0 = 0 DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 67 The Fundamental Matrix F l1 = Fm0 m l =0 T 1 1 m1T Fm0 = 0 F = [e]xHa = Fundamental Matrix P0 L m0 M Ma l1 Epipole e1 m1 e1T F = 0 m1a =Ham0 P1 M I P CAU Multimedia Information Processing DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction Kiel 68 Estimation of F from image correspondences • Given a set of corresponding points, solve linearily for the 9 elements of F in projective coordinates • since the epipolar constraint is homogeneous up to scale, only eight elements are independent • since the operator [e]x and hence F have rank 2, F has only 7 independent parameters (all epipolar lines intersect at e) • each correspondence gives 1 collinearity constraint => solve F with minimum of 7 correspondences for N>7 correspondences minimize distance point-line: m Fm0 i = 0 T 1i M I P CAU Multimedia Information Processing Kiel N ∑ (m n =0 T 1,n Fm0,n )2 ⇒ min! DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 69 Estimation of P from F • From F we can obtain a camera projection matrix pair: – Set P0 to identity – compute P1 from F (up to projective transformation) – reduce projective skew by initial estimate of K to obtain a quasieuclidean estimate (Pollefeys et.al., ‘98) – compute self-calibration to obtain metric K similar to self-calibration from H (Pollefeys et.al, ‘99, Fusiello ‘00) Fundamental matrix m1T Fm0 = 0 e1T F = 0 M I P CAU Multimedia Information Processing ↔ Projective camera P0 = [I | 0] P1 = [[e1 ]x F + e1aT | e] DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction Kiel 70 3D Feature Reconstruction • corresponding point pair (m0, m1) is projected from 3D feature point M • M is reconstructed from by (m0, m1) triangulation • M has minimum distance of intersection 2 d ⇒ min! P0 m0 l0 m1 F M I P CAU Kiel constraints: d l0T d = 0 l1T d = 0 l1 minimize reprojection error: P1 Multimedia Information Processing M (m0 − P0M) + (m1 − P1M) 2 DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 2 ⇒ min. 71 Multi View Tracking • 2D match: Image correspondence (m1, mi) • 3D match: Correspondence transfer (mi, M) via P1 • 3D Pose estimation of Pi with mi - Pi M => min. P0 M m0 3D match mi m1 P1 N Minimize lobal reprojection error: K ∑∑ m i =0 k =0 M I P CAU Multimedia Information Processing Kiel Pi 2D match k ,i − Pi Mk 2 ⇒ min! DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 72 Structure from motion: an example Image Sequence M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 73 Extraction of image features m2 Image 2 m1 Image 1 • features m1,2 (Harris Cornerdetector) • Select candidates based on cross correlation • Test candidates M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 74 Robust selection of correspondences RANSAC (RANdom Sampling Consensus) Hypothesis test: • For N Trials DO: – random selection of possible correspondences (minimum set) – compute F – test all correspondences [inliers/outliers] UNTIL ( Probability[#inliers, #Trials] > 95%) • Refine F with ML-estimate M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 75 Estimation of Fundamental Matrix Image 2 m 2T F m 1 = 0 Image 1 Robust correspondence selection m1 <-> m2 M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 76 Camera and feature tracking reconstruction of 3D features and cameras M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 77 3D surface modeling • camera calibration from feature tracking • dense depth estimation from stereo correspondence • depth fusion and generation of textured 3D surface model Image sequence M I P CAU Multimedia Information Processing Kiel camera calibration scene geometry 3D surface model DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 78 Dense depth estimation • use of constraints along epipolar line (ordering, uniqueness) • matching with normalized cross correlation • fast matching in rectified image pair Epipolar line search M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 79 Image rectification with planar mappings map between images estimate disparity D mr = Hr−1mr' = Hr−1Dlr ml' = Hr−1Dlr Hl ml 1 0 −d Dlr (ml' , mr' ) = 0 1 0 0 0 1 M depth ρ 1 d M∞ ρ mr' = Dlr ml' ' l m mr' d ml Cl M I P CAU Multimedia Information Processing Kiel Hl Hr mr Cr DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 80 Dense search for D • use of constraints along epipolar line (ordering, uniqueness) • matching with normalized cross correlation • constrained epipolar search with dynamic programming Epipolar lines M I P CAU Multimedia Information Processing Kiel Search path for constrained matching DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 81 Depth map Originals Depth map 7(m) shaded surface model Problem: Triangulation angle limits depth resolution M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 82 The Baseline problem • small baseline stereo: + Correspondence is simple (images are similar) + few occluded regions - large depth uncertainty Large depth uncertainty C2 C1 Small camera baseline (small triangulation angle) M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 83 The Baseline problem • wide baseline stereo: - Correspondence is difficult (images are not similar) - many occluded regions + small depth uncertainty Small depth error C2 C1 Large camera baseline (large triangulation angle) M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 84 Multi-Viewpoint Depth Fusion • Concatenate correspondences over adjacent image pairs • depth triangulation along line of sight • depth fusion of all triangulated correspondences, remove outliers Depth fusion M I P CAU Multimedia Information Processing Kiel Detection of outliers DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 85 Depth uncertainty after sequence integration • Improved density, fewer occlusions • Depth error decreases • Improved surface geometry M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 86 3D surface Modeling Depth map surface mesh er nd re in g ing p p ma Texture map M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 87 Jain Tempel (Ranakpur, Indien) Images M I P CAU Multimedia Information Processing Kiel 3D-model DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 88 Sagalassos: Virtual Museum M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 89 Open Questions from geometric modeling • How much geometry is really needed for visualisation ? • Trade-off: modeling vs. texture mapping (reflectance, microstructure) • how can we model surface reflections ? • How can we efficiently store and render the scenes ? M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 90 Part 4: Plenoptic Modeling • • • • M I P CAU Multimedia Information Processing Kiel The lightfield depth-dependent view interpolation The uncalibrated Lumigraph Visual-geometric modeling: view-dependent interpolation and texture mapping DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 91 Modeling of scenes with surface reflections • Problem when modeling scenes with reflections – view dependent reflections can not be handled by single texture map – may not be able to recover geometry • Approach: Lightfield rendering M I P CAU Multimedia Information Processing DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction Kiel 92 Image-based Acquisition: The Lightfield View ray v (u,v,s,t) u t focal plane (images) s 3D scene viewpoint plane (cameras) 4-D Lightfield Data structure: Grid of cameras (s,t) and grid of images (u,v) store all possibe surface reflections of the scene M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 93 Image-based Rendering from Lightfield Images v u Interpolated view t (u,v) grid (s,t) grid s Interpolation errors (ghosting) due to unmodelled geometry! M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 94 Depth-dependent Rendering Error Si 3D object Si+1 (s,t)-plane M I P CAU Multimedia Information Processing Kiel (u,v)-plane DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 95 Combining Lightfield and Structure from Motion Lightfield (LF): – needs camera calibration and depth estimation + efficient rendering of view-dependent scenes Structure from Motion (SFM): + delivers camera calibration and depth estimation – can not handle view-dependent scenes Uncalibrated Lumigraph: + tracking and scene structure with SFM + view-dependent rendering with generalized LF M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 96 Uncalibrated plenoptic modeling Plenoptic modeling: Build an array of images by concatenating views from an uncalibrated, freely moving moving video camera Free-moving video camera 3-D Scene Plenoptic image array Plenoptic rendering: Render novel views of the observed scene by image interpolation from the plenoptic image array M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 97 Plenoptic Modeling from uncalibrated Sequence Office sequence: sweeping a camera freely over cluttered desk environment Input sequence M I P CAU Multimedia Information Processing Kiel Viewpoint surface mesh calibration DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 98 Depth Maps and geometry Calibrated image pairs M I P CAU Multimedia Information Processing Kiel local depth maps fused geometry DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 99 3D surface model Calibrated image pairs M I P CAU Multimedia Information Processing Kiel local depth maps 3D surface model DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 100 Viewpoint-Adaptive Rendering • Render images directly from calibrated camera viewpoints • Interpolate by approximating local scene geometry View-dependent geometric approximation Novel view Camera viewpoint surface object surface M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 101 Projective texture mapping on planar scene Local surface plane M HiM ΠM H kM H ik mk Texture map i mi rendered image k Y Ck X −1 Transfer between images i,k over ΠM: Hik = HiM HkM M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 102 Viewpoint-Adaptive Rendering • Render images directly from calibrated camera viewpoints • Interpolate by approximating local scene geometry View-dependent geometric approximation Novel view Camera viewpoint surface object surface M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 103 Viewpoint-Adaptive Rendering • Render images directly from calibrated camera viewpoints • Interpolate by approximating local scene geometry • Scalable geometrical approximation through subdivision View-dependent geometric approximation Novel view Camera viewpoint surface object surface Rendering: 3 projective mappings and α-blending per triangle (OpenGL Hardware) M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 104 Scalable Geometry for View Interpolation • Adaptation of geometric detail for Lightfield Interpolation • Geometry is computed online for every rendering viewpoint 1 Surface plane per viewpoint triangle 1 Subdivision per viewpoint triangle M I P CAU Multimedia Information Processing Kiel 2 Subdivisions 4 Subdivisions DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 105 Scalable Geometry Adaptation of Geometric detail to interpolation error M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 106 Viewpoint-Adaptive Geometry Adaptation of Geometry to viewpoint (2 subdivisions) M I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 107 Uncalibrated Lumigraph: Rendering Results Rendering from lightfield calibration with planar approximation M I P CAU Multimedia Information Processing Kiel Rendering with locally adapted geometry (viewpoint mesh 2 x subdivided) DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 108 Conclusions A complete framework for automatically reconstructing and rendering of 3D scenes from uncalibrated image sequences was presented. It combines structure from motion with Image-based rendering. Properties of the approach: • • • • M handle uncalibrated sequences from freely moving hand-held cameras calibrate lightfield sequences with viewpoint mesh connectivity exploit scalable geometric complexity rendering of surface reflections I P CAU Multimedia Information Processing Kiel DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction 109