Geometry Slides (part 4) - Weizmann Institute of Science

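Geometry 4: Multiview Stereo
Introduction to Computer Vision
Ronen Basri
Weizmann Institute of Science

Material covered
• Pinhole camera model, perspective projection
• Two view geometry, general case:
  • Epipolar geometry, the essential matrix
  • Camera calibration, the fundamental matrix
• Two view geometry, degenerate cases
  • Homography (planes, camera rotation)
• A taste of projective geometry
• Stereo vision: 3D reconstruction from two views
• Multi-view geometry, reconstruction through factorization

Structure from motion
• Input:
  • a set of point tracks
• Output:
  • 3D location of each point (shape)
  • camera parameters (motion)
• Assumptions:
  • Rigid motion
  • Orthographic projection (no scale)
• Method: SVD factorization (Tomasi & Kanade)

Setup
• $I_1, I_2, \dots, I_f$: a collection of images (video frames) depicting a rigid scene
• $p$ point tracks in those $f$ frames: $p_{ij} = (x_{ij}, y_{ij})^T$ is the location of $P_j$ at frame $i$
• Unknown 3D locations: $P_j = (X_j, Y_j, Z_j)^T \in \mathbb{R}^3$, $j = 1, \dots, p$
• Therefore,
  $$x_{ij} = \mathbf{r}_i^T P_j + c_i, \qquad y_{ij} = \mathbf{s}_i^T P_j + d_i$$
  where $\mathbf{r}_i^T, \mathbf{s}_i^T$ are the two top rows of a $3 \times 3$ rotation matrix

Objective
Find $\mathbf{r}_i, \mathbf{s}_i \in \mathbb{R}^3$ and $c_i, d_i \in \mathbb{R}$ that minimize
$$\sum_{i=1}^{f} \sum_{j=1}^{p} \left[ \left( (\mathbf{r}_i^T P_j + c_i) - x_{ij} \right)^2 + \left( (\mathbf{s}_i^T P_j + d_i) - y_{ij} \right)^2 \right]$$
subject to
$$\|\mathbf{r}_i\| = \|\mathbf{s}_i\| = 1, \qquad \mathbf{r}_i^T \mathbf{s}_i = 0$$

Eliminate translation
• We can eliminate translation by representing the location of each point relative to the centroid of all $p$ points
• Assume without loss of generality that the centroid of $P_1, \dots, P_p$ coincides with the origin $\mathbf{0} \in \mathbb{R}^3$
• Translate each image point by setting
  $$x_{ij} \leftarrow x_{ij} - \bar{x}_i, \qquad y_{ij} \leftarrow y_{ij} - \bar{y}_i$$
  where $(\bar{x}_i, \bar{y}_i)$ denotes the centroid of the points $(x_{ij}, y_{ij})$ in frame $i$

Objective (no translation)
Find $\mathbf{r}_i, \mathbf{s}_i \in \mathbb{R}^3$ that minimize
$$\sum_{i=1}^{f} \sum_{j=1}^{p} \left[ \left( \mathbf{r}_i^T P_j - x_{ij} \right)^2 + \left( \mathbf{s}_i^T P_j - y_{ij} \right)^2 \right]$$
subject to
$$\|\mathbf{r}_i\| = \|\mathbf{s}_i\| = 1, \qquad \mathbf{r}_i^T \mathbf{s}_i = 0$$

Measurement matrix
$$M = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ \vdots & \vdots & & \vdots \\ x_{f1} & x_{f2} & \cdots & x_{fp} \\ y_{11} & y_{12} & \cdots & y_{1p} \\ \vdots & \vdots & & \vdots \\ y_{f1} & y_{f2} & \cdots & y_{fp} \end{pmatrix}_{2f \times p}$$

Transformation and shape matrices
$$T = \begin{pmatrix} \mathbf{r}_1^T \\ \vdots \\ \mathbf{r}_f^T \\ \mathbf{s}_1^T \\ \vdots \\ \mathbf{s}_f^T \end{pmatrix} = \begin{pmatrix} r_{11} & r_{12} & r_{13} \\ \vdots & \vdots & \vdots \\ r_{f1} & r_{f2} & r_{f3} \\ s_{11} & s_{12} & s_{13} \\ \vdots & \vdots & \vdots \\ s_{f1} & s_{f2} & s_{f3} \end{pmatrix}_{2f \times 3} \qquad S = \begin{pmatrix} X_1 & X_2 & \cdots & X_p \\ Y_1 & Y_2 & \cdots & Y_p \\ Z_1 & Z_2 & \cdots & Z_p \end{pmatrix}_{3 \times p}$$

Objective: matrix notation
Find $T$ and $S$ that minimize
$$\|M - TS\|_F$$
subject to
$$\|\mathbf{r}_i\| = \|\mathbf{s}_i\| = 1, \qquad \mathbf{r}_i^T \mathbf{s}_i = 0$$
$M$ is $2f \times p$, $T$ is $2f \times 3$, $S$ is $3 \times p$

$M = TS + \text{Noise}$
$$\underbrace{M}_{2f \times p} = \underbrace{T}_{2f \times 3} \, \underbrace{S}_{3 \times p} + \text{Noise}$$
with $M$, $T$ and $S$ written out entry by entry as in the measurement, transformation and shape matrices.

TK-Factorization
$M = TS + \text{Noise}$
Step 1: find a rank 3 approximation to $M$ using SVD
$$M = U \Sigma V^T$$
where
• $U$ is $2f \times 2f$, $U^T U = I$
• $\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \dots)$, of size $2f \times p$, with $\sigma_1 \ge \sigma_2 \ge \dots \ge 0$
• $V$ is $p \times p$, $V^T V = I$

TK-Factorization
$$\hat{M} = U \Sigma_3 V^T, \qquad \Sigma_3 = \mathrm{diag}(\sigma_1, \sigma_2, \sigma_3, 0, 0, \dots)$$
(hats mark quantities estimated from the SVD, to distinguish them from the true $M$, $T$, $S$)
Note: this is a relaxation, only noise components outside the 3D space are annihilated
Step 2: factorization
$$\hat{T} = U \sqrt{\Sigma_3}, \qquad \hat{S} = \sqrt{\Sigma_3} \, V^T$$
(only the first three columns of $\hat{T}$ and the first three rows of $\hat{S}$ are non-zero)
Ambiguity:
$$\hat{M} = (\hat{T} A)(A^{-1} \hat{S})$$
for any non-singular $3 \times 3$ matrix $A$

TK-Factorization
Step 3: resolve ambiguity using the constraints
$$\|\mathbf{r}_i\| = \|\mathbf{s}_i\| = 1, \qquad \mathbf{r}_i^T \mathbf{s}_i = 0$$
Let $R_i = \begin{pmatrix} \mathbf{r}_i^T \\ \mathbf{s}_i^T \end{pmatrix}_{2 \times 3}$, and note that $R_i R_i^T = I$
Let $\hat{T}_i = \begin{pmatrix} \hat{\mathbf{r}}_i^T \\ \hat{\mathbf{s}}_i^T \end{pmatrix}_{2 \times 3}$ be the corresponding rows in $\hat{T}$, then
$$R_i = \hat{T}_i A$$
Find a $3 \times 3$ symmetric matrix $A A^T$ such that
$$\hat{T}_i A A^T \hat{T}_i^T = R_i R_i^T = I$$

TK-Factorization
$$\hat{T}_i A A^T \hat{T}_i^T = R_i R_i^T = I$$
• Equation is linear in $A A^T$
• There are $3f$ equations in 6 unknowns
• Find $A$ by eigen-decomposition $A A^T = W \Delta W^T$, so that $A = W \sqrt{\Delta}$
• Solution is obtained up to a rotation ambiguity:
  $$\hat{T}_i (AB)(B^T A^T) \hat{T}_i^T = \hat{T}_i A A^T \hat{T}_i^T \quad \text{for any } B \text{ such that } B B^T = I$$

TK-Factorization: Summary
1. Eliminate translation, construct $M$
2. $\mathrm{SVD}(M)$ to get the rank 3 approximation $\hat{M}$ and factorize $\hat{M} = \hat{T} \hat{S}$ ($3 \times 3$ ambiguity $A$ remains)
3. Resolve ambiguity: estimate $A A^T$ by exploiting orthonormality of each rotation, then factorize to obtain $A$
Final solution up to rotation and reflection

TK-Factorization: pros and cons
• Advantages:
  • Breaks a difficult, non-linear optimization into simple optimization steps
  • Works well with errors
• Disadvantages:
  • Orthographic projection
  • Requires complete tracks

Factorization with incomplete tracks
• Need a way to approximate by a low rank matrix with missing data
  $$\min_{\mathrm{rank}(X) = 3} \|W \odot (X - M)\|$$
  $W$ is a mask, $W_{ij} = 1$ wherever $M_{ij}$ is known
• This problem is NP-hard
• Surrogate: minimize the nuclear norm, the sum of singular values $\sigma_1 + \sigma_2 + \sigma_3 + \cdots$
• Nuclear norm is convex, minimization often achieves low rank
• Better iterative procedures exist

Perspective multiview stereo
• A point $P = (X, Y, Z)$ is projected to
  $$x = \frac{fX}{Z}, \qquad y = \frac{fY}{Z}$$
• A point rotated by $R$ and translated by $\mathbf{t}$ projects to
  $$x = \frac{f(\mathbf{r}_1^T P + t_x)}{\mathbf{r}_3^T P + t_z}, \qquad y = \frac{f(\mathbf{r}_2^T P + t_y)}{\mathbf{r}_3^T P + t_z}$$
  where $\mathbf{r}_i^T$ denotes the rows of $R$

Bundle adjustment
• Given $p$ points in $f$ frames $(x_{ij}, y_{ij})$, find camera matrices $C_i$ and positions $P_j$ that minimize
  $$\sum_{i=1}^{f} \sum_{j=1}^{p} \left[ \left( \frac{f(\mathbf{r}_{i1}^T P_j + t_x)}{\mathbf{r}_{i3}^T P_j + t_z} - x_{ij} \right)^2 + \left( \frac{f(\mathbf{r}_{i2}^T P_j + t_y)}{\mathbf{r}_{i3}^T P_j + t_z} - y_{ij} \right)^2 \right]$$
  (inside the fractions $f$ is the focal length; as a summation limit it is the number of frames)
• Alternate optimization
  • Given $R_i$ and $\mathbf{t}_i$, solve for $P_j$
  • Given $P_j$, solve for $R_i$ and $\mathbf{t}_i$
• Very good initial guess is required

Bundler (Photo Tourism)
(Snavely et al.)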
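The bundle adjustment objective, which Bundler applies as images are added, can be made concrete with a small NumPy sketch. It only evaluates the reprojection error; it does not perform the alternating optimization. The helper names `project`, `reprojection_error` and `rot_y`, the parameter `focal`, and the synthetic cameras are illustrative assumptions, not part of the slides. The sketch also assumes each frame has its own translation, which the slide formula leaves implicit.

```python
import numpy as np

def project(R, t, P, focal=1.0):
    """Perspective projection of 3D points P (3 x p) by a camera (R, t).

    Implements x = focal*(r1.P + tx)/(r3.P + tz), y = focal*(r2.P + ty)/(r3.P + tz).
    """
    Q = R @ P + t[:, None]        # points in camera coordinates, 3 x p
    return focal * Q[:2] / Q[2]   # 2 x p image coordinates

def reprojection_error(Rs, ts, P, obs, focal=1.0):
    """Sum of squared reprojection residuals over all frames and points.

    Rs: list of 3x3 rotations, ts: list of 3-vectors,
    P: 3 x p points, obs: f x 2 x p observed image coordinates.
    """
    total = 0.0
    for R, t, xy in zip(Rs, ts, obs):
        total += np.sum((project(R, t, P, focal) - xy) ** 2)
    return total

def rot_y(angle):
    """Rotation about the y axis (hypothetical helper for the synthetic cameras)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

# Synthetic scene: 20 points seen by 3 cameras placed 5 units away.
rng = np.random.default_rng(0)
P_true = rng.uniform(-1.0, 1.0, size=(3, 20))
Rs = [rot_y(a) for a in (0.0, 0.1, 0.2)]
ts = [np.array([0.0, 0.0, 5.0]),
      np.array([0.2, 0.0, 5.0]),
      np.array([0.4, 0.1, 5.0])]
obs = np.stack([project(R, t, P_true) for R, t in zip(Rs, ts)])

# The objective vanishes at the true structure and grows when points are perturbed.
err_true = reprojection_error(Rs, ts, P_true, obs)
P_noisy = P_true + 0.05 * rng.standard_normal(P_true.shape)
err_perturbed = reprojection_error(Rs, ts, P_noisy, obs)
print(err_true, err_perturbed)
```

A full bundle adjustment would minimize `reprojection_error` over the cameras and the points, for example by the alternation listed on the slide, starting from a good initial guess.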
Bundler (Photo Tourism)
• Given images, identify feature points and describe them with SIFT descriptors
• Match SIFTs, accept each match $p_i \leftrightarrow p_j$ whose score is at least twice that of any other match $p_i \leftrightarrow p_k$
• For every pair of images with sufficiently many matches, use RANSAC to recover essential matrices
• Starting with two images and adding one image at a time: use the essential matrix to recover depth and apply bundle adjustment

Simultaneous solutions
• $E_{ij} = [\mathbf{t}_{ij}]_\times R_{ij}$: essential matrix between $I_i$ and $I_j$, $i, j = 1, \dots, f$, available on a subset of image pairs
• Objective: recover camera orientation $R_i$ and location $\mathbf{t}_i$ relative to a global coordinate system
• First step: recover rotations:
  $$\min_{R_i} \|R_{ij} - R_i R_j^T\|_F$$
• This can be solved in various ways, for example
  $$\min_{R_i} \|R_{ij} R_j - R_i\|_F$$
  which is a least squares problem if we ignore the orthonormality constraints for $R_i$

Epipolar relation in global coordinates
• The epipolar line relation $p^T E_{ij} q = 0$ can be written in a global coordinate system as follows
  $$p^T R_i^T \left( [\mathbf{t}_i]_\times - [\mathbf{t}_j]_\times \right) R_j \, q = 0$$
• This generalizes the formula for the essential matrix (plug in $R_i = I$, $\mathbf{t}_i = \mathbf{0}$)
• Once camera orientations $R_i$ are known we can solve for camera locations (equation is linear and homogeneous in the translation components)
• Solution suffers from shrinkage problems

Multiview reconstruction
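As a closing illustration of multiview reconstruction, the three steps of the TK-Factorization summary earlier in these slides (eliminate translation, rank 3 SVD, resolve the $3 \times 3$ ambiguity) can be sketched in NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes complete, noise-free orthographic tracks, and the names `tk_factorization` and `row`, the eigenvalue clipping, and the synthetic data are all illustrative choices.

```python
import numpy as np

def tk_factorization(x, y):
    """x, y: f x p arrays of tracked image coordinates (complete tracks)."""
    f, p = x.shape
    # Step 1: eliminate translation by subtracting each frame's centroid.
    M = np.vstack([x - x.mean(axis=1, keepdims=True),
                   y - y.mean(axis=1, keepdims=True)])   # 2f x p
    # Step 2: rank 3 approximation via SVD, splitting sqrt(Sigma_3) evenly.
    U, sig, Vt = np.linalg.svd(M, full_matrices=False)
    root = np.sqrt(sig[:3])
    T_hat = U[:, :3] * root          # 2f x 3
    S_hat = root[:, None] * Vt[:3]   # 3 x p
    # Step 3: solve linearly for the symmetric Q = A A^T (6 unknowns).
    idx = [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)]

    def row(a, b):
        # Coefficients of a^T Q b with respect to the 6 unknowns of Q.
        return np.array([a[i] * b[j] if i == j else a[i] * b[j] + a[j] * b[i]
                         for i, j in idx])

    rows, rhs = [], []
    for i in range(f):
        r, s = T_hat[i], T_hat[f + i]
        rows += [row(r, r), row(s, s), row(r, s)]   # 3 equations per frame
        rhs += [1.0, 1.0, 0.0]
    q = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)[0]
    Q = np.zeros((3, 3))
    for (i, j), v in zip(idx, q):
        Q[i, j] = Q[j, i] = v
    # Factor Q = W Delta W^T and take A = W sqrt(Delta).
    # Clipping guards against slightly negative eigenvalues (an added safeguard).
    delta, W = np.linalg.eigh(Q)
    A = W * np.sqrt(np.clip(delta, 1e-12, None))
    return T_hat @ A, np.linalg.solve(A, S_hat)

# Synthetic orthographic tracks: 8 frames, 30 points, random orientations and shifts.
rng = np.random.default_rng(1)
f, p = 8, 30
P = rng.standard_normal((3, p))
x, y = np.empty((f, p)), np.empty((f, p))
for i in range(f):
    Rm, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # orthonormal rows
    x[i] = Rm[0] @ P + rng.standard_normal()
    y[i] = Rm[1] @ P + rng.standard_normal()

T, S = tk_factorization(x, y)

def pairwise(X):
    """Matrix of distances between the columns of X."""
    return np.linalg.norm(X[:, :, None] - X[:, None, :], axis=0)

# Shape is recovered only up to rotation and reflection, so compare distances.
print(np.abs(pairwise(S) - pairwise(P)).max())
```

Because the solution is defined only up to rotation and reflection, the example compares pairwise distances between points rather than coordinates. With noisy tracks the linear system for $A A^T$ is solved in the least squares sense and the estimate need not be positive definite, which is why the sketch clips the eigenvalues.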
