Visual-Geometric 3-D Scene Reconstruction from Uncalibrated Image Sequences

advertisement
M
I
P
Visual-Geometric
3-D Scene Reconstruction
from Uncalibrated Image Sequences
Multimedia Information Processing
Reinhard Koch and Jan-Michael Frahm
Tutorial at DAGM 2001, München
Multimedia Information Processing Group
Christian-Albrechts-University of Kiel
Germany
{rk | jmf}@mip.informatik.uni-kiel.de
www.mip.informatik.uni-kiel.de
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
1
Scene Reconstruction Method 1
• use ruler to measure scene geometry
1
3
ez
2
2
3
ey
1
1
+ precize, absolute geometry
2
- tedious manual task
- difficult to incorporate visual appearance
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
3
ex
2
Scene Reconstruction Method 2
• measure scene with camera, using image projections!
ez
ey
+ better to automate
+ incorporate images for appearance
- how can it be done ?
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
ex
3
Goal of this tutorial
Computer vision enables us to reconstruct highly naturalistic
computer models of 3D enviroments from camera images
We may need to extract the camera geometry (calibration),
scene structure (surface geometry) as well as the visual
appearance (color and texture) of the scene
This tutorial will
–
–
–
–
M
I
P CAU
Multimedia Information Processing
Kiel
introduce the basic mathematical tools (projective geometry)
derive models for cameras, image mappings, 3D structure
give examples for image-based panoramic modeling
explain geometric and visual models of 3D scene reconstruction
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
4
Outline of Tutorial
1. Basics on affine and projective geometry
2. Image mosaicing and panoramic reconstruction
Coffee break
3. 3-D scene reconstruction from multiple views
4. Plenoptic modeling
Demonstrations on mosaicing and 3-D modeling
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
6
Part 1:
Basics on affine and projective geometry
• Affine geometry
– affine points and homogeneous coordinates
– affine transformations
• Projective geometry
– projective points and projective coordinates
– projective transformations
• Pinhole camera model
– Projection and sensor model
– camera pose and calibration matrix
• Image mapping with planar homographies
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
7
Affine coordinates
M
ei: affine basis vectors
o: coordinate origin
e3
G
M
a3
Vector relative to o:
G
G
G
G
M = a1e1 + a2e2 + a3e3
e2
a2
o
e1
a1
Point in affine coordinates:
G G
G
G
G G
M = M + o = a1e1 + a2e2 + a3e3 + o
Vector: relative to some origin
Point: absolute coordinates
M
I
P CAU
Multimedia Information Processing
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
Kiel
8
Homogeneous coordinates
M
Unified notation:
include origin in affine basis
e3
a3
e2
a2
o
a1
Affine basis matrix
M
I
P CAU
Multimedia Information Processing
Kiel
e1
G G G
e1 e2 e3
M=
0 0 0
Homogeneous
Coordinates of M
 a1 
G
o  a2 
1 a3 
 
 1
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
9
Euclidean coordinates
Euclidean coordinates: affine basis is orthonormal
0 
G
G  
e3 = ez = 0 
 1
0 
G  
o = 0 
0 
0 
G
G
 
e2 = ey =  1
0 
 1
G G
 
e1 = ex = 0 
0 
G G G
e1 e2 e3
Euclidean basis: 
0 0 0
M
I
P CAU
Multimedia Information Processing
Kiel
 1 for i = k
eiT ek = 
0 for i ≠ k
1
G
o  0
=
1 0

0
0 0 0
1 0 0

0 1 0

0 0 1
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
10
Metric coordinates
Metric coordinates: affine basis is orthogonal
0 
G
G  
e3 = ez = 0 
s 
0 
G  
o = 0 
0 
0 
G
G
 
e2 = ey = s 
0 
s 
G G
 
e1 = ex = 0 
0 
G G G
e1 e2 e3
Metric basis:  0 0 0

M
I
P CAU
Multimedia Information Processing
Kiel
s for i = k , s ≠ 0
eTi ek = 
 0 for i ≠ k
1
G
0
o
= s

0
1

0
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
0 0 0
1 0 0

0 1 0

0 0 1
11
Affine Transformation
Transformation Taffine combines linear mapping and
coordinate shift in homogeneous coordinates
– Linear mapping with A3x3 matrix
– coordinate shift with t3 translation vector
M
Taffine
M’
t3 
 A3 x 3
M ′ = TaffineM = 
M

0 0 0 1 
Linear mapping with A3x3:
o
t3
M
I
P CAU
Multimedia Information Processing
Kiel
o
o’
A3x3
- Euclidean rigid rotation
- metric isotropic scaling
- affine skew and anisotropic scaling
- preserves parallelism and length ratio
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
12
Properties of affine transformation
t3 
 A3 x 3
′
=
=
M TaffineM 
M
0
0
0
1


Taffine
 a11 a12

a
a
=  21 22
a31 a32

0
0
a13
a23
a33
0
tx 

ty 
tz 

1
• Parallelism is preserved
• ratios of length, area, and volume are preserved
• Transformations can be concatenated:
if M1 = T1M and M 2 = T2M1 ⇒ M 2 = T2T1M = T21M
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
13
Special transformation: Rotation
TRotation


R3 x 3
=


0 0 0
0   r11 r12
0  r21 r22
=
0   r31 r32
 
1  0 0
r13
r23
r33
0
0
0 
0

1
• Rigid transformation: Angles and length preserved
• R is orthonormal matrix defined by three angles around
three coordinate axes
ez
ey
α
ex
M
I
P CAU
Multimedia Information Processing
cos α

Rz =  sinα
 0
− sinα 0 
cosα 0 
0
1
Rotation with angle α around ez
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
Kiel
15
Special transformation: Rotation
• Rotation around the coordinate axes can be concatenated:
ez
ey
α
β
γ
ex
R = Rz R y R x
Inverse of rotation matrix:
−1
R =R
M
I
P CAU
Multimedia Information Processing
Kiel
T
0
1

Rx = 0 cos γ
0 sin γ
0 
− sin γ 
cos γ 
 cos β

Ry =  0
 − sin β
0 sin β 
1
0 
0 cos β 
cos α

Rz =  sinα
 0
− sinα 0 
cosα 0 
0
1
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
16
Projective geometry
• Projective space (3 is space of rays emerging from O
– view point O forms projection center for all rays
– rays v emerge from viewpoint into scene
– ray g is called projective point, defined as scaled v: g=λv
w
(3
 x′
v =  y ′
 1 
v
y
O
M
I
P CAU
Multimedia Information Processing
x
g (λ ) =  y  = λv , λ ∈ ℜ ≠ 0
w 
x
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
Kiel
17
Projective and homogeneous points
• Given: Plane Π in *2 embedded in (3 at coordinates w=1
– viewing ray g intersects plane at v (homogeneous coordinates)
– all points on ray g project onto the same homogeneous point v
– projection of g onto Π is defined by scaling v=g/λ = g/w
w
Π(*2)
w=1
 x′  x 
g (λ ) = λv = λ  y ′ =  y 
 1  w 
(3
v
y
O
M
I
P CAU
Multimedia Information Processing
Kiel
x
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
x
w 
 x′
 
y
1


v =  y ′ = g =  
w 
w
 1 
 
1
 
18
Finite and infinite points
• All rays g that are not parallel to Π intersect at an affine point v on Π.
• The ray g(w=0) does not intersect Π. Hence v∞ is not an affine point
but a direction. Directions have the coordinates (x,y,z,0)T
• Projective space combines affine space with infinite points (directions).
g
(3
Π(*2)
w=1
g1
g2
v
v1
v2
O
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
v∞
x
 
g∞ =  y 
 0 
19
Relation between affine and projective points
• Affine space is embedded into projective space (x,y,z,w)T as
hyperplane Π =(x,y,z,1)T.
• Homogeneous coordinates map a projective point onto the affine
hyperplane by scaling to w=1.
• Projective points with w ≠0 are projected onto Π by scaling with 1/w.
• Projective points with w=0 form the infinite hyperplane Π∞ = (x,y,z,0)T.
They are not part of Π.
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
20
Affine and projective transformations
• Affine transformation leaves infinite points at infinity
 X ′  a11 a12
 ′ 
Y
a
a
⇒   =  21 22
 Z ′  a31 a32
  
0
0 0
M∞' = TaffineM∞
a13
a23
a33
0
tx   X 
 
t y  Y 
tz   Z 
 
1  0 
• Projective transformations move infinite points into finite affine space
M ' = Tprojective M∞
tx   X 
 a11 a12 a13
 xp 
 X ′

 
 
 ′
a
a
a
t
y
Y
21
22
23
y
p
 Y 
⇒  =λ  =λ
 a31 a32 a33
 zp 
 Z′ 
tz   Z 

 
 
 
1
w ′ 
w 41 w 42 w 43 w 44   0 
Example: Parallel lines intersect at the horizon (line of infinite points).
We can see this intersection due to perspective projection!
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
21
Pinhole Camera (Camera obscura)
Interior of camera obscura
(Sunday Magazine, 1838)
Camera obscura
(France, 1830)
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
22
Pinhole camera model
aperture
View direction
object
image
M
I
P CAU
Multimedia Information Processing
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
Kiel
23
Pinhole camera model
Im
es
g
a
r
so
n
e
aperture
r
nte )
e
C , cy
(c x
View direction
object
Focal length f
image
M
I
P CAU
Multimedia Information Processing
Kiel
lens
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
24
Perspective projection
• Perspective projection in (3 models pinhole camera:
– scene geometry is affine *3 space with coordinates M=(X,Y,Z,1)T
– camera focal point in O=(0,0,0,1)T, camera viewing direction along Z
– image plane (x,y) in Π(*2) aligned with (X,Y) at Z= Z0
– Scene point M projects onto point Mp on plane surface
X 
 
Y
M= 
Z 
 
1
Z (Optical axis)
(3
Z0
Π(*2)
y
x
Image plane
Mp
X 
 x p   Z0ZX 
Y 
 y   Z0Y 
Z
p
0
 
M p =   =  ZZ0Z  =


 Z0 
Z Z 
Z
Z
   
1
1


   
 Z0 
Y
O
Camera center
M
I
P CAU
Multimedia Information Processing
X
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
Kiel
25
Projective Transformation
• Projective Transformation maps M onto Mp in *3 space
ρ
Mp
Y
O
X
M
 xp   X  1
    0
y
Y
ρ M p = TpM ⇒ ρ  p  =   = 
 Z0   Z  0
    0
 1  W  
ρ=
0  X 
0  Y 
 
1 0  Z 
 
1
0
z0
  1 
0 0
1 0
0
0
Z
= projective scale factor
Z0
• Projective Transformation linearizes projection
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
26
Perspective Projection
Dimension reduction from *3 into *2 by projection onto Π(*2)
M
 xp  1
1
 x p  0
y
 y p  = 0
Z0p  0
 1  0
 1  0
mp
Π (*2)
Y
O
0
0
1
1
0
0
0
0
0
0
0
1
0
0
0  xp 
0
0   y p 
0
⇒ mp = DpM p
0   Z0 
1  
1  1 
X
Perspective projection P0 from *3 onto *2:
 x  1 0 0

ρ mp = DpTpM = P0M ⇒ ρ  y  = 0 1 0
 1  0 0 z10

M
I
P CAU
Multimedia Information Processing
Kiel
X 
0  
 Y
Z
0   , ρ =
Z 
Z0
0   
1
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
27
Projection in general pose
P
Ro
ta
tio
n
mp
[R
Projection center C
R
Tcam =  T
0
]
Projection: ρ mp = PM
M
C
1 
−1
cam
Tscene = T
P0
RT
= T
0
−RT C 

1 
World coordinates
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
28
Image plane and image sensor
• A sensor with picture elements (Pixel) is added onto the image plane
Pixel coordinates
m = (y,x)T
y
x
Pixel scale
f= (fx,fy)T
Z (Optical axis)
Image center
c= (cx, cy)T
Image sensor
Focal length Z0
Y
X
Image-sensor mapping:
m = Kmp
Projection center
• Pixel coordinates are related to image coordinates by affine
transformation K with five parameters:
fx s c x 


K =  0 fy c y 
– Image center c at optical axis
 0 0 1 
– distance Zp (focal length) and Pixel size scales pixel resolution f
– image skew s to model angle between pixel rows and columns
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
29
Projection matrix P
• Camera projection matrix P combines:
– inverse affine transformation Ta-1 from general pose to origin
– Perspective projection P0 to image plane at Z0 =1
– affine mapping K from image to sensor coordinates
scene pose transformation: Tscene
 1 0 0 0
projection: P0 = 0 1 0 0  = [I 0]
0 0 1 0 
R T
= T
0
−RT C 

1 
fx s c x 
sensor calibration: K =  0 fy c y 
 0 0 1 
⇒ ρ m = PM, P = KP0Tscene = K RT
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
−RT C 
30
The planar homography H
• The 2D projective transformation Hik is a planar homography
– maps any point on plane i to corresponding point on plane k
– defined up to scale (8 independent parameters)
– defined by 4 corresponding points on the planes with not more than
any 2 points collinear
plane k
mk Hik mi
mk
H ik
plane i
mi
M
I
P CAU
Multimedia Information Processing
 h1
Hik = h4
 h7
h2
h5
h8
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
Kiel
h3 
h6 
1 
31
Estimation of H from image correspondences
• Hik can be estimated linearily from corresponding point pairs:
– select 4 corresponding point pairs, if known noise-free
– select N>4 corresponding point pairs, if correspondences are noisy
– compute H such that correspondence error f is minimized
Projective mapping:
 xk 
 h1xi + h2 y i + h3 
 


mk = ρk  y k  = Hmi = h4 xi + h5 y i + h6 
 w 
 h7 xi + h8 y i + 1 
Error functional f:
N
f = ∑ ( mk ,n − Hik mi ,n )2 ⇒ min!
n =0
M
I
P CAU
Multimedia Information Processing
Kiel
Image coordinate mapping:
xk =
h1xi + h2 y i + h3
h7 xi + h8 y i + 1
yk =
h4 xi + h5 y i + h6
h7 xi + h8 y i + 1
 h1

Hik = h4
 h7
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
h2
h5
h8
h3 

h6 
1 
32
Image mapping with homographies
• Homographies are 2D projective transformations H3x3
• Homographies map points between planes
• 2D homographies can be used to map images between
different camera views for three equivalent cases:
– (a) all cameras share the same view point Ci = C, or
– (b) all scene points are at (or near to) infinity, or
– (c) the observed scene is planar.
• The 2D homography is independent of 3D scene structure!
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
33
(a) Image mapping for single view point
• Camera with fixed projection center: Ci =C
• Camera rotates freely with Ri and changing calibration Ki
G
MC = [I
 X − Cx 
−C ]M = Y − CY 
 Z − CZ 
ρ i mi = Pi M = K i RiT
M
ρk
mk
mi
ρi
−RiT Ci  M
G
T
T
= K i Ri [I −C ]M = K i Ri MC
G
T
T
ρk mk = K k Rk [I −C ]M = K k Rk MC
G
⇒ MC = Ri K i−1ρ i mi = Rk K k−1ρ k mk
ρ k mk = K k Rk−1Ri K i−1ρ i mi = ρ i Hik mi
Y
C
X
• Hik is a planar projective 3x3 transformation that maps
points mi on plane i to points mk on plane k
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
34
(b) Image mapping for infinite scene with M∞
• All scene points are at or near infinity: M∞ are points on Π∞
• Camera rotates freely with Ri and changing calibration Ki
M∞
 X − 0
−Ci ]M∞ = Y − 0 
 Z − 0 
G
M∞ = [I
M∞
mk
mi
Y
Ci
X
Ck
G
M ∞ = Ri K i−1ρ i mi = Rk K k−1ρ k mk
M
I
P CAU
Multimedia Information Processing
Kiel
⇒ ρ k mk = K k Rk−1Ri K i−1ρ i mi = ρ i Hik mi
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
35
(c) Image mapping of planar scene ΠM
• All scene points are at on plane ΠM
• Camera is completely free in K,R,C
M
HiM
ΠM
H kM
H ik
mk
mi
Y
Ci
X
Transfer between images i,k over ΠM:
M
I
P CAU
Multimedia Information Processing
Kiel
Ck
−1
Hik = HiM H kM
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
36
Part 2:
Mosaicing and Panoramic Images
• why mosaicing?
• geometries constraints for mosaicing
• mosaicing
• projective mainfolds
• stereo mosaicing
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
37
Why mosaicing?
• cameras field of view is always smaller than
human field of view
• large objects can‘t be captured in a single picture
Solutions
• devices with wide field of view
- fish-eye lenses (distortions, decreases quality)
- hyper- and parabolic optical devices (lower
resolution)
• image mosaicing
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
38
What is mosaicing?
definition of mosaicing : Matching multiple
images by aligning and pasting images to a
wilder field of view image. Features that continue
over the images must be "zipped" together, and
the frame edges dissolved.
mosaicing
Kang et. al. (ICPR 2000)
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
39
Geometries of mosaic aquisition
arbitrary scene
planar scene
rotating
camera
M
I
P CAU
Multimedia Information Processing
Kiel
translating and rotating
camera
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
40
Steps of mosaicing
M
•
Image alignment: estimation of homography H
for each image pair
•
Image cut and paste: selection of colorvalue
for each mosaic pixel
•
Image blending: overcome of intensity
differences between images
I
P CAU
Multimedia Information Processing
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
Kiel
41
Foreward mapping to plane
M
pixel to direction
(P0Tscene )
−1
mM
−1
scene
−1
m =T
K
N
mp
= M∞
mp
 I  −1
0T  K m
 
direction to mosaic
ρ mM = K mosaic P0Tmosaic M∞
I
T
0
usually Tmosaic = 
C
0
, K mosaic = K

1
T
−1
K
⇒ ρ mM = KR
m
H
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
42
Forward mapping to plane
M
I
P CAU
Multimedia Information Processing
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
Kiel
43
Forward mapping to sphere
pixel to direction
M
A0
(P0Tscene )
−1
mM
= M∞
mp
 I  −1
−1
−1
m
=
K
T
N
scene  T  K m
mp
0 
direction to polar
X
ϕ = tan  
Z
−1
ϕ


Y
ϑ = tan 

2
2
 X +Z 
−1
C
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
44
Planar scene mapping
pixel to direction
M
(P0Tscene )
−1
mM
−1
scene
K
m =T
N
−1
mp
= M∞
 I  −1
0T  K m
 
direction to mosaic
mp
ρ mM = K mosaic P0Tmosaic M∞
usually Tmosaic
C
M
I
P CAU
Multimedia Information Processing
Kiel
I
= T
0
0
, K mosaic = K

1
⇒ ρ mM = KP0 RT − RT C  P0−1K −1m
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
45
Backward mapping from plane
M
mosaic to direction
−1
M∞ = (P0Tmosaic ) K
mM
mosaic
−1
mM
M,p
m
−1
mosaic
=T
 I  −1
0T  K mM
 
mM , p
direction to pixel
ρ m = KP0TsceneM∞
C usually Tmosaic
I
= T
0
0
, K mosaic = K

1
−1
⇒ ρ m = KRK
mM
H −1
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
46
Backward mapping from plane
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
47
mosaics
image acquisition
• pure rotation around optical center
• planar scene translating and rotating camera
problem
• ensure camera motion constaints
mosaic demo
Live-Demonstration of Quicktime-VR Rotation-Panorama
Neo-Gothic Building (Leuven, Belgium)
Rudy Knoops, AVD, K.U. Leuven
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
48
Manifold projection
• simulating sampling of scene with a 1-dim. sensor
array
• sensor can arbitrary translated and rotated in
arbitrary scenes
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
49
Manifold projection
planar scene
rotating
camera
pure translating
camera
rotating and translating camera
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
50
Mosaic from strips
In
In-1
In+1
• stripes F(x,y)=0 perpendicular to optical flow
 ∂F ∂F 
⇒ normal 
,

 ∂x ∂y 
T
of stripe is parallel to optical flow
• select stripe as close as possible to image center
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
51
Manifold projection
•
•
•
•
•
overcomes some of the difficulties of mosaicing
defined for any camera motion and scene structure
resolution is the same as image resolution
simplified computation (real time)
visually good results
Problem
• No metric in mosaic
• Life-Demonstration VideoBrush
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
53
Stereo mosaics
• multiple viewpoint panorama
• for each eye a seperate multiple viewpoint
panorama
• constructed from one rotating camera
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
54
Stereo perception
right eye
rotating
head
circular projection
left eye
image plane
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
55
The camera
C
mosaicing
C
slit
image plane
C
rotation
about A
slit
C
A
A
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
56
Using video camera
• rotate videocamera in the same way as split camera
• use two vertical image strips inplace of splits
C
A
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
57
Stereo panorama
S. Peleg (CVPR `99)
S. Peleg (CVPR `99)
Ommnistereo demo © OmniStereo Cooporation, 2000
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
58
Part 3:
3-D scene reconstruction from multiple views
•
•
•
•
•
M
I
Projective and metric reconstruction
2-view epipolar constraint
camera tracking from multiple views
stereoscopic depth estimation
3-D surface modeling from multiple views
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
59
Scene reconstruction with projective cameras
• Calibrated Camera: K known, Pose (R,C) unknown (metric camera )
P = K [RT | −RT C ]
• Uncalibrated camera: K,R,C unknown (projective camera )
P = [KRT | −KRT C ] = [B | a]
Reconstruction from multiple projective cameras:
• unknown: Scene points M, projection matrices P
• known: image projections mi of scene points Mi
• problem: reconstruction is ambiguous in projective space!
-1
m i PM i P (T T )M i (PT -1 )(TM i ) PM
i
• scene is defined only up to a projective Transformation T
• camera is skewed by inverse Transformation T-1
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
60
Ambiguity in projective reconstruction
Euclidean
scene
T
Projective
reconstruction
Valid projective reconstructions
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
61
Self-Calibration
• Recover metric structure from projective reconstruction
• Use constraints on the calibration matrix K
• Utilize invariants in the scene to estimate K
Apply self-calibration to recover T-1
T-1
Projective reconstruction
M
I
P CAU
Multimedia Information Processing
Kiel
Metric reconstruction
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
62
Camera Self-Calibration from H
• Estimation of H between image pairs gives complete
projective mapping (8 parameter).
• Problem: How to compute camera projection matrix from H
– since K is unknown, we can not compute R
– H does not use constraints on the camera
(constancy of K or some parameters of K)
• Solution: self-calibration of camera calibration matrix K
from image correspondences with H
• imposing constraints on K may improve calibration
Interpretation of H for metric camera:
M
I
P CAU
Multimedia Information Processing
Kiel
 h1

H = h4
 h7
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
h2
h5
h8
h3 

h6  = K k Rk−1Ri K i−1
1 
63
Self-calibration of K from H
• Imposing structure on H can give a complete calibration
from an image pair for constant calibration matrix K
homography Hik = K k Rk−1Ri K i−1
relative rotation: Rk−1Ri = Rik , constant camera: K i = K k = K
Hik = KRik K −1
⇒ Rik = K −1Hik K
since Rik−1 = RikT ⇒ Rik = Rik−T ⇒ Rik = K −1Hik K = K T Hik−T K −T
⇒ KK T = Hik (KK T )HikT
• Solve for elements of (KKT) from this linear equation, independent of R
• decompose (KKT) to find K with Choleski factorisation
• 1 additional constraint needed (e.g. s=0) (Hartley, 94)
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
64
Self-calibration for varying K
• Solution for varying calibration matrix K possible, if
– at least 1 constraint from K is known (s= 0)
– a sequence of n image homographies H0i exist
⇒ Rik = K 0−1H0 i K i = K 0T H0−iT K i−T
homography H0 i = K 0R0−1Ri K i−1
⇒ K i K iT = H0 i (K 0K 0T )H0Ti
n −1
solve by minimizing constraint ⇒ ∑ K i K iT − H0 i (K 0K 0T )H0Ti
2
⇒ min!
i =1
• Solve for varying K (e.g. Zoom) from this equation, independent of R
• 1 additional constraint needed (e.g. s=0)
• different constraints on Ki can be incorporated (Agapito et. al., 01)
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
65
Multiple view geometry
Projection onto two views:
P0 = K 0R0−1 [I 0]
P1 = K1R1−1 [I
ρ0 m0 = P0M = K 0R0−1 [I 0]M
ρ1m1 = P1M = K1R1−1 [I −C1 ]M
⇒ ρ0 m0 = K 0R0−1 [I 0]M∞
= K1R1−1 [I 0]M∞ + K1R1−1 [I
⇒ ρ1m1 = ρ0H∞ m0 + e1
ρ0M∞
M∞ H m
∞ 0
Z
m0
m1
Y
O
P1
e1
X
C1
M
I
P CAU
Multimedia Information Processing
Kiel
−C1 ]O
ρ1m1 = K1R1−1R0 K 0−1ρ0 m0 − K1R1−1C1
M
P0
−C1 ]
Epipolar line
 X   X  0 
    0 
Y
Y
M =   =   +   = M∞ + O
 Z   Z  0 
     
 1   0   1
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
66
The Fundamental Matrix F
• The projective points e1 and (H∞m0) define a plane in
camera 1 (epipolar plane !e)
• the epipolar plane intersect the image plane 1 in a line
(epipolar line le)
• the corresponding point m1 lies on that line: m1Tle= 0
• If the points (e1),(m1),(H∞m0) are all collinear, then the
collinearity theorem applies:
collinearity of m1, e1, H∞ m0 ⇒ m1T ([e1 ]x H∞ m0 ) = 0
[ ]x =
cross operator
Fundamental Matrix F
F = [e1 ]x H ∞
M
I
P CAU
Multimedia Information Processing
Kiel
F3 x 3
Epipolar constraint
m1T Fm0 = 0
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
67
The Fundamental Matrix F
l1 = Fm0
m l =0
T
1 1
m1T Fm0 = 0
F = [e]xHa = Fundamental Matrix
P0
L
m0
M
Ma
l1
Epipole
e1
m1
e1T F = 0
m1a =Ham0
P1
M
I
P CAU
Multimedia Information Processing
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
Kiel
68
Estimation of F from image correspondences
• Given a set of corresponding points, solve linearily for the 9
elements of F in projective coordinates
• since the epipolar constraint is homogeneous up to scale,
only eight elements are independent
• since the operator [e]x and hence F have rank 2, F has only
7 independent parameters (all epipolar lines intersect at e)
• each correspondence gives 1 collinearity constraint
=> solve F with minimum of 7 correspondences
for N>7 correspondences minimize distance point-line:
m Fm0 i = 0
T
1i
M
I
P CAU
Multimedia Information Processing
Kiel
N
∑ (m
n =0
T
1,n
Fm0,n )2 ⇒ min!
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
69
Estimation of P from F
• From F we can obtain a camera projection matrix pair:
– Set P0 to identity
– compute P1 from F (up to projective transformation)
– reduce projective skew by initial estimate of K to obtain a quasieuclidean estimate (Pollefeys et.al., ‘98)
– compute self-calibration to obtain metric K similar to self-calibration
from H (Pollefeys et.al, ‘99, Fusiello ‘00)
Fundamental matrix
m1T Fm0 = 0
e1T F = 0
M
I
P CAU
Multimedia Information Processing
↔
Projective camera
P0 = [I | 0]
P1 = [[e1 ]x F + e1aT | e]
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
Kiel
70
3D Feature Reconstruction
• corresponding point pair (m0, m1) is projected from
3D feature point M
• M is reconstructed from by (m0, m1) triangulation
• M has minimum distance of intersection
2
d ⇒ min!
P0
m0
l0
m1
F
M
I
P CAU
Kiel
constraints:
d
l0T d = 0
l1T d = 0
l1
minimize reprojection error:
P1
Multimedia Information Processing
M
(m0 − P0M) + (m1 − P1M)
2
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
2
⇒ min.
71
Multi View Tracking
• 2D match: Image correspondence (m1, mi)
• 3D match: Correspondence transfer (mi, M) via P1
• 3D Pose estimation of Pi with mi - Pi M => min.
P0
M
m0
3D match
mi
m1
P1
N
Minimize lobal reprojection error:
K
∑∑ m
i =0 k =0
M
I
P CAU
Multimedia Information Processing
Kiel
Pi
2D match
k ,i
− Pi Mk
2
⇒ min!
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
72
Structure from motion: an example
Image Sequence
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
73
Extraction of image features
m2
Image 2
m1
Image 1
• features m1,2 (Harris Cornerdetector)
• Select candidates based on cross correlation
• Test candidates
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
74
Robust selection of correspondences
RANSAC (RANdom Sampling Consensus)
Hypothesis test:
• For N Trials
DO:
– random selection of possible correspondences (minimum set)
– compute F
– test all correspondences [inliers/outliers]
UNTIL ( Probability[#inliers, #Trials] > 95%)
• Refine F with ML-estimate
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
75
Estimation of Fundamental Matrix
Image 2
m 2T F m 1 = 0
Image 1
Robust correspondence selection m1 <-> m2
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
76
Camera and feature tracking
reconstruction of 3D features and cameras
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
77
3D surface modeling
• camera calibration from feature tracking
• dense depth estimation from stereo correspondence
• depth fusion and generation of textured 3D surface model
Image sequence
M
I
P CAU
Multimedia Information Processing
Kiel
camera calibration
scene geometry
3D surface model
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
78
Dense depth estimation
• use of constraints along epipolar line (ordering, uniqueness)
• matching with normalized cross correlation
• fast matching in rectified image pair
Epipolar line search
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
79
Image rectification with planar mappings
map between images
estimate disparity D
mr = Hr−1mr' = Hr−1Dlr ml' = Hr−1Dlr Hl ml
 1 0 −d 


Dlr (ml' , mr' ) = 0 1 0 
0 0 1 
M
depth ρ 1
d
M∞
ρ
mr' = Dlr ml'
'
l
m
mr'
d
ml
Cl
M
I
P CAU
Multimedia Information Processing
Kiel
Hl
Hr
mr
Cr
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
80
Dense search for D
• use of constraints along epipolar line (ordering, uniqueness)
• matching with normalized cross correlation
• constrained epipolar search with dynamic programming
Epipolar lines
M
I
P CAU
Multimedia Information Processing
Kiel
Search path for constrained matching
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
81
Depth map
Originals
Depth map 7(m)
shaded surface
model
Problem: Triangulation angle limits
depth resolution
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
82
The Baseline problem
• small baseline stereo:
+ Correspondence is simple (images are similar)
+ few occluded regions
- large depth uncertainty
Large depth uncertainty
C2
C1
Small camera baseline (small triangulation angle)
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
83
The Baseline problem
• wide baseline stereo:
- Correspondence is difficult (images are not similar)
- many occluded regions
+ small depth uncertainty
Small depth error
C2
C1
Large camera baseline (large triangulation angle)
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
84
Multi-Viewpoint Depth Fusion
• Concatenate correspondences over adjacent image pairs
• depth triangulation along line of sight
• depth fusion of all triangulated correspondences, remove outliers
Depth fusion
M
I
P CAU
Multimedia Information Processing
Kiel
Detection of outliers
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
85
Depth uncertainty after sequence integration
• Improved density, fewer occlusions
• Depth error decreases
• Improved surface geometry
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
86
3D surface Modeling
Depth map
surface mesh
er
nd
re
in
g
ing
p
p
ma
Texture map
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
87
Jain Tempel (Ranakpur, Indien)
Images
M
I
P CAU
Multimedia Information Processing
Kiel
3D-model
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
88
Sagalassos: Virtual Museum
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
89
Open Questions from geometric modeling
• How much geometry is really needed for visualisation ?
• Trade-off: modeling vs. texture mapping (reflectance,
microstructure)
• how can we model surface reflections ?
• How can we efficiently store and render the scenes ?
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
90
Part 4:
Plenoptic Modeling
•
•
•
•
M
I
P CAU
Multimedia Information Processing
Kiel
The lightfield
depth-dependent view interpolation
The uncalibrated Lumigraph
Visual-geometric modeling: view-dependent interpolation
and texture mapping
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
91
Modeling of scenes with surface reflections
• Problem when modeling scenes with reflections
– view dependent reflections can not be handled by single
texture map
– may not be able to recover geometry
• Approach: Lightfield rendering
M
I
P CAU
Multimedia Information Processing
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
Kiel
92
Image-based Acquisition: The Lightfield
View ray
v
(u,v,s,t)
u
t
focal plane
(images)
s
3D scene
viewpoint plane
(cameras)
4-D Lightfield Data structure: Grid of cameras (s,t) and grid of images
(u,v) store all possibe surface reflections of the scene
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
93
Image-based Rendering from Lightfield
Images
v
u
Interpolated view
t
(u,v) grid
(s,t) grid
s
Interpolation errors (ghosting) due to unmodelled geometry!
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
94
Depth-dependent Rendering Error
Si
3D object
Si+1
(s,t)-plane
M
I
P CAU
Multimedia Information Processing
Kiel
(u,v)-plane
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
95
Combining Lightfield and Structure from Motion
Lightfield (LF):
– needs camera calibration and depth estimation
+ efficient rendering of view-dependent scenes
Structure from Motion (SFM):
+ delivers camera calibration and depth estimation
– can not handle view-dependent scenes
Uncalibrated Lumigraph:
+ tracking and scene structure with SFM
+ view-dependent rendering with generalized LF
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
96
Uncalibrated plenoptic modeling
Plenoptic modeling: Build an array of images by concatenating
views from an uncalibrated, freely moving moving video camera
Free-moving video camera
3-D Scene
Plenoptic image array
Plenoptic rendering: Render novel views of the observed scene by
image interpolation from the plenoptic image array
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
97
Plenoptic Modeling from uncalibrated Sequence
Office sequence: sweeping a camera freely over cluttered desk environment
Input sequence
M
I
P CAU
Multimedia Information Processing
Kiel
Viewpoint surface mesh calibration
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
98
Depth Maps and geometry
Calibrated image pairs
M
I
P CAU
Multimedia Information Processing
Kiel
local depth maps
fused geometry
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
99
3D surface model
Calibrated image pairs
M
I
P CAU
Multimedia Information Processing
Kiel
local depth maps
3D surface model
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
100
Viewpoint-Adaptive Rendering
• Render images directly from calibrated camera viewpoints
• Interpolate by approximating local scene geometry
View-dependent geometric approximation
Novel view
Camera viewpoint surface
object surface
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
101
Projective texture mapping on planar scene
Local surface plane
M
HiM
ΠM
H kM
H ik
mk
Texture map i
mi
rendered image k
Y
Ck
X
−1
Transfer between images i,k over ΠM: Hik = HiM HkM
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
102
Viewpoint-Adaptive Rendering
• Render images directly from calibrated camera viewpoints
• Interpolate by approximating local scene geometry
View-dependent geometric approximation
Novel view
Camera viewpoint surface
object surface
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
103
Viewpoint-Adaptive Rendering
• Render images directly from calibrated camera viewpoints
• Interpolate by approximating local scene geometry
• Scalable geometrical approximation through subdivision
View-dependent geometric approximation
Novel view
Camera viewpoint surface
object surface
Rendering: 3 projective mappings and α-blending per triangle (OpenGL Hardware)
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
104
Scalable Geometry for View Interpolation
• Adaptation of geometric detail for Lightfield Interpolation
• Geometry is computed online for every rendering viewpoint
1 Surface plane per viewpoint triangle
1 Subdivision per viewpoint triangle
M
I
P CAU
Multimedia Information Processing
Kiel
2 Subdivisions
4 Subdivisions
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
105
Scalable Geometry
Adaptation of Geometric detail to interpolation error
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
106
Viewpoint-Adaptive Geometry
Adaptation of Geometry to viewpoint (2 subdivisions)
M
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
107
Uncalibrated Lumigraph: Rendering Results
Rendering from lightfield calibration
with planar approximation
M
I
P CAU
Multimedia Information Processing
Kiel
Rendering with locally adapted geometry
(viewpoint mesh 2 x subdivided)
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
108
Conclusions
A complete framework for automatically reconstructing and
rendering of 3D scenes from uncalibrated image sequences
was presented. It combines structure from motion with
Image-based rendering.
Properties of the approach:
•
•
•
•
M
handle uncalibrated sequences from freely moving hand-held cameras
calibrate lightfield sequences with viewpoint mesh connectivity
exploit scalable geometric complexity
rendering of surface reflections
I
P CAU
Multimedia Information Processing
Kiel
DAGM 2001-Tutorial on Visual-Geometric 3-D Scene Reconstruction
109
Download