Visual 3D Modeling using Cameras and Camera Networks Marc Pollefeys

Visual 3D Modeling using Cameras and
Camera Networks
Marc Pollefeys
University of North Carolina at Chapel Hill
Talk outline
• Introduction
• Visual 3D modeling with a hand-held camera
– Acquisition of camera motion
– Acquisition of scene structure
– Constructing visual models
• Camera Networks
– Camera Network Calibration
– Camera Network Synchronization
– Towards Active Camera Networks…
• Conclusion
What can be achieved?
• Can we get 3D models from images?
• How much do we need to know about the camera?
• Can we freely move around? Hand-held?
• Do we need to keep parameters fixed? Zoom?
• What about auto-exposure?
• What about camera networks?
• Can we provide more flexible systems? Avoid calibration?
• What about using IP-based PTZ cameras? Hand-held
camcorders?
• Unsynchronized or even asynchronous?
Talk outline
• Introduction
• Visual 3D modeling with a hand-held camera
– Acquisition of camera motion
– Acquisition of scene structure
– Constructing visual models
• Camera Networks
– Camera Network Calibration
– Camera Network Synchronization
– Towards Active Camera Networks…
• Conclusion
(Pollefeys et al. ’98)
(Pollefeys et al. ’04)
Video Key-frame selection
More efficient RANSAC
Fully projective
Improved self-calibration
Deal with dominant planes
Bundle adjustment
Polar stereo rectification
Deal with radial distortion
Faster stereo algorithm
Deal with specularities
Volumetric 3D integration
Deal with Auto-Exposure
Image-based rendering
Feature tracking/matching
• Shape-from-Photographs: match Harris corners
• Shape-from-Video: track KLT features
Problem: insufficient motion between consecutive video-frames
to compute epipolar geometry accurately and use it
effectively as an outlier filter
Key-frame selection
(Pollefeys et al. ’02)
Select key-frame when F yields a better model than H
– Use Geometric Robust Information Criterion, GRIC (Torr ’98), which combines a bad-fit penalty with a model-complexity term
– Given view i as a key-frame, pick view j as next key-frame for the first view where GRIC(Fij)>GRIC(Hij) (or a few views later)
[Figure: H-GRIC and F-GRIC curves]
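A minimal sketch of the F-versus-H model-selection test, assuming Torr's GRIC in its usual cost form (lower is better). The function names, the noise level `sigma`, and the constants λ1 = ln r, λ2 = ln(rn), λ3 = 2 with r = 4 are assumptions for illustration, not values taken from the talk. The squared residuals would come from fitting F and H to the tracked features. Because this sketch treats GRIC as a cost, the comparison sign depends on the convention in use.

```python
import numpy as np

def gric(sq_residuals, sigma, d, k, r=4):
    """GRIC cost (assumed form): robust bad-fit penalty plus model complexity."""
    e2 = np.asarray(sq_residuals, dtype=float)
    n = e2.size
    lam1, lam2, lam3 = np.log(r), np.log(r * n), 2.0
    # Bad-fit penalty: normalized squared residuals, clamped so outliers cost a fixed amount.
    bad_fit = np.minimum(e2 / sigma**2, lam3 * (r - d)).sum()
    # Model complexity: structure dimension d per match plus k model parameters.
    complexity = lam1 * d * n + lam2 * k
    return bad_fit + complexity

def f_explains_better(sq_res_F, sq_res_H, sigma=1.0):
    """True when the fundamental matrix is the cheaper model under this cost."""
    return gric(sq_res_F, sigma, d=3, k=7) < gric(sq_res_H, sigma, d=2, k=8)
```

When both models fit equally well, the simpler homography wins; F only wins once H leaves large residuals, which is the situation a key-frame is meant to capture.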
Epipolar geometry
Underlying structure in set of matches for rigid scenes
1. Computable from corresponding points
2. Simplifies matching
3. Allows detection of wrong matches
4. Related to calibration
Fundamental matrix (3x3 rank 2 matrix):
m_2^T F m_1 = 0
[Figure: 3D point P seen from camera centers C1 and C2, with epipoles e1, e2 and epipolar lines l1, l2]
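The epipolar constraint can be checked numerically. The sketch below assumes a hypothetical calibrated camera pair with identity intrinsics, P1 = [I | 0] and P2 = [R | t], for which F = [t]x R; the rotation, translation and random points are illustrative values only.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]x, so that skew(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

# Two hypothetical cameras with identity intrinsics: P1 = [I | 0], P2 = [R | t].
angle = 0.1
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([1.0, 0.2, 0.1])
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([R, t[:, None]])

# For this camera pair the fundamental matrix is F = [t]x R.
F = skew(t) @ R

# Random 3D points in front of both cameras, in homogeneous coordinates (4 x N).
rng = np.random.default_rng(0)
X = np.vstack([rng.uniform(-1, 1, (2, 20)),
               rng.uniform(4, 8, (1, 20)),
               np.ones((1, 20))])
m1, m2 = P1 @ X, P2 @ X

# Epipolar residual m2^T F m1 for every correspondence (should vanish).
residuals = np.einsum('in,ij,jn->n', m2, F, m1)
```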
Epipolar geometry computation:
robust estimation (RANSAC)
Step 1. Extract features
Step 2. Compute a set of potential matches
Step 3. do
  Step 3.1 select minimal sample (i.e. 7 matches) (generate hypothesis)
  Step 3.2 compute solution(s) for F
  Step 3.3 count inliers, if not promising stop (verify hypothesis)
until \Gamma(\#inliers, \#samples) \geq 95\%, where
\Gamma = 1 - \left(1 - \left(\frac{\#inliers}{\#matches}\right)^{7}\right)^{\#samples}
Step 4. Compute F based on all inliers
Step 5. Look for additional matches
Step 6. Refine F based on all correct matches

#inliers   #samples
90%        5
80%        13
70%        35
60%        106
50%        382
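The sample counts in the table follow from the stopping criterion: with inlier ratio w and a 7-point minimal sample, the number of samples N must satisfy 1 - (1 - w^7)^N >= 95%. A short sketch (the function name is hypothetical) that computes N:

```python
import math

def samples_needed(inlier_ratio, sample_size=7, confidence=0.95):
    """Smallest number of samples N with 1 - (1 - w^s)^N >= confidence."""
    p_good = inlier_ratio ** sample_size   # chance that one sample is all inliers
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_good))

table = {w: samples_needed(w) for w in (0.9, 0.8, 0.7, 0.6, 0.5)}
```

The count grows quickly as the inlier ratio drops, which is why aborting unpromising hypotheses early (Step 3.3) pays off.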
Epipolar geometry computation
The geometric relation between two views is fully described by the recovered 3x3 matrix F
Sequential Structure and Motion Computation
Initialize Motion
(P1, P2 compatible with F)
Initialize Structure
(minimize reprojection error)
Extend motion
(compute pose through matches seen in 2 or more previous views)
Extend structure
(initialize new structure, refine existing structure)
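One way to obtain a camera pair compatible with F is the textbook canonical choice P1 = [I | 0], P2 = [[e2]x F | e2], where e2 is the epipole in the second image. The sketch below assumes that choice; the talk may use a different member of the same projective family, and the function names are hypothetical.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]x."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def cameras_from_F(F):
    """Return one projective camera pair (P1, P2) compatible with F."""
    # Epipole in the second image: the left null vector of F (F^T e2 = 0).
    _, _, Vt = np.linalg.svd(F.T)
    e2 = Vt[-1]
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([skew(e2) @ F, e2[:, None]])
    return P1, P2

def F_from_cameras(P1, P2):
    """Fundamental matrix of a pair with P1 = [I | 0] and P2 = [M | e2]: F = [e2]x M."""
    return skew(P2[:, 3]) @ P2[:, :3]
```

Recomputing F from the returned pair gives the input matrix back up to scale, which is all a fundamental matrix is defined to.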
Dealing with dominant planar scenes
(Pollefeys et al., ECCV‘02)
• USaM (uncalibrated structure and motion) fails when common features are all in a plane
• Solution, part 1: model selection to detect the problem
• Solution, part 2: delay ambiguous computations until after self-calibration (couple self-calibration over all 3D parts)
Refine Structure and Motion
• Use projective bundle adjustment
– Sparse bundle allows very efficient computation (2 levels)
– Take radial distortion into account (1 or 2 parameters)
Self-calibration using absolute conic
(Faugeras ECCV’92; Triggs CVPR’97;
Pollefeys et al. ICCV’98; etc.)
Euclidean projection matrix: some constraints on K, e.g. constant, no skew, ...
Absolute conic projection:
P_i \Omega^* P_i^T \propto \omega_i^* = K_i K_i^T
Translate constraints on K through projection equation to constraints on \Omega^*
Upgrade from projective to metric:
Transform structure and motion so that \Omega^* = \mathrm{diag}(1,1,1,0)
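The projection equation can be verified numerically. This sketch assumes the common convention P = K [R | t] for a metric camera and uses made-up intrinsics and a made-up projective transform T; it checks that the absolute quadric projects to K K^T, and that the projection is unchanged when cameras and quadric are moved to a projective frame together.

```python
import numpy as np

def euclidean_camera(K, R, t):
    """Metric projection matrix P = K [R | t] (convention assumed here)."""
    return K @ np.hstack([R, t[:, None]])

# Hypothetical intrinsics: focal lengths, zero skew, principal point.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 820.0, 240.0],
              [0.0, 0.0, 1.0]])
angle = 0.3
R = np.array([[np.cos(angle), -np.sin(angle), 0.0],
              [np.sin(angle), np.cos(angle), 0.0],
              [0.0, 0.0, 1.0]])
P = euclidean_camera(K, R, np.array([0.5, -0.2, 2.0]))

# In a metric frame the absolute dual quadric is diag(1, 1, 1, 0).
Omega = np.diag([1.0, 1.0, 1.0, 0.0])
omega = P @ Omega @ P.T            # dual image of the absolute conic

# A projective change of frame T moves both P and Omega; omega is unchanged.
T = np.array([[1.0, 0.2, 0.0, 0.1],
              [0.0, 1.0, 0.3, 0.0],
              [0.1, 0.0, 1.0, 0.2],
              [0.3, 0.1, 0.2, 1.0]])
P_proj = P @ np.linalg.inv(T)
Omega_proj = T @ Omega @ T.T
omega_proj = P_proj @ Omega_proj @ P_proj.T
```

Self-calibration runs this in reverse: it looks for the quadric in the projective frame whose projections satisfy the constraints on K, then finds the transform that brings it back to diag(1,1,1,0).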
Practical linear self-calibration
(Pollefeys et al., ECCV‘02)
K = \begin{bmatrix} f_x & s & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
After normalization:
\omega^* \propto K K^T \propto P \Omega^* P^T \approx \begin{bmatrix} \hat{f}^2 & 0 & 0 \\ 0 & \hat{f}^2 & 0 \\ 0 & 0 & 1 \end{bmatrix}
Don’t treat all constraints equal
(relatively accurate for most cameras)
\frac{1}{0.2}\left((P \Omega^* P^T)_{11} - (P \Omega^* P^T)_{22}\right) = 0
\frac{1}{0.01}(P \Omega^* P^T)_{12} = 0
\frac{1}{0.1}(P \Omega^* P^T)_{13} = 0
\frac{1}{0.1}(P \Omega^* P^T)_{23} = 0
(only rough approximation, but still useful to avoid degenerate configurations)
\frac{1}{9}\left((P \Omega^* P^T)_{11} - (P \Omega^* P^T)_{33}\right) = 0
\frac{1}{9}\left((P \Omega^* P^T)_{22} - (P \Omega^* P^T)_{33}\right) = 0
When fixating a point at the image center, not only the absolute quadric diag(1,1,1,0) satisfies the ICCV’98 equations, but also diag(1,1,1,a), i.e. real or imaginary spheres!
Refine Metric Structure and Motion
• Use metric bundle adjustment
– Use Euclidean parameterization for projection matrices
– Same sparseness advantages, also use radial distortion
Mixing real and virtual elements in video
Virtual reconstruction of ancient fountain
Preview fragment of Sagalassos TV documentary
Similar to 2d3's boujou and RealViz's MatchMover
Intermezzo: Auto-calibration of Multi-Projector System
(Raij and Pollefeys, submitted)
Hard because screens are planar, but still possible
Stereo rectification
• Resample image to simplify matching process
Also take into account radial distortion!
Polar stereo rectification
(Pollefeys et al. ICCV’99)
Polar reparametrization of images around epipoles
Works where standard homography-based approaches do not (e.g. epipole inside the image)
General iso-disparity surfaces
(Pollefeys and Sinha, ECCV’04)
Example: polar rectification preserves disparity
Application: Active vision
Also interesting relation to human horopter
Stereo matching
Similarity measure (SSD or NCC)
Optimal path (dynamic programming)
Constraints
• epipolar
• ordering
• uniqueness
• disparity limit
• disparity gradient limit
Trade-off
• Matching cost
• Discontinuities
(Cox et al. CVGIP’96; Koch ’96; Falkenhagen ’97;
Van Meerbergen, Vergauwen, Pollefeys, Van Gool IJCV ’02)
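For reference, the two similarity measures named above can be written in a few lines. This sketch covers only the per-patch matching cost, not the dynamic-programming search over the scanline; the function names and patch handling are illustrative assumptions.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized patches (lower is better)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(((a - b) ** 2).sum())

def ncc(a, b):
    """Normalized cross-correlation in [-1, 1] (higher is better)."""
    a, b = np.asarray(a, float).ravel(), np.asarray(b, float).ravel()
    a, b = a - a.mean(), b - b.mean()      # remove brightness offset
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0
```

NCC is unaffected by a gain and offset change between the two patches, whereas SSD is not, which matters when exposure differs between views.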
Hierarchical stereo matching
Downsampling (Gaussian pyramid)
Disparity propagation
Allows faster computation
Deals with large disparity ranges
Disparity map
image I(x,y)
image I´(x´,y´)
Disparity map D(x,y)
(x´,y´)=(x+D(x,y),y)
Example: reconstruct image from neighbors
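The relation (x´,y´)=(x+D(x,y),y) can be used directly to rebuild one image from its neighbor. A minimal sketch, assuming rectified single-channel images of equal size and nearest-neighbour sampling (the function name is hypothetical):

```python
import numpy as np

def reconstruct_from_neighbor(I_prime, D):
    """Rebuild I(x, y) by sampling I'(x + D(x, y), y) with nearest-neighbour lookup."""
    h, w = D.shape
    ys, xs = np.mgrid[0:h, 0:w]
    x_src = np.rint(xs + D).astype(int)
    valid = (x_src >= 0) & (x_src < w)      # pixels whose match falls inside I'
    I_rec = np.zeros_like(I_prime)
    I_rec[valid] = I_prime[ys[valid], x_src[valid]]
    return I_rec, valid
```

Pixels whose correspondence falls outside the neighbor image are left at zero and flagged in `valid`.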
Multi-view depth fusion
(Koch, Pollefeys and Van Gool. ECCV‘98)
• Compute depth for every pixel of reference image
– Triangulation
– Use multiple views
– Up- and down sequence
– Use Kalman filter
Also allows computing robust texture
Real-time stereo on GPU
(Yang and Pollefeys, CVPR2003)
• Plane-sweep stereo
• Computes Sum-of-Square-Differences (use pixel shader)
• Hardware mip-map generation for aggregation over window
• Trade-off between small and large support window
(Demo GeForce4)
150M disparity hypotheses/sec (Radeon 9700 Pro)
e.g. 512x512x20 disparities at 30Hz
GPU is great for vision too!
Dealing with specular highlights
(Yang, Pollefeys and Welch, ICCV’03)
Extend photo-consistency model to include highlights
3D surface model
Depth image
Texture image
Triangle mesh
Textured 3D
Wireframe model
Volumetric 3D integration
(Curless and Levoy, Siggraph´96)
Multiple depth images
Volumetric integration
Texture integration
patchwork texture map
Dealing with auto-exposure
(Kim and Pollefeys, submitted)
• Estimate camera's radiometric response curve, exposure and white balance changes
• Extends prior HDR work at Columbia, CMU, etc. to moving camera
[Figure: auto-exposure vs. fixed-exposure sequences; brightness transfer curve with robust estimate using DP; response curve model]
Dealing with auto-exposure
(Kim and Pollefeys, submitted)
Applications:
• Photometric alignment of textures (or HDR textures)
• HDR video
Part of Jain temple
Recorded during post-ICCV tourist trip in India
(Nikon F50; Scanned)
Example: DV video → 3D model
accuracy ~1/500 from DV video (i.e. 140kb jpegs 576x720)
Unstructured lightfield rendering
(Heigl et al. ’99)
demo
Talk outline
• Introduction
• Visual 3D modeling with a hand-held camera
– Acquisition of camera motion
– Acquisition of scene structure
– Constructing visual models
• Camera Networks
– Camera Network Calibration
– Camera Network Synchronization
– Towards Active Camera Networks…
• Conclusion
Camera Networks
• CMU’s Dome, 3D Room, etc.
• MIT’s Visual Hull
• Maryland’s Keck lab, ETHZ’s BLUE-C and more
• Recently, Shape-from-Silhouette/Visual-Hull systems
have been very popular
Camera Networks
• Offline Calibration Procedure
• Special Calibration Data
– Planar Pattern
– Moving LED
• Requires physical access to environment
• Active Camera Networks
– How do we maintain calibration?
An example
P. Sand, L. McMillan, and J. Popovic.
Continuous Capture of Skin Deformation.
ACM Transactions on Graphics 22, 3,
578-586, 2003.
• 4 NTSC videos recorded by 4 computers for 4 minutes
• Manually synchronized and calibrated using MoCap system
Can we do without explicit calibration?
• Feature-based?
– Hard to match features between very different views
– Not many features on foreground
– Background often doesn’t overlap much between views
• Silhouette-based?
– Necessary for visual-hull anyway
– But approach is not obvious
Multiple View Geometry of Silhouettes
• Frontier Points
• Epipolar Tangents
x_2^T F x_1 = 0
x'^T_2 F x'_1 = 0
[Figure: two frontier points with projections x1, x2 and x’1, x’2 on the epipolar tangents]
• Points on Silhouettes in 2 views do not correspond in
general except for projected Frontier Points
• Always at least 2 extremal frontier points per silhouette
• In general, correspondence only over two views
Calibration from Silhouettes: prior work
Epipolar Geometry from Silhouettes
• Porrill and Pollard, ’91
• Åström, Cipolla and Giblin, ’96
Structure-and-motion from Silhouettes
• Joshi, Ahuja and Ponce’95 (trinocular rig/rigid object)
• Vijayakumar, Kriegman and Ponce’96 (orthographic)
• Wong and Cipolla’01 (circular motion, at least to start)
• Yezzi and Soatto’03 (only refinement)
None is really applicable to calibrating a visual-hull system
Camera Network Calibration from Silhouettes
(Sinha, Pollefeys and McMillan, submitted)
• 7 or more corresponding frontier points needed to
compute epipolar geometry for general motion
• Hard to find on single silhouette and possibly occluded
• However, Visual Hull systems record many silhouettes!
Camera Network Calibration from Silhouettes
• If we know the epipoles, it is simple
• Draw 3 outer epipolar tangents (from two silhouettes)
• Compute corresponding line homography H^{-T} (not unique)
• Epipolar Geometry F = [e]_\times H
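A small numerical illustration of F = [e]x H and of the "not unique" remark. In this sketch e is taken to be the epipole in the second image, and a plane-induced homography stands in for the one the talk derives from three epipolar tangents; the camera pair, plane parameters and function names are all assumptions for illustration.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]x."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def F_from_epipole_and_H(e2, H):
    """F = [e2]x H, with e2 the epipole in image 2 and H a compatible homography."""
    return skew(e2) @ H

# Hypothetical calibrated pair P1 = [I | 0], P2 = [R | t]; its second epipole is t.
angle = 0.2
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([1.0, 0.3, -0.2])

def plane_homography(n, d):
    """Homography induced by the plane n . X = d (camera-1 coordinates)."""
    return R + np.outer(t, n) / d

# Two different compatible homographies give the same fundamental matrix.
F_a = F_from_epipole_and_H(t, plane_homography(np.array([0.0, 0.0, 1.0]), 5.0))
F_b = F_from_epipole_and_H(t, plane_homography(np.array([0.2, -0.1, 1.0]), 3.0))
```

Any part of H that lies along the epipole is annihilated by [e]x, so the choice of H among the compatible ones does not change F.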
Let’s just sample: RANSAC
• Repeat
– Generate random hypothesis for epipoles
– Compute epipolar geometry
– Verify hypothesis and count inliers (use conservative threshold, e.g. 5 pixels, but abort early if not promising)
until satisfying hypothesis
• Refine hypothesis
– minimize symmetric transfer error of frontier points
– include more inliers (use strict threshold, e.g. 1 pixel)
until error and inliers stable
We’ll need an efficient representation as we are likely to have to do many trials!
A Compact Representation for Silhouettes
Tangent Envelopes
• Convex Hull of Silhouette.
• Tangency Points for a discrete set of angles.
• Approx. 500 bytes/frame. Hence a whole video sequence easily fits in memory.
• Tangency Computations are efficient.
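A tangent envelope of this kind can be sketched as a support-point lookup: for each sampled direction, store the silhouette point that a tangent line with that normal touches. The function name, the number of angles and the storage as point indices are assumptions; the talk does not specify its exact data layout.

```python
import numpy as np

def tangent_envelope(points, n_angles=64):
    """For each sampled direction, find the silhouette point touched by the tangent line.

    points: (N, 2) array of silhouette boundary (or convex-hull) points.
    Returns the sampled angles and the index of the tangency point per angle.
    """
    pts = np.asarray(points, dtype=float)
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    normals = np.stack([np.cos(angles), np.sin(angles)], axis=1)   # (n_angles, 2)
    # The support point in direction n maximizes the dot product p . n.
    support = pts @ normals.T                                      # (N, n_angles)
    return angles, support.argmax(axis=0)
```

Only convex-hull points can ever be selected, so the table stays small regardless of how detailed the silhouette boundary is.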
Epipole Hypothesis and Computing H
Model Verification
Remarks
• RANSAC allows efficient exploration of 4D parameter space
(i.e. epipole pair) while being robust to imperfect silhouettes
• Select key-frames to avoid having too many identical
constraints (when silhouette is static)
Reprojection Error and Epipole Hypothesis Distribution
40 best hypotheses out of 30000
Residual Distribution
– Hypotheses along y-axis
– Sorted Residuals along x-axis.
– Pixel Error along z-axis.
Typically, 1 in 5000 samples converges to the global minimum after
non-linear refinement (corresponds to 15 sec. computation time)
Computed Fundamental Matrices
F computed directly (black epipolar lines)
F after consistent 3D reconstruction (color)
From epipolar geometry to full calibration
• Not trivial because only matches between two views
• Approach similar to Levi et al. CVPR’03, but practical
• Key step is to solve for camera triplet
(v is 4-vector )
(also linear in v)
Choose P3 corresponding to
closest
• Assemble complete camera network
• projective bundle, self-calibration, metric bundle
Experiment
4 video sequences at 30 fps.
All F matrices computed from silhouettes
Full calibration computed
Metric Cameras and Visual-Hull Reconstruction from 4 views
Final calibration quality comparable to explicit calibration procedure
What if the videos are unsynchronized?
For videos recorded at a constant frame rate, the same
constraints are valid, up to some extra unknown
temporal offsets
Synchronization and calibration from silhouettes
(Sinha and Pollefeys, submitted)
• Add a random temporal offset to RANSAC hypothesis
generation, sample more
• Use multi-resolution approach:
– Key-frames with slow motion, rough synchronization
– Add key-frames with faster motion, refine synchronization
Synchronization experiment
• Total temporal offset search range [-500,+500] (i.e. ±15s)
• Unique peaks for correct offsets
• Possibility for sub-frame synchronization
Synchronize camera network
• Consider oriented graph with offsets as branch value
• For consistency loops should add up to zero
• MLE by minimizing \sum_{ij} \frac{(t_{ij} - \hat{t}_{ij})^2}{\sigma_{ij}^2}
[Figure: camera graph with branch offsets +3, +8, -5, 0, +6, +2, in frames (= 1/30 s); ground truth]
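Under Gaussian noise, such an MLE reduces to weighted least squares over per-camera time shifts. The sketch below is one possible formulation, assuming each branch (i, j) carries a measured offset t_j - t_i with standard deviation sigma_ij and fixing camera 0 as the time reference; the function name and the example edge values are hypothetical, not the numbers from the experiment.

```python
import numpy as np

def consistent_offsets(n_cameras, edges, sigmas=None):
    """Weighted least-squares camera time shifts from pairwise offset measurements.

    edges: list of (i, j, t_ij), where t_ij is the measured offset t_j - t_i in frames.
    Camera 0 is fixed at 0 to remove the global time shift.
    """
    sigmas = np.ones(len(edges)) if sigmas is None else np.asarray(sigmas, float)
    A = np.zeros((len(edges), n_cameras))
    b = np.zeros(len(edges))
    for row, (i, j, t_ij) in enumerate(edges):
        A[row, i], A[row, j] = -1.0, 1.0
        b[row] = t_ij
    # Divide each equation by its sigma, then drop the column of the reference camera.
    A, b = A / sigmas[:, None], b / sigmas
    t_rest, *_ = np.linalg.lstsq(A[:, 1:], b, rcond=None)
    return np.concatenate([[0.0], t_rest])
```

Because every branch value is re-derived from per-camera shifts, the offsets around any loop of the resulting graph add up to zero by construction.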
Towards active camera networks
• Provide much more flexibility by making use of pan-tilt-zoom range, networked cameras
• (maintaining) calibration is a challenge
up to 3Gpix!
Calibration of PTZ cameras
similar to Collins and Tsin ’99, but with varying radial distortion
Conclusion
• 3D models from video, more flexibility, more general
• Camera networks synchronization and calibration,
just from silhouettes, great for visual-hull systems
Future plans
• Deal with sub-frame offset for VH reconstruction
• Extend to active camera network (PTZ cameras)
• Extend to asynchronous video streams (IP cameras)
Acknowledgment
• NSF Career, NSF ITR on 3D-TV, DARPA seedling, Link foundation
• EU ACTS VANGUARD, ITEA BEYOND, EU IST MURALE, FWO-Vlaanderen
• Sudipta Sinha, Ruigang Yang, Seon Joo Kim, Andrew Raij, Greg Welch, Leonard McMillan (UNC)
• Maarten Vergauwen, Frank Verbiest, Kurt Cornelis, Jan Tops, Luc Van Gool (KULeuven), Reinhard Koch (UKiel), Benno Heigl