Tracking and Data Fusion for 4D Visualization

Ulrich Neumann
Suya You
Computer Science Department
Integrated Media Systems Center
University of Southern California
MURI Review
June 2002
Research Goals
Combine all manner of 3D models, images, video, and data in
a coherent dynamic visualization to support spatio-temporal
data understanding, information extraction, and dynamic
scene change detection
Needs:
[Diagram: 3D model sources (LiDAR, laser, stereo, …), video/image sources (satellite, aerial, ground), and data sources (text, hyper-links, communications, …) are combined through tracking, data fusion, and model refinement into Information Fusion & Visualization]
Research Highlights
We pursue basic algorithm research and testbed implementations
that are feasible with current or near-term technology
Robust tracking for outdoor, unprepared environments
– Portable hybrid tracking and data acquisition/fusion system
– Natural feature tracking using vision sensors
Fusion of 2D video/images and 3D models
– LiDAR data tessellation and model reconstruction
– Real-time video texture projection and visualization
6DOF Auto-calibration technology
– Detect and calibrate scene features (points and lines) to refine
models and aid in tracking
Tracking in Unprepared Environments
• People with sensors (or unmanned sensors) moving in the
environment provide textures and data for visualizations…
Where are they? Where are they looking?
• Need 6DOF pose tracking over wide area outdoors
– Varying sensor data availability and data rates
• vision, GPS, inertial sensors
– Varying certainty of measurements
• spatial and temporal noise and precision
– Fusion models and algorithms
• underdetermined system needs constraints
• real-time acquisition and execution on portable systems
• Developed two tracking systems
– Portable hybrid tracking and real-time data acquisition system
– Natural feature tracking for computing motion of video sensor
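A minimal sketch of one fusion filter such a system can use: a constant-velocity Kalman filter that folds in noisy DGPS position fixes (Python/numpy). The state layout, noise values, and function name are illustrative assumptions, not the system's actual implementation.

import numpy as np

def kf_step(x, P, z, dt, q=0.5, r=0.05):
    """One predict/update cycle of a constant-velocity position filter.
    x : state [px, py, pz, vx, vy, vz]   P : 6x6 covariance
    z : DGPS position fix [px, py, pz] in metres
    q : process noise, r : measurement noise (1-sigma, illustrative)
    """
    F = np.eye(6)
    F[:3, 3:] = dt * np.eye(3)                # position += velocity * dt
    Q = (q ** 2) * np.eye(6)
    H = np.hstack([np.eye(3), np.zeros((3, 3))])
    R = (r ** 2) * np.eye(3)

    x = F @ x                                 # predict
    P = F @ P @ F.T + Q
    y = z - H @ x                             # innovation from DGPS fix
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ y                             # update
    P = (np.eye(6) - K @ H) @ P
    return x, P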
Portable Acquisition System
[System diagram: DGPS receiver, 3DOF gyro sensor, and stereo camera head connect (via Com1, Com2, and FireWire) to data fusion and storage]
System configuration:
• RTK differential GPS (Ashtech Z-Sensor base/mobile)
• 3D inertial tracker (InterSense IS300)
• Real-time stereo head (SRI International)
• PIII 866 MHz laptop
Real-Time Data Acquisition
• Hybrid DGPS and inertial sensors provide real-time 6DOF pose tracking
• High-resolution digital camera pairs capture video streams for texture projection and façade reconstruction
• Complete self-contained system in a backpack
• Acquisition in real-time (all data streams are timestamped and synchronized; a small alignment sketch follows)
• Includes data capture and playback tools
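A small sketch of how the timestamped streams can be aligned for synchronized playback: for each video frame time, pick the nearest GPS and inertial samples. The array names (sample_t, gps_t, gps_xyz, frame_t) are hypothetical.

import numpy as np

def nearest_index(sample_t, query_t):
    """Index of the sample whose timestamp is closest to each query time.
    sample_t : sorted 1-D array of sample timestamps
    query_t  : 1-D array of query timestamps (e.g. video frame times)
    """
    i = np.searchsorted(sample_t, query_t)
    i = np.clip(i, 1, len(sample_t) - 1)
    prev_closer = (query_t - sample_t[i - 1]) < (sample_t[i] - query_t)
    return np.where(prev_closer, i - 1, i)

# usage sketch: gps_for_frames = gps_xyz[nearest_index(gps_t, frame_t)]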
Natural Feature Tracking
Using Vision Sensor
Problem
Most vision tracking methods require a priori knowledge about the environment
• Pre-calibrated landmarks
• Active sensors
• Scene models
Active control or modification of an outdoor environment is unrealistic
Our approach
• Detect and use naturally occurring features
• Robust tracking of 1D (point) and 2D (region) features
• SFM (structure from motion) from tracked features
Neither camera ego-motion nor structure information is known
Natural Feature Tracking
Using Vision Sensor
[Pipeline diagram: video streams → 2D feature detection (new features) → feature verification → feature list → 2D feature tracking → pose & structure estimation → pose and structure outputs]
The approach provides camera pose and structure estimates
• The resulting pose tracking can be used directly for augmented reality overlays
• Structure estimates allow continual tracking
• Structure can also be used to improve/refine existing models
• The framework allows further sensor fusion (GPS, gyroscopes) for absolute pose reconstruction
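An illustrative 2D natural-feature tracker in the spirit of the detection/tracking stages above, using OpenCV corner detection plus pyramidal Lucas-Kanade optical flow (Python). The thresholds and replenishment policy are assumptions, and the pose/structure (SFM) stage is not shown.

import cv2
import numpy as np

def track_features(prev_gray, gray, prev_pts):
    """Track existing corners; detect new ones if too few survive.
    prev_pts : (N,1,2) float32 corners, seeded on the first frame with
               cv2.goodFeaturesToTrack(prev_gray, 200, 0.01, 10)
    """
    pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, prev_pts,
                                              None, winSize=(21, 21),
                                              maxLevel=3)
    good = pts[status.ravel() == 1]            # keep verified tracks
    if len(good) < 50:                         # replenish the feature list
        new = cv2.goodFeaturesToTrack(gray, maxCorners=200,
                                      qualityLevel=0.01, minDistance=10)
        if new is not None:
            good = np.vstack([good.reshape(-1, 1, 2), new])
    return good.reshape(-1, 1, 2).astype(np.float32)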
Natural Feature
Tracking
Vision Tracking Used for AR Overlays
Fusion of 2D video/images and 3D models
Imagine dozens of video streams from people, UAVs,
and sensors distributed and moving through scene….
• Use sensor models and 3D models of the scene to integrate
video/image data from different sources
• Visualize the imagery in unique, innovative ways that
maximize information extraction and comprehension
• Produce dynamic visualizations from arbitrary viewpoints
- Static textures: pre-computed mapping of images to models
- Dynamic projections onto models – like “slide projectors”
3D Model: LiDAR from flyover
• LiDAR provides accurate 3D position samples (sub-meter) with cm height accuracy
- Use as the base model/context for video visualization
• Raw LiDAR comes as a 3D point cloud
- Need tools for data resampling, tessellation, and model reconstruction
Data Tessellation and
Model Reconstruction
• Data tessellation
– Data re-sampling (irregular sample cloud to regular grid; a sketch follows below)
– Surface interpolation (hole filling)
• 3D models represented as triangle meshes
- Easily converted to many other geometric representations
- Supports many level-of-detail techniques
- Easily add photometric information (texture projections)
- Hardware acceleration for fast image rendering
• Use VRML as the standard model representation
- Supports web applications and an open tool base
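A sketch of the re-sampling step under simple assumptions (0.5 m grid cells, highest return kept per cell, neighbour-mean hole filling); the project's actual tessellation tools are not described in the slides. Triangulating the resulting grid into a mesh is then a matter of emitting two triangles per cell.

import numpy as np

def resample(points, cell=0.5):
    """points: (N,3) irregular x,y,z LiDAR samples -> regular height grid."""
    xy = ((points[:, :2] - points[:, :2].min(axis=0)) / cell).astype(int)
    w, h = xy.max(axis=0) + 1
    grid = np.full((h, w), np.nan)
    for (cx, cy), z in zip(xy, points[:, 2]):  # plain loop, kept simple
        if np.isnan(grid[cy, cx]) or z > grid[cy, cx]:
            grid[cy, cx] = z                   # keep the highest return
    # hole filling: replace empty cells with the mean of valid neighbours
    for y, x in zip(*np.where(np.isnan(grid))):
        patch = grid[max(0, y - 1):y + 2, max(0, x - 1):x + 2]
        if not np.all(np.isnan(patch)):
            grid[y, x] = np.nanmean(patch)
    return grid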
Data Resampling and
Model Reconstruction
Resampled range image
Reconstructed 3D model
LiDAR data acquired for the USC campus
Image/Video Texture Projection
 Texture Projection vs. Texture Map
– dynamic vs. static
– texture image and position both change with each video frame
[Diagram: image texture projected onto the 3D model]
Virtual Texture Projector
• Dynamic view and placement control during visualization
• Update pose and “paint” the scene each frame
• HW texture transformation during rendering
• Real-time visualization with HW accelerator
(Video)
Dynamic Texture Projection
• Putting it all together…
− Accurate 3D models
− Accurate 3D sensor models
− calibration and 6DOF tracking
− Projection transformation computes the texture mapping during rendering (see the sketch after this slide)
− Visibility and occlusion processing
− a multi-pass algorithm ensures that only visible surfaces are textured
video texture projected on USC model (GPS/inertial tracking)
video texture projected on USC model (vision/GPS/inertial tracking)
video texture projected on USC LiDAR/model (vision/GPS/inertial)
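A sketch of the projective texture transform used above: every model vertex is projected through the tracked camera pose to obtain its texture coordinates, the same mapping the hardware texture matrix applies each frame. The visibility/occlusion (multi-pass) test is omitted, and K, R, t stand for assumed camera intrinsics and pose.

import numpy as np

def projector_uv(vertices, K, R, t, width, height):
    """vertices: (N,3) world points -> (N,2) texture coords in [0,1]."""
    cam = R @ vertices.T + t.reshape(3, 1)     # world -> camera frame
    pix = K @ cam                              # camera -> pixel coords
    uv = (pix[:2] / pix[2]).T                  # perspective divide
    uv = uv / np.array([width, height])        # normalise to [0,1]
    in_front = cam[2] > 0                      # cull points behind projector
    return uv, in_front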
LiDAR/Projection Rendering System
• Visualizations from arbitrary viewpoints
• User control of viewpoint as well as image inclusion (blending and projection parameters)
• Multi-texture projectors simultaneously visualize many image sources projected on the same model (a blending sketch follows)
• Hardware acceleration with a GeForce graphics card using pixel-shader features
• Visualization wall (8x10-foot tracked stereo)
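A sketch of how contributions from several projectors covering the same surface point might be combined with user blending weights, roughly what the multi-texture/pixel-shader path evaluates per pixel; the weighting scheme itself is an assumption.

import numpy as np

def blend(samples, weights, base_color):
    """samples: (P,3) RGB from P projectors, NaN rows where not covered;
    weights: (P,) user blending factors; base_color: (3,) model colour."""
    w = np.where(np.isnan(samples[:, 0]), 0.0, weights)   # drop misses
    if w.sum() == 0:
        return base_color                      # no projector sees this point
    return np.nansum(samples * w[:, None], axis=0) / w.sum()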
6 DOF Autocalibration
• Autocalibration computes 3D scene information during tracking
– Allows tracking in regions beyond models or landmarks
– Provides the scale-factor data needed to recover the 6th DOF that is lacking from vision-only tracking
– Provides absolute pose data for stabilizing the multisensor data fusion
– Provides estimated 3D information about structural features (points, lines, and edges) to improve/refine models
6 DOF Autocalibration
• Camera pose and features are estimated simultaneously along the motion path
– Point features (developed in the past 2-3 years)
– Line features (this past year)
• Pose estimation from 3D calibrated lines or points
• Autocalibration of 3D lines
– Unique (minimal) line representation
• Four parameters: N1, N2 of the form (x, y, 1), defined for given camera poses (R1,T1) and (R2,T2), uniquely determine a 3D line L
– EKF-based estimator (see the sketch below)
• Update per feature or measurement
• Adapts to different sample rates
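A sketch of the per-feature EKF update, shown for the simpler point-feature case; the 3D-line case follows the same structure with the 4-parameter line state. The numerical Jacobian and pixel-noise level are illustrative choices.

import numpy as np

def project(x, K, R, t):
    """Project a 3D point x through camera intrinsics K and pose (R, t)."""
    p = K @ (R @ x + t)
    return p[:2] / p[2]

def ekf_feature_update(x, P, z, K, R, t, r_pix=1.0):
    """x: (3,) feature estimate, P: 3x3 covariance, z: (2,) measured pixel."""
    h = project(x, K, R, t)
    H = np.zeros((2, 3))                       # Jacobian of the projection
    eps = 1e-4
    for j in range(3):
        dx = np.zeros(3)
        dx[j] = eps
        H[:, j] = (project(x + dx, K, R, t) - h) / eps
    S = H @ P @ H.T + (r_pix ** 2) * np.eye(2)
    Kg = P @ H.T @ np.linalg.inv(S)
    x = x + Kg @ (z - h)                       # measurement update
    P = (np.eye(3) - Kg @ H) @ P
    return x, P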
Simulation Results
[Plots vs. frame ID: RMS error for camera orientations (degrees), RMS error for camera positions (m), and reprojection errors for test points (pixels). Simulation with a 100-inch volume and 50 lines.]
Autocalibration Results
Tracked line features are marked in green; red lines are the projections of the autocalibrated lines.
A virtual lamp and chair are inserted into the real scene based on the estimated camera pose.
Future Plans
• Automate steps in data capture and fusion for vision/GPS/inertial tracking and visualization
– do 8-12 captures around 3-4 buildings on campus
– visualize streams simultaneously
– correspondences from video to LiDAR/models
• edges, windows, …
• Model refinement (w/Cal, GT)
– constrained autocalibration for estimating building features
• simplification and definition of edges, planar faces, windows, …
• Temporal data management (w/Cal, GT, UCSC, USyr)
– accumulation of persistent textures
– user management and temporal blending (update)
Dynamic Texture Projection
 Benefits and capabilities
 Real-time multi-source data fusion
– Models, imagery, video, maps ...
 Enhanced data understanding
– Dynamic control of viewpoint as well as image inclusion
– Rapid update to reflect most recent information
– Highly interactive real-time perspective view capability
 3D data editing using photogrammetric methods
– Enables reuse of models and incremental refinement
Projection on LiDAR Data
[Diagram: sensor image planes and view frustums projecting image textures onto the LiDAR model]
Aerial view of projected image texture (campus of Purdue University)