KinectFusion: Real-Time Dense Surface Mapping and Tracking
IEEE International Symposium on Mixed and Augmented Reality (ISMAR) 2011, Science and Technology Proceedings (Best Paper Award)

Target
[Figure panels: noisy data, normal maps, greyscales]

Outline
• Introduction
• Motivation
• Background
• System diagram
• Experiment results
• Conclusion

Introduction
• Passive camera
• Simultaneous localization and mapping (SLAM)
• Structure from motion (SFM)
  – MonoSLAM [8] (ICCV 2003)
  – Parallel Tracking and Mapping [17] (ISMAR 2007)
• Disparity
  – Depth model [26] (2010)
• Pose of camera from depth models [20] (ICCV 2011)

Motivation
• Active camera: Kinect sensor
• Pose estimation from depth information
• Real-time mapping
  – GPU

Background: Camera Sensor
• Kinect sensor
  – Infrared light
• Input information
  – RGB image (1)
  – Raw depth data
  – Calibrated depth image (2)
[Figures: (1) RGB image, (2) calibrated depth image]

Background: Pose Estimation
• Depth maps from two views
• Iterative closest points (ICP) [7]
• Point-plane metric [5]
[Figure: ICP]

Background: Pose Estimation
• Projective data association algorithm [4]

Background: Scene Representation
• Volume of space
• Signed distance function [7]

System Diagram

Pre-defined parameter
• Pose estimation with sensor camera
• Raw depth map R_k

Raw data
• Calibrated depth image R_k(u)
[Diagram labels: K, R_k, R_k(u)]

Surface Measurement
• Reduce noise
• Bilateral filter
[Figure: with bilateral filter vs. without bilateral filter]

Surface Measurement
• Vertex map
• Normal vector
• Define camera pose
• Camera frame k is transformed into the global frame

System Diagram

Surface Reconstruction: Operating Environment
[Figure: L × L × L volume, L^3 voxel reconstruction]

Surface Reconstruction
• Signed distance function
• Truncated signed distance function (TSDF)
[Figure: sensor and surface along axis x; F_k(p) takes values from +v through 0 to -v]
• Weighted running average
• Dynamic object motion

System Diagram

Surface Prediction from Ray Casting
• Store
• Ray casting marches from +v to the zero-crossing
[Figure: corresponding ray]

Surface Prediction from Ray Casting
• Speed-up
  – Ray skipping
  – Truncation distance
[Figure: sensor and surface along axis x]

System Diagram

Sensor Pose Estimation
• Previous frame
• Current frame
• Assume small motion between frames
• Fast projective data association algorithm
  – Initialized with previous frame pose

Sensor Pose Estimation
• Vertex correspondences
• Point-plane energy
• For z > 0
• Modified equation

Experiment Results
• Reconstruction resolution: 256^3
• Test camera pose
• Kinect camera rotates and captures 560 frames over 19 seconds on a turntable

Experiment Results
• Using every 8th frame

Experiment Results: Processing Time
• Pre-processing raw data
• Data associations
• Pose optimisations
• Raycasting the surface prediction
• Surface measurement integration

Demo

Conclusion
• Robust tracking of camera pose by aligning all depth points
• Parallel algorithms for both tracking and mapping

References
[4] G. Blais and M. D. Levine. Registering multiview range data to create 3D computer objects. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 17(8):820–824, 1995.
[5] Y. Chen and G. Medioni. Object modeling by registration of multiple range images. Image and Vision Computing (IVC), 10(3):145–155, 1992.
[7] B. Curless and M. Levoy. A volumetric method for building complex models from range images. In ACM Transactions on Graphics (SIGGRAPH), 1996.
[8] A. J. Davison. Real-time simultaneous localization and mapping with a single camera. In Proceedings of the International Conference on Computer Vision (ICCV), 2003.
[17] G. Klein and D. W. Murray. Parallel tracking and mapping for small AR workspaces. In Proceedings of the International Symposium on Mixed and Augmented Reality (ISMAR), 2007.
[20] R. A. Newcombe, S. J. Lovegrove, and A. J. Davison. DTAM: Dense tracking and mapping in real-time. In Proceedings of the International Conference on Computer Vision (ICCV), 2011.
[26] J. Stuehmer, S. Gumhold, and D. Cremers. Real-time dense geometry from a handheld camera. In Proceedings of the DAGM Symposium on Pattern Recognition, 2010.
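
Supplementary sketch: Surface Reconstruction and Ray Casting along one ray

The Surface Reconstruction slides describe a truncated signed distance function fused with a weighted running average, and the Ray Casting slides describe marching from +v to the zero-crossing. The following is a minimal one-dimensional illustration of both ideas along a single sensor ray (the "axis x" of the figures), not the authors' GPU implementation. The names `truncated_sdf`, `integrate`, and `zero_crossing`, the truncation distance `mu`, the cap `max_weight`, and the unit weight per measurement are all assumptions made for this example.

```python
def truncated_sdf(voxel_x, surface_depth, mu):
    """Signed distance from a voxel to the measured surface, scaled by mu.

    Positive values lie between the sensor and the surface, negative
    values lie behind it. Values in front are clamped to +1. Voxels more
    than mu behind the surface return None and are left untouched.
    """
    sdf = surface_depth - voxel_x
    if sdf < -mu:
        return None
    return min(1.0, sdf / mu)


def integrate(tsdf, weights, voxel_xs, surface_depth, mu, max_weight=64.0):
    """Fuse one depth measurement into the volume with a running average."""
    for i, x in enumerate(voxel_xs):
        f = truncated_sdf(x, surface_depth, mu)
        if f is None:
            continue
        w_old = weights[i]
        # Weighted running average with an assumed unit weight per frame.
        tsdf[i] = (w_old * tsdf[i] + f) / (w_old + 1.0)
        # Capping the weight lets the model adapt when the scene changes.
        weights[i] = min(w_old + 1.0, max_weight)


def zero_crossing(tsdf, weights, voxel_xs):
    """March away from the sensor and return the first + to - crossing."""
    for i in range(len(voxel_xs) - 1):
        if weights[i] == 0.0 or weights[i + 1] == 0.0:
            continue  # skip voxels that were never observed
        f0, f1 = tsdf[i], tsdf[i + 1]
        if f0 > 0.0 >= f1:
            # Linear interpolation between the two neighbouring voxels.
            t = f0 / (f0 - f1)
            return voxel_xs[i] + t * (voxel_xs[i + 1] - voxel_xs[i])
    return None


# Voxels every 5 cm along the ray, from 0 m to 2 m.
voxel_xs = [0.05 * i for i in range(41)]
tsdf = [0.0] * len(voxel_xs)
weights = [0.0] * len(voxel_xs)

# Four noisy depth readings (in metres) of a surface near 1 m.
for depth in [1.02, 0.98, 1.01, 0.99]:
    integrate(tsdf, weights, voxel_xs, depth, mu=0.1)

surface = zero_crossing(tsdf, weights, voxel_xs)
print("fused surface depth:", surface)
```

Averaging the per-frame distances before locating the zero-crossing is what smooths out the noise in the individual readings; in this toy case the fused crossing lands at the mean of the four measurements.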