Elias Benussi – Monocular SLAM Project: personal contribution

DTAM: Dense Tracking and Mapping

INTRODUCTION

DTAM is a recent implementation of monocular SLAM. DTAM stands for Dense Tracking and Mapping, reflecting the way the algorithm uses mutually dependent tracking and mapping to localise the camera in space and to map the surrounding environment. As with PTAM, the algorithm is self-sustaining because of the recursive way in which it handles mapping and tracking: at each stage the calculation only needs the raw data just obtained from the camera and the output from the previous stage [3.1]. This, together with improvements in the supporting hardware, allows its use in real-time applications.

The improvement with respect to PTAM originates from the dense mapping technique used. The mapping algorithm does not just rely on simple triangulation and stereo initialisation techniques, but also creates an inverse depth 3D map of the observed surfaces and feeds it to the tracking system [3.1]. This allows accurate and detailed mapping and simulation of the environment and addresses the weaknesses of point-based systems, e.g. image blur caused by rapid motion [3.1]. Furthermore, before the calculations are approved to be fed back into the algorithm, the photoconsistency of the data is checked against previous depth estimates and given a reliability rating [3.1]. This diminishes the impact of random errors and imprecisions. The tracking process follows the movement of the camera so that the algorithm is always aware of its current relative position.

INVERSE DEPTH MAPPING

Inverse depth mapping is a technique used in SLAM systems which reduces delays and uncertainties when parametrising points. This method also fits within the standard Extended Kalman Filter framework [3.3]. The key concept is that each point is parametrised with respect to the position from which it was first observed. This allows a great deal of precision even for features with low parallax [3.4].

The parallax angle is a key concept in distance measurement and is used in many different fields. A common application is in astronomy, to measure the distance of stars from Earth. In simple words, the parallax angle is half the angle a telescope would need to be rotated by in order to observe a star from a particular location on Earth at the two moments of maximum displacement (that is, when the line from the Earth to the Sun is perpendicular to the line from the Sun to the star under consideration). In the picture below this is represented by p''.

Image source: [3.5]

Inverse depth mapping has two main advantages [3.4]:

• The algorithm is undelayed, meaning that features are immediately used to improve estimates, regardless of their weights [3.4].
• Even though the initial frames are used as a reference, these are continuously updated, thus increasing accuracy.

One downside is that it needs a 6-D representation rather than a classic 3-D XYZ Euclidean one [3.3]. This obviously contributes to making the DTAM algorithm more computationally complex.
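To make the 6-D representation concrete, here is a minimal sketch (in Python, not taken from the cited sources) that converts an inverse depth feature into a 3-D Euclidean point. It assumes the azimuth/elevation convention used in [3.3]; the function name and the example values are illustrative only.

```python
import numpy as np

def inverse_depth_to_xyz(x0, y0, z0, theta, phi, rho):
    """Convert the 6-D inverse depth parametrisation (x0, y0, z0, theta, phi, rho)
    of a feature into a 3-D Euclidean point.

    (x0, y0, z0) is the optical centre of the camera when the feature was first
    observed, (theta, phi) are the azimuth and elevation of the viewing ray,
    and rho is the inverse of the depth along that ray.
    """
    # Unit direction of the viewing ray (angle convention as in [3.3])
    m = np.array([np.cos(phi) * np.sin(theta),
                  -np.sin(phi),
                  np.cos(phi) * np.cos(theta)])
    # The point lies at depth 1/rho along the ray from the first-observation centre
    return np.array([x0, y0, z0]) + (1.0 / rho) * m

# Example: a feature first seen from the origin, straight ahead, at about 5 m depth
point = inverse_depth_to_xyz(0.0, 0.0, 0.0, theta=0.0, phi=0.0, rho=0.2)
print(point)  # -> [0. 0. 5.]
```

Note how a very distant (low parallax) feature simply corresponds to a small rho, so the representation stays well-behaved instead of requiring a huge, uncertain depth value.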
DENSE MAPPING

The dense mapping part of the algorithm can be divided into three substeps [3.2]:

1. The first step is to calculate the photoconsistency of the newly collected data with respect to the reference. The reference is built once, at the start of the algorithm, from the initial measurements via a PTAM-like method; this initialisation is skipped in later calls of the algorithm. Photoconsistency is a way to weight the importance of the newly gathered data, so that they influence the reconstruction only if they are consistent enough with what is already known. What is problematic in this filtering process is that the system is not linear, and linearising it is computationally very expensive. To avoid this, approximations are made by introducing auxiliary variables whose values can be optimised via heuristic search [3.1].

2. The next step is to create (and subsequently update) an inverse depth map, as described above.

3. Finally, the information is combined to improve the algorithm's ability to make accurate predictions for future data, based on the camera movements handled during tracking.

Image source: [3.2]

DENSE TRACKING

As mentioned above, the tracking process must both localise the camera and associate with that position the change in the image of the environment. This is done in two steps: estimation and refinement.

• Estimating the pose of the camera in real time is done by simulating a motion that matches the live video model and then extrapolating the best-fitting motion parameters, in a process called constrained inter-frame rotation estimation [3.1]. This uses the results given by the mapping in order to assess the reliability of the estimate.
• The refinement of the calculation relies on an accurate 6DOF (six degrees of freedom) full pose refinement, again against the live model provided by the mapping [3.1]. This process is more accurate but less stable, and is therefore performed second.

CONCLUSION

As illustrated above, the algorithm performs very well in terms of accuracy and gives much more accurate results than previous methods. Ironically, this procedure would have been considered disadvantageous in the past, since it needs to perform texture-mapping of a scene modelled by millions of vertices, itself composed of depth maps built from bundles of frames. All this makes it a very computationally expensive algorithm. Nowadays, though, with the powerful hardware available [3.1], this is no longer an issue, and its better accuracy has been exploited even in real-time applications.

PARTICLE FILTER SLAM

Global approach for the density function

INTRODUCTION

Most previous SLAM techniques made use of a Bayesian framework, but in real-world applications two main issues occur. Most systems are dynamic and thus often need nonlinear models for the process and the measurements, which creates some problems. Moreover, the noise resulting from measurement and processing can be non-Gaussian. In these conditions a standard Kalman Filter tends to perform quite poorly [3.6]. This led to the creation of more sophisticated algorithms.

Nonlinear filters can be classified into two categories. The first uses a local approach to approximate the probability density function of the sampled particles; the most prevalent example, mentioned several times in other pages, is the Extended Kalman Filter (EKF) [3.6]. The other is a global approach, which is what the Particle Filter method uses: it approximates the posterior density function by some particular form [3.6]. Like its local equivalents, it bases its accuracy on modelling a large number of samples, which makes it quite computationally expensive.

OVERVIEW

The implementation was very convenient at the time of its invention, because it made it possible to do without costly depth mappings. Nowadays, however, thanks to more powerful hardware, this is not necessarily an advantage anymore.
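Before looking at the adaptive variants discussed below, the following sketch shows one predict-weight-resample cycle of a basic (bootstrap) particle filter, written in Python. The random-walk motion model and the function names are assumptions chosen for illustration; this is not the specific adaptive filter of [3.6] nor the camera tracker of [3.7].

```python
import numpy as np

def particle_filter_step(particles, motion_noise, measurement, likelihood):
    """One predict-weight-resample cycle of a basic (bootstrap) particle filter.

    particles    : (N, d) array of state hypotheses (e.g. candidate camera poses)
    motion_noise : standard deviation of the random-walk used as a simple motion model
    measurement  : the latest observation
    likelihood   : function (particle, measurement) -> unnormalised weight
    """
    n = len(particles)

    # 1. Predict: diffuse every hypothesis with a simple random-walk motion model
    particles = particles + np.random.normal(0.0, motion_noise, particles.shape)

    # 2. Weight: score each hypothesis against the new measurement
    weights = np.array([likelihood(p, measurement) for p in particles], dtype=float)
    weights /= weights.sum()

    # 3. Resample: draw hypotheses in proportion to their weights, so poorly
    #    matching particles are gradually discarded
    indices = np.random.choice(n, size=n, p=weights)
    particles = particles[indices]
    weights = np.full(n, 1.0 / n)  # weights are reset after resampling
    return particles, weights

# Example: track a 1-D state observed with Gaussian noise
gauss = lambda p, z: np.exp(-0.5 * (p[0] - z) ** 2)
parts = np.zeros((500, 1))
parts, w = particle_filter_step(parts, motion_noise=0.1, measurement=0.3, likelihood=gauss)
```

In a camera-tracking setting such as [3.7], roughly speaking, each particle would hold a candidate camera pose and the likelihood would measure how well the known 3D model projects into the current frame.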
During the implementation of an adaptive particle filter there are two main factors to keep in mind [3.6]:

• During the selection process, care must be taken that not too many samples with a low weight are ignored. This is done by means of a likelihood function which, being very easy to calculate, proves to be an efficient solution [3.7]. The reweighting process is shown below in image b.

Image source: [3.7]

• The design of a distribution that facilitates predictive sampling, in order to achieve a sufficient overlap with the true state density function [3.6].

Image source: [3.7]

In addition:

• At each step of the algorithm the correlation value is kept high in order to obtain a sufficient number of weighted particles [3.7].
• However, since an excessively large number of samples (or too high a value for the correlation threshold ε [3.7]) would slow down the algorithm considerably, a process of particle annealing is performed: particles are iteratively focused onto potential modes while the value of ε is reduced.

CONCLUSION

This model performs differently from local approaches, and which is preferable depends on the noise level in the measurements. If the noise is lower than a certain threshold, then a local approach, like those based on the EKF, will perform better than this model [3.6]. However, with very noisy systems this global approach yields better accuracy, making it useful in real-life applications where measurements are hard to take. Although nowadays this system may seem obsolete, since today's powerful hardware allows us to make use of complex depth and texture maps, at the time of its creation this algorithm was an optimal solution for real-time simulations.

CREDITS AND REFERENCES

REFERENCES

[3.1] Newcombe, R. A.; Lovegrove, S. J.; Davison, A. J., "DTAM: Dense Tracking and Mapping in Real-Time," Computer Vision (ICCV), 2011 IEEE International Conference on, pp. 2320-2327, 6-13 Nov. 2011.
[3.2] DTAM slides. http://www.slideserve.com/lorie/dtam-dense-tracking-andmapping-in-real-time-newcombe-lovegrove-davison-iccv11 (Accessed 17.03.2015)
[3.3] Civera, J.; Davison, A. J.; Montiel, J., "Inverse Depth Parametrization for Monocular SLAM," Robotics, IEEE Transactions on, vol. 24, no. 5, pp. 932-945, Oct. 2008.
[3.4] Inverse depth slides. http://cms.brookes.ac.uk/research/visiongroup/talks/montiel/InverseDepthMonocularSLAM.pdf (Accessed 17.03.2015)
[3.5] Parallax angle diagram. http://www.thunderbolts.info/eg_draft/images/parallax_566x304.jpg (Accessed 25.02.2015)
[3.6] Songlin Piao, "Adaptive Particle Filter based on the Kurtosis of Distribution," Master's thesis, Hanyang University Graduate School, February 2011.
[3.7] Pupilli, M.; Calway, A., "Real-Time Camera Tracking Using Known 3D Models and a Particle Filter," Pattern Recognition (ICPR), 2006 18th International Conference on, vol. 1, pp. 199-203, 2006.

CREDITS

Bootstrap based theme from http://startbootstrap.com/