• Intro
• Related work
• Approach
• Program results
• Future work

Goals
› To aid in my research for my thesis, “Markerless Indoor Localization for the Mobile Environment”
› Gain a better understanding of feature descriptors and image matching
• Challenges of image matching in the indoor environment:
› Local Image Texture Features (LITFs) are sparse and tend to be clustered
› As LITFs become sparse, they become unreliable
• Image matching has three related areas of work that need to be considered:
› Detecting image features
› Matching feature points
› Image matching



• Image features are patches in an image that can be found consistently
• LITF-based features are the most commonly used image features:
› Scale Invariant Feature Transform (SIFT) [1]
› Speeded Up Robust Features (SURF) [2]
› Oriented FAST and Rotated BRIEF (ORB) [3]
• Line segment features
› Typically calculated by the Hough transform
› Used to find line intersections or vanishing points in images
[1] Lowe, David G. "Distinctive Image Features from Scale-Invariant Keypoints." International Journal of Computer Vision, 2004.
[2] Bay, Herbert, Tinne Tuytelaars, and Luc Van Gool. "SURF: Speeded Up Robust Features." Computer Vision–ECCV 2006. Springer Berlin Heidelberg, 2006. 404-417.
[3] Rublee, Ethan, et al. "ORB: An efficient alternative to SIFT or SURF." Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011.
• SIFT and SURF are the most commonly used LITFs
• Both are robust to noise, scale and intensity variations, some affine deformations, and occlusions
• SIFT:
› Builds a 4x4 grid of orientation histograms around the key point, giving a 128-dimensional descriptor
› Has been modified in many ways; one of the most commonly used variants is PCA-SIFT [1]
• SURF:
› Is faster to compute than SIFT [2,3]
• Both SIFT and SURF, however, are computationally expensive, and their descriptors require a large amount of memory to store

[1] Ke, Yan, and Rahul Sukthankar. "PCA-SIFT: A more distinctive representation for local image descriptors." Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Vol. 1, pp. 511-517, 2004.
[2] Bay, Herbert, Tinne Tuytelaars, and Luc Van Gool. "SURF: Speeded Up Robust Features." Computer Vision–ECCV 2006. Springer Berlin Heidelberg, 2006. 404-417.
[3] Heinly, Jared, Enrique Dunn, and Jan-Michael Frahm. "Comparative evaluation of binary features." Computer Vision–ECCV 2012. Springer Berlin Heidelberg, 2012. 759-773.
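To make the memory point concrete, here is a minimal OpenCV sketch (assuming OpenCV 4.4+, where SIFT ships in the main module; the image path is a placeholder, not a file from the project) that extracts SIFT key points and prints how many bytes each 128-dimensional float descriptor occupies:

    #include <opencv2/opencv.hpp>
    #include <iostream>
    #include <vector>

    int main() {
        // Placeholder path; any indoor test image will do.
        cv::Mat img = cv::imread("query.jpg", cv::IMREAD_GRAYSCALE);
        if (img.empty()) return 1;

        // SIFT: 4x4 grid of 8-bin orientation histograms -> 128 floats per key point.
        cv::Ptr<cv::SIFT> sift = cv::SIFT::create();
        std::vector<cv::KeyPoint> keypoints;
        cv::Mat descriptors;
        sift->detectAndCompute(img, cv::noArray(), keypoints, descriptors);

        std::cout << "SIFT keypoints: " << keypoints.size()
                  << ", bytes per descriptor: "
                  << descriptors.cols * descriptors.elemSize1()  // 128 * 4 = 512
                  << std::endl;
        return 0;
    }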





• ORB is computationally less expensive than SIFT and SURF by one to two orders of magnitude [1,2]
• Its descriptors require less space than SIFT and SURF descriptors [2]
• It is robust to most of the image deformations that SIFT and SURF are [1,2]
• It has been run on a 7 Hz video stream on a mobile phone (1 GHz ARM processor, 512 MB RAM) [1]
• This makes it a promising LITF descriptor for my thesis
[1] Rublee, Ethan, et al. "ORB: An efficient alternative to SIFT or SURF." Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011.
[2] Heinly, Jared, Enrique Dunn, and Jan-Michael Frahm. "Comparative evaluation of binary features." Computer Vision–ECCV 2012. Springer Berlin Heidelberg, 2012. 759-773.
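A minimal sketch of ORB extraction with OpenCV; the cap of 500 key points and the file name are illustrative choices, not values taken from the prototype:

    #include <opencv2/opencv.hpp>
    #include <iostream>
    #include <vector>

    int main() {
        // Placeholder path for an indoor frame.
        cv::Mat img = cv::imread("frame.jpg", cv::IMREAD_GRAYSCALE);
        if (img.empty()) return 1;

        // ORB = oriented FAST keypoints + rotation-aware BRIEF binary descriptors.
        cv::Ptr<cv::ORB> orb = cv::ORB::create(500);   // keep at most 500 key points
        std::vector<cv::KeyPoint> keypoints;
        cv::Mat descriptors;                           // CV_8U, 32 bytes per key point
        orb->detectAndCompute(img, cv::noArray(), keypoints, descriptors);

        std::cout << "ORB keypoints: " << keypoints.size()
                  << ", bytes per descriptor: " << descriptors.cols << std::endl;
        return 0;
    }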


• Line segments are typically extracted by the Hough transform [1,2]
• They are useful in many ways:
› Extracting planes in the image
› Determining where lines in the image intersect
  › Line segment intersections are good features to match to the database
  › They can be used to determine the cross ratio of 4 collinear points
› Determining the vanishing point in the image, which helps determine camera pose
› They can typically still be found in environments where LITFs are sparse
[1] Ballard, Dana H. "Generalizing the Hough transform to detect arbitrary shapes." Pattern Recognition 13.2 (1981): 111-122.
[2] Hough, Paul V. C. "Method and means for recognizing complex patterns." U.S. Patent No. 3,069,654. 18 Dec. 1962.
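A minimal sketch of line segment extraction with OpenCV's probabilistic Hough transform; the Canny thresholds and Hough parameters are illustrative, not tuned values:

    #include <opencv2/opencv.hpp>
    #include <iostream>
    #include <vector>

    int main() {
        // Placeholder path; hallway images give strong straight edges.
        cv::Mat img = cv::imread("hallway.jpg", cv::IMREAD_GRAYSCALE);
        if (img.empty()) return 1;

        // Edge map first, then the probabilistic Hough transform for segments.
        cv::Mat edges;
        cv::Canny(img, edges, 50, 150);

        std::vector<cv::Vec4i> segments;   // each entry: x1, y1, x2, y2
        cv::HoughLinesP(edges, segments, 1, CV_PI / 180,
                        80 /*votes*/, 30 /*min segment length*/, 10 /*max gap*/);

        std::cout << "Line segments found: " << segments.size() << std::endl;
        return 0;
    }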

• Feature matching is an important step in image matching
• It determines the best-matching features between a query image and a database of images
• Two types of feature matching were explored:
› Brute Force matching
› Locality Sensitive Hashing (LSH) [1]
[1] Datar, M., N. Immorlica, P. Indyk, and V. S. Mirrokni. "Locality-sensitive hashing scheme based on p-stable distributions." Proceedings of the 20th Annual Symposium on Computational Geometry, pp. 253-262, 2004.
• Brute Force matching is the most accurate form of feature matching
• It exhaustively searches the database to find the best matches for the query key points
• It has a major flaw: it becomes prohibitively expensive as the database grows
• It is only suitable for small databases (500-10,000 feature points)
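A minimal sketch of brute-force matching with OpenCV, assuming binary (e.g. ORB) descriptors so that Hamming distance is the right metric; the cross-check option is an illustrative filtering choice, not necessarily what the prototype does:

    #include <opencv2/opencv.hpp>
    #include <vector>

    // Exhaustively match every query descriptor against every db descriptor.
    std::vector<cv::DMatch> bruteForceMatch(const cv::Mat& queryDesc,
                                            const cv::Mat& dbDesc) {
        // Hamming distance for binary descriptors; cross-check keeps only
        // matches that are mutually best in both directions.
        cv::BFMatcher matcher(cv::NORM_HAMMING, /*crossCheck=*/true);
        std::vector<cv::DMatch> matches;
        matcher.match(queryDesc, dbDesc, matches);
        return matches;
    }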

• LSH is a k-Nearest-Neighbor (kNN) approximation
• It hashes key points in a way that preserves locality [1]
• Since locality is preserved, the distance between hashes approximates the distance between key points [1]
• It is very efficient for matching images against a large database
• However, it is only better than Brute Force matching when the database becomes large

[1] Datar, M., N. Immorlica, P. Indyk, and V. S. Mirrokni. "Locality-sensitive hashing scheme based on p-stable distributions." Proceedings of the 20th Annual Symposium on Computational Geometry, pp. 253-262, 2004.
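A minimal sketch of approximate kNN matching with OpenCV's FLANN LSH index, again assuming binary descriptors; the table, key-size, and multi-probe parameters are common defaults, not tuned values:

    #include <opencv2/opencv.hpp>
    #include <vector>

    // Return the k nearest db descriptors for each query descriptor using LSH.
    std::vector<std::vector<cv::DMatch>> lshKnnMatch(const cv::Mat& queryDesc,
                                                     const cv::Mat& dbDesc,
                                                     int k = 2) {
        cv::FlannBasedMatcher matcher(
            cv::makePtr<cv::flann::LshIndexParams>(12 /*tables*/,
                                                   20 /*key size (bits)*/,
                                                   2  /*multi-probe level*/));
        std::vector<std::vector<cv::DMatch>> knnMatches;
        matcher.knnMatch(queryDesc, dbDesc, knnMatches, k);
        return knnMatches;   // k candidates per query key point
    }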


• Given a set of feature matches to a database, the best image match needs to be found
• The problem is that there may be feature matches to multiple images in the database
› Feature matchers have the potential to incorrectly match feature points
• The correct match needs to be determined:
› One approach is a visual Bag of Words, a weak matching constraint
› Another is fitting the matched features to a model, a stronger matching constraint


• Fitting matched features to a model is an effective and reliable way to determine image similarity
• There are two common models that are used:
› Fundamental matrix
› Homography
• Models can also be customized to the task at hand
› Hile et al. [1] use the building's floor plan as their model and match the floor plane in the image against it
[1] Hile, Harlan, and Gaetano Borriello. "Positioning and orientation in indoor environments using camera phones." Computer Graphics and Applications, IEEE 28.4 (2008): 32-39.
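A minimal sketch of the fundamental-matrix model fit using OpenCV's RANSAC estimator; the reprojection tolerance and confidence are illustrative values, and the point lists are assumed to come from one of the feature matchers above:

    #include <opencv2/opencv.hpp>
    #include <vector>

    // Fit a fundamental matrix to corresponding points and count RANSAC inliers.
    int countFundamentalInliers(const std::vector<cv::Point2f>& queryPts,
                                const std::vector<cv::Point2f>& dbPts) {
        if (queryPts.size() < 8 || queryPts.size() != dbPts.size())
            return 0;                                  // need >= 8 correspondences

        std::vector<uchar> inlierMask;
        cv::Mat F = cv::findFundamentalMat(queryPts, dbPts, cv::FM_RANSAC,
                                           3.0 /*px tolerance*/, 0.99 /*confidence*/,
                                           inlierMask);
        if (F.empty()) return 0;

        return cv::countNonZero(inlierMask);           // inliers that fit the model
    }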
BF matching
• The query image is matched to the database
• Finding the correct image:
› Images with fewer than k matches are filtered out
› The fundamental matrix between the query image and each remaining db image is found
› The db image with the largest number of inliers i is selected
› If i ≥ t, it is determined to be the correct image (see the selection sketch below)
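A sketch of the selection step just described, reusing the countFundamentalInliers() helper from the earlier model-fitting sketch; the per-image map of point correspondences and the k and t thresholds are placeholders, not the prototype's actual data structures:

    #include <opencv2/opencv.hpp>
    #include <map>
    #include <string>
    #include <utility>
    #include <vector>

    // Defined in the earlier model-fitting sketch.
    int countFundamentalInliers(const std::vector<cv::Point2f>& queryPts,
                                const std::vector<cv::Point2f>& dbPts);

    // Pick the db image with the most RANSAC inliers, subject to the k and t filters.
    std::string selectBestImage(
            const std::map<std::string,
                           std::pair<std::vector<cv::Point2f>,
                                     std::vector<cv::Point2f>>>& matchesPerImage,
            std::size_t k, int t) {
        std::string best;
        int bestInliers = 0;
        for (const auto& entry : matchesPerImage) {
            const auto& pts = entry.second;
            if (pts.first.size() < k) continue;        // too few matches: filter out
            int inliers = countFundamentalInliers(pts.first, pts.second);
            if (inliers > bestInliers) {
                bestInliers = inliers;
                best = entry.first;
            }
        }
        return (bestInliers >= t) ? best : std::string();   // "" means no match
    }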
LSH matching
• The query image is matched to the database
• Finding the correct image:
› First the images are filtered: if a db image has more than one key point matching a key point in the query image, the closest key point has to be 60% closer than the second-closest point (see the ratio-test sketch below)
› All images with fewer than k matches are discarded
› A fundamental matrix is fitted between the query image and each remaining db image
› The image with the largest number of inliers i, with i ≥ t, is determined to be the best match
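A sketch of that filtering step, interpreting the 60% rule as a Lowe-style ratio test over the k = 2 nearest neighbours returned by the LSH matcher; the 0.6 threshold is my reading of the slide, not a value confirmed by the prototype:

    #include <opencv2/opencv.hpp>
    #include <vector>

    // Keep a match only when its distance is at most `ratio` times the distance
    // to the second-best candidate for the same query key point.
    std::vector<cv::DMatch> ratioFilter(
            const std::vector<std::vector<cv::DMatch>>& knnMatches,
            float ratio = 0.6f) {
        std::vector<cv::DMatch> good;
        for (const auto& candidates : knnMatches) {
            if (candidates.size() >= 2 &&
                candidates[0].distance < ratio * candidates[1].distance) {
                good.push_back(candidates[0]);
            }
        }
        return good;   // these matches feed the k-match filter and the F-matrix fit
    }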
• The program was trained with 67 images of the third floor of Brown Building
• Images of 4 doorways were taken from 5-6 perspectives, with 2-3 steps between the perspectives
• Images traveling down the long north-to-south hallway, starting at each end, were taken with 2-3 steps between each image



• So far the prototype has only been written in C++ and has only been run on a laptop, but it should be able to run on a mobile platform
• It currently does not report false positives, but it does report false negatives when the texture becomes sparse

• A notable result occurred when the 51 images of the long north-south hallway were added
› The initial matching phase's accuracy improved noticeably
› This was unexpected because the BF and LSH matching algorithms do not appear to use machine learning
› A possible reason is that the matchers have more data to use, making them more accurate
• Adapt this work to my thesis
• Implement the mobile version of the matching process
• Explore line segment features to see if they would work well in conjunction with LITF descriptors
