SCALABLE IMAGE MATCHING
MID PROJECT REPORT
David Strickland
ENGN 256
Spring 2013

REFERENCE PAPER: INVERTED INDEX COMPRESSION FOR SCALABLE IMAGE MATCHING
Chen, D.M.; Tsai, S.S.; Chandrasekhar, V.; Takacs, G.; Vedantham, R.; Grzeszczuk, R.; Girod, B., "Inverted Index Compression for Scalable Image Matching," Data Compression Conference (DCC), 2010, p. 525, 24-26 March 2010

VOCABULARY TREE + INVERTED INDEX
The Vocabulary Tree is a tree-structured vector quantizer constructed by hierarchical k-means clustering of feature descriptors.
Inverted Index: each node has two lists
    Image IDs
    Array of counts
Image from Chen et al.

PROCESS RECAP
1. Detect Features
2. Extract Feature Locations and Descriptors
3. Quantize Descriptors into a Vocabulary Tree
4. Score Database Images using Inverted Index
5. Pairwise Match on top scoring Images

SCHEDULE I: VT/II IMPLEMENTATION
Week 1: Research Vocabulary Tree / Inverted Index, determine which libraries to use
Week 2: Implement Feature Locator/Descriptors
Week 3: Implement Quantization of Descriptors in VT
Week 4: Implement Database scoring scheme using Inverted Index
Week 5: Milestone: Mid Project Presentation, combine previous parts, Pairwise Match to retrieve a single image

LIBRARY CHOICES
VLFeat
    Includes hierarchical integer k-means methods
    Available for MATLAB or C
    Also includes SIFT and dense SIFT feature detection
FreeImage
    C++ library that handles image input/output
OpenSURF
    C++ SURF implementation
OpenCV
    Required by OpenSURF
Dirent.h
    Provides POSIX bindings for Windows C++, useful for file I/O

LANGUAGE CHOICE: MATLAB OR C++
MATLAB
    + I/O is simple
    + Integration is easy
    + VLFeat tutorials are all for the MATLAB bindings
    + Data is easy to handle, array manipulation is simple
    - Proprietary, would require MATLAB to use/modify
    - Not as fast as C++
C++
    + No license required
    + Extremely fast, as you control everything
    - Integration is difficult (different data structure schemes)
    - You have to control everything
Language Choice: C++
    Builds character

FEATURE DETECTION
SURF
    C++: OpenSURF
    Requires OpenCV, which handles the image I/O
    SURF features are used in the paper (Chen et al.)
SIFT
    Dense SIFT (DSIFT) feature detection
    Included in the VLFeat library
    Same number of features for every image, but a large number of features
Comparing DTII (Dictionary Tree + Inverted Index) results when using SURF features vs DSIFT features would provide new, useful information

PAIRWISE IMAGE MATCHING
Simple Nearest Neighbor algorithms can be applied to the features of the two images to calculate the number of matching features
Whichever image has the highest fraction of matches is the best match
Methods for this are included in some of the libraries
The Dictionary Tree + Inverted Index scoring method will produce the similarity scores for all the database images
The highest scoring images can then be scored using pairwise matching to find the best image match

DICTIONARY TREE CREATION
HIKM: hierarchical k-means clustering is done via VLFeat's library
All the feature descriptors are used to create hierarchical clusters
Creates the dictionary tree, where each node is a cluster center
Image from http://www.vlfeat.org/overview/hikm.html

INVERTED INDEX CREATION
The descriptors of each feature of each image traverse the tree to build the inverted index
Each leaf node visited adds to the associated image's array of counts
Each leaf node has its own inverted index
Image from Chen et al.
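The inverted index construction described above can be sketched in C++ as follows. This is a minimal illustration, not the project's actual code: the names InvertedList, InvertedIndex and build_index are hypothetical, and the sketch assumes each descriptor has already been quantized to a leaf node ID (in the project that step would come from the HIKM tree), so the input is simply a list of leaf IDs per image.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One inverted list per leaf node: two parallel arrays, as on the slide.
struct InvertedList {
    std::vector<std::uint32_t> image_ids;  // images that reached this leaf, ascending
    std::vector<std::uint32_t> counts;     // counts[j] = descriptors of image_ids[j] in this leaf
};

// Hypothetical container holding one inverted list for every leaf of the tree.
class InvertedIndex {
public:
    explicit InvertedIndex(std::size_t num_leaves) : lists_(num_leaves) {}

    // Record that one descriptor of `image_id` was quantized to `leaf`.
    // Assumption: images are added in increasing ID order, so every list
    // stays sorted and only its last entry needs to be checked.
    void add(std::size_t leaf, std::uint32_t image_id) {
        assert(leaf < lists_.size());
        InvertedList& list = lists_[leaf];
        if (!list.image_ids.empty() && list.image_ids.back() == image_id) {
            ++list.counts.back();  // same image visited this leaf again
        } else {
            assert(list.image_ids.empty() || list.image_ids.back() < image_id);
            list.image_ids.push_back(image_id);
            list.counts.push_back(1);
        }
    }

    const InvertedList& list(std::size_t leaf) const {
        assert(leaf < lists_.size());
        return lists_[leaf];
    }

private:
    std::vector<InvertedList> lists_;
};

// Build the index from per-image leaf assignments.
// leaves_per_image[i] holds the leaf ID of every descriptor of image i.
InvertedIndex build_index(
    std::size_t num_leaves,
    const std::vector<std::vector<std::size_t>>& leaves_per_image) {
    InvertedIndex index(num_leaves);
    for (std::uint32_t image_id = 0; image_id < leaves_per_image.size(); ++image_id) {
        for (std::size_t leaf : leaves_per_image[image_id]) {
            index.add(leaf, image_id);
        }
    }
    return index;
}
```

Keeping each list sorted by image ID is a design choice made here because it also suits storing the IDs as consecutive differences later on.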

SIMILARITY SCORING & MEMORY USAGE
When matching an image, each image i_k in the database of N images is given a similarity score
For each node visited by the query descriptors, every image in that node's inverted list has its score incremented:
[equation shown on slide]
Where:
[symbol definitions shown on slide]

CURRENT PROGRESS
Implemented:
    Image I/O, SURF feature detection
    Read/Write SURF features to file
    HIKM (Dictionary Tree) Creation
    Inverted Index Creation
    Image Scoring Methods
    Pairwise matching for SURF method
    Combined everything together for Image Matching via DTII
To Do:
    Read/Write HIKM tree to file
    Read/Write Inverted Index to file
    Compression
    Add additional feature types (e.g. DSIFT) + analysis

CHALLENGES
File I/O needed to be handled manually in C++
Different libraries format data differently
Very little documentation available for some libraries
Dictionary Tree & Inverted Index creation times are large
Finding SURF features for every image is time consuming
DT + II is very large, so memory management is important
Saving information (variables etc.) to disk is non-trivial

SCHEDULE II: COMPRESSION & ANALYSIS
Week 6: Inverted Index Image ID storage
Week 7: DSIFT feature version of the DT+II
Week 8: Soft Binned Tree, Analysis
Week 9: Final Project Presentation

INVERTED INDEX COMPRESSION
Encode each inverted index's Image IDs by consecutive differences
Inverted index compression techniques can reduce memory usage by up to 5X without any loss in recognition accuracy
Reorder database to minimize differences
Minimize:
[equation shown on slide]

SOFT-BINNED FEATURE DESCRIPTOR HISTOGRAMS
Classify a feature descriptor to the k nearest tree nodes instead of just the nearest tree node
Soft-binned tree gives an improvement in classification accuracy
Disadvantage: each database feature now appears in k different inverted lists
    Results in larger lists

REFERENCES
1. David M. Chen, Sam S. Tsai, Vijay Chandrasekhar, Gabriel Takacs, Ramakrishna Vedantham, Radek Grzeszczuk, Bernd Girod, "Inverted Index Compression for Scalable Image Matching," Data Compression Conference (DCC), 2010