Mid Project Presentation

SCALABLE IMAGE MATCHING
MID PROJECT REPORT
David Strickland
ENGN 256
Spring 2013
REFERENCE PAPER: INVERTED INDEX COMPRESSION FOR SCALABLE IMAGE MATCHING
 Chen, D.M.; Tsai, S.S.; Chandrasekhar, V.; Takacs, G.;
Vedantham, R.; Grzeszczuk, R.; Girod, B., "Inverted Index
Compression for Scalable Image Matching," Data Compression
Conference (DCC), 2010, p. 525, 24-26 March 2010
VOCABULARY TREE + INVERTED INDEX
 The Vocabulary Tree is a tree-structured vector quantizer
constructed by hierarchical k-means clustering of feature
descriptors.
 Inverted Index: Each node has two lists
 Image IDs
 Array of counts
[Image from Chen et al.]
PROCESS RECAP
1. Detect Features
2. Extract Feature Locations and Descriptors
3. Quantize Descriptors into a Vocabulary Tree
4. Score Database Images using Inverted Index
5. Pairwise Match on top scoring Images
SCHEDULE I: VT/II IMPLEMENTATION
 Week 1: Research Vocabulary Tree / Inverted Index,
Determine which libraries to use
 Week 2: Implement Feature Locator/Descriptors
 Week 3: Implement Quantization of Descriptors in VT
 Week 4: Implement Database scoring scheme using Inverted
Index
 Week 5: Milestone: Mid Project Presentation, Combine
Previous parts, Pairwise Match to retrieve a single image
LIBRARY CHOICES
 VLFeat
 Includes hierarchical integer k-means methods
 VLFeat is available for MATLAB or C
 Also includes SIFT and dense SIFT feature detection
 FreeImage
 C++ library that handles image input/output
 OpenSURF
 C++ SURF implementation
 OpenCV
 Required by OpenSURF
 Dirent.h
 Provides the POSIX directory-reading interface for Windows C++, useful for file I/O
LANGUAGE CHOICE: MATLAB OR C++
 MATLAB
+ I/O is simple
+ Integration is easy
+ VLFeat tutorials are all for the MATLAB bindings
+ Data is easy to handle, array manipulation is simple
- Proprietary, would require MATLAB to use/modify
- Not as fast as C++
 C++
+ No license required
+ Extremely fast, as you control everything
- Integration is difficult (different data structure schemes)
- You have to control everything
 Language Choice: C++
 Builds character
FEATURE DETECTION
 SURF
 C++: OpenSURF
 Requires OpenCV, which handles the image I/O
 SURF features used in the paper (Chen et al.)
 SIFT
 Dense SIFT feature detection
 Included in VLFeat library
 Same number of features for every image, but that number is large
 Analysis/comparison of DT+II results when using SURF features
vs. DSIFT features would provide new and useful information
PAIRWISE IMAGE MATCHING
 Simple nearest-neighbor algorithms can be applied to the
features of the two images to calculate the number of
matching features
 Whichever image has the highest fraction of matches is the
best match
 Methods for this are included in some of the libraries
 The Dictionary Tree + Inverted Index scoring method will
produce the similarity scores for all the database images
 The highest scoring images can then be scored using pairwise
matching to find the best image match
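The pairwise step above can be sketched as a brute-force nearest-neighbor search. This is only an illustration under stated assumptions: the names `Descriptor`, `squaredDistance` and `matchFraction` are hypothetical, and the ratio test used to accept a match (default 0.8) is an assumed acceptance rule, not something specified in these slides or taken from a particular library.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Descriptor = std::vector<float>;  // hypothetical descriptor type

// Squared Euclidean distance between two descriptors of equal length.
float squaredDistance(const Descriptor& a, const Descriptor& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Fraction of query descriptors whose nearest neighbor in `candidate`
// is clearly closer than the second nearest (assumed ratio test).
double matchFraction(const std::vector<Descriptor>& query,
                     const std::vector<Descriptor>& candidate,
                     float ratio = 0.8f) {
    if (query.empty() || candidate.size() < 2) return 0.0;
    std::size_t matches = 0;
    for (const Descriptor& q : query) {
        float best = std::numeric_limits<float>::max();
        float second = best;
        for (const Descriptor& c : candidate) {
            const float d = squaredDistance(q, c);
            if (d < best) {
                second = best;
                best = d;
            } else if (d < second) {
                second = d;
            }
        }
        // Distances are squared, so the ratio is squared as well.
        if (best < ratio * ratio * second) ++matches;
    }
    return static_cast<double>(matches) / static_cast<double>(query.size());
}
```

Calling `matchFraction` once per top-scoring database image and keeping the image with the largest return value gives the "highest fraction of matches" rule. The search is quadratic in the number of features, which is why it is applied only to the short list produced by the tree scoring.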
DICTIONARY TREE CREATION
 HIKM (hierarchical integer k-means) clustering is done via the
VLFeat library
 All the feature descriptors are used to create hierarchical clusters
 Creates the dictionary tree, each node is a cluster center
[Image from http://www.vlfeat.org/overview/hikm.html]
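Once the tree exists, a descriptor is quantized by descending from the root and moving to the child with the nearest cluster center at each level. The sketch below illustrates that descent on a flat node array. `TreeNode` and `quantize` are hypothetical names, and this is a standalone illustration rather than VLFeat's HIKM API, which stores its tree in its own format.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical node layout: a cluster center plus child indices.
struct TreeNode {
    std::vector<float> center;          // cluster center in descriptor space
    std::vector<std::size_t> children;  // indices into the node array; empty for a leaf
};

// Squared Euclidean distance between two vectors of equal length.
float squaredDistance(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Descend from `root` to a leaf, choosing the nearest child at each level.
// Returns the index of the leaf node that was reached.
std::size_t quantize(const std::vector<TreeNode>& nodes,
                     std::size_t root,
                     const std::vector<float>& descriptor) {
    assert(root < nodes.size());
    std::size_t current = root;
    while (!nodes[current].children.empty()) {
        float best = std::numeric_limits<float>::max();
        std::size_t bestChild = nodes[current].children.front();
        for (std::size_t child : nodes[current].children) {
            const float d = squaredDistance(nodes[child].center, descriptor);
            if (d < best) {
                best = d;
                bestChild = child;
            }
        }
        current = bestChild;
    }
    return current;
}
```

With branching factor k and depth d, each descriptor costs k·d distance computations instead of one comparison against every leaf, which is the point of the hierarchical structure.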
INVERTED INDEX CREATION
 Inverted Index
 The descriptors of each feature of each image traverse the tree to
build the inverted index
 Each leaf node visited adds to the associated image’s array of counts
 Each leaf node has its own inverted list (image IDs and counts)
[Image from Chen et al.]
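A minimal sketch of the index construction described above, assuming the descriptors of every image have already been quantized to leaf indices. `InvertedList` and `buildInvertedIndex` are hypothetical names, and storing the two lists as parallel arrays is an assumption about layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-leaf inverted list: two parallel arrays.
struct InvertedList {
    std::vector<std::uint32_t> imageIds;  // ascending image IDs
    std::vector<std::uint32_t> counts;    // counts[j] = visits by imageIds[j]
};

// leafIdsPerImage[i] holds the leaf reached by each descriptor of image i.
std::vector<InvertedList> buildInvertedIndex(
        std::size_t numLeaves,
        const std::vector<std::vector<std::size_t>>& leafIdsPerImage) {
    std::vector<InvertedList> index(numLeaves);
    for (std::size_t image = 0; image < leafIdsPerImage.size(); ++image) {
        const auto id = static_cast<std::uint32_t>(image);
        for (std::size_t leaf : leafIdsPerImage[image]) {
            assert(leaf < numLeaves);
            InvertedList& list = index[leaf];
            // Images are processed in ascending order, so the current image
            // can only be the last entry of the list, if it is there at all.
            if (!list.imageIds.empty() && list.imageIds.back() == id) {
                ++list.counts.back();
            } else {
                list.imageIds.push_back(id);
                list.counts.push_back(1);
            }
        }
    }
    return index;
}
```

Because images are added in ascending ID order, every list ends up sorted without an explicit sort, which is the property that difference coding of the IDs relies on.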
SIMILARITY SCORING & MEMORY USAGE
 When matching an image, each of the N images in the database
is given a similarity score
 For each node visited by the query descriptors, every image in
the node’s inverted list has its score incremented
[Score update equation and symbol definitions: see Chen et al.]
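The traversal pattern of the scoring step can be sketched as below. This is a deliberately simplified, unweighted vote (database count times query visit count). Any node weighting or score normalization used in Chen et al. is omitted here, so the numbers it produces are not the paper's scores. `Posting`, `InvertedIndex` and `scoreDatabase` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical entry of an inverted list: one image and its visit count.
struct Posting {
    std::uint32_t imageId;
    std::uint32_t count;
};
using InvertedIndex = std::vector<std::vector<Posting>>;  // one list per node

// queryVisits[k] = number of query descriptors that reached node k.
// Returns one accumulated score per database image (simplified vote).
std::vector<double> scoreDatabase(const InvertedIndex& index,
                                  const std::vector<std::uint32_t>& queryVisits,
                                  std::size_t numImages) {
    assert(queryVisits.size() == index.size());
    std::vector<double> scores(numImages, 0.0);
    for (std::size_t node = 0; node < index.size(); ++node) {
        if (queryVisits[node] == 0) continue;  // only visited nodes are touched
        for (const Posting& p : index[node]) {
            assert(p.imageId < numImages);
            scores[p.imageId] += static_cast<double>(p.count) * queryVisits[node];
        }
    }
    return scores;
}
```

Only the lists of nodes reached by the query are read, so scoring cost depends on the lengths of those lists rather than on the total number of database images.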
CURRENT PROGRESS
 Implemented:
 Image I/O, SURF feature detection
 Read/Write SURF features to file
 HIKM (Dictionary Tree) Creation
 Inverted Index Creation
 Image Scoring Methods
 Pairwise matching for SURF method
 Combined everything together for Image Matching via DTII
 To Do:
 Read/Write HIKM tree to file
 Read/Write Inverted Index to file
 Compression
 Add additional feature types (e.g. dsift) + analysis
CHALLENGES
 File I/O needed to be handled manually in C++
 Different libraries format data differently
 Very little documentation available for some libraries
 Dictionary Tree & Inverted Index creation times are large
 Finding SURF features for every image is time consuming
 DT + II is very large, memory management is important
 Saving information (variables etc.) to disk is non-trivial
SCHEDULE II: COMPRESSION & ANALYSIS
 Week 6: Inverted Index Image ID storage
 Week 7: DSIFT feature version of the DT+II
 Week 8: Soft Binned Tree, Analysis
 Week 9: Final Project Presentation
INVERTED INDEX COMPRESSION
 Encode each inverted index’s Image IDs by consecutive
differences
 Inverted index compression techniques can reduce memory usage
by up to 5x without any loss in recognition accuracy
 Reorder database to minimize differences
 Minimize: [objective function: see Chen et al.]
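The difference coding of image IDs can be sketched as follows, assuming each list is sorted in ascending order. `encodeDifferences` and `decodeDifferences` are hypothetical names. The sketch covers only the difference step. The entropy or variable-length code that would turn the small gaps into fewer bits is not shown.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Replace sorted image IDs by the gaps between neighbors.
// The first gap is measured from 0, so it equals the first ID.
std::vector<std::uint32_t> encodeDifferences(const std::vector<std::uint32_t>& ids) {
    std::vector<std::uint32_t> gaps;
    gaps.reserve(ids.size());
    std::uint32_t previous = 0;
    for (std::uint32_t id : ids) {
        assert(id >= previous);  // input must be sorted ascending
        gaps.push_back(id - previous);
        previous = id;
    }
    return gaps;
}

// Inverse of encodeDifferences: a running sum restores the original IDs.
std::vector<std::uint32_t> decodeDifferences(const std::vector<std::uint32_t>& gaps) {
    std::vector<std::uint32_t> ids;
    ids.reserve(gaps.size());
    std::uint32_t running = 0;
    for (std::uint32_t gap : gaps) {
        running += gap;
        ids.push_back(running);
    }
    return ids;
}
```

For the list 3, 7, 8, 20 the gaps are 3, 4, 1, 12. Reordering the database so that images visiting the same nodes receive nearby IDs makes the gaps smaller still, which is what the reordering bullet is aiming at.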
SOFT-BINNED FEATURE DESCRIPTOR HISTOGRAMS
 Assign each feature descriptor to its k nearest tree nodes
instead of only the nearest tree node
 Soft-binned tree improves classification accuracy
 Disadvantage:
 Each database feature now appears in k different inverted lists
 Results in larger lists
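A minimal sketch of the soft assignment, assuming the candidate node centers are available as a flat list and that "nearest" means smallest Euclidean distance. `kNearestNodes` is a hypothetical name, and how candidate nodes are selected within the tree is not specified in these slides.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

using Descriptor = std::vector<float>;  // hypothetical descriptor type

// Squared Euclidean distance between two descriptors of equal length.
float squaredDistance(const Descriptor& a, const Descriptor& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Indices of the k centers closest to `descriptor`, nearest first.
// If k exceeds the number of centers, all centers are returned.
std::vector<std::size_t> kNearestNodes(const std::vector<Descriptor>& centers,
                                       const Descriptor& descriptor,
                                       std::size_t k) {
    k = std::min(k, centers.size());
    std::vector<float> dist(centers.size());
    for (std::size_t i = 0; i < centers.size(); ++i) {
        dist[i] = squaredDistance(centers[i], descriptor);
    }
    std::vector<std::size_t> order(centers.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    // Only the first k positions need to be sorted.
    std::partial_sort(order.begin(),
                      order.begin() + static_cast<std::ptrdiff_t>(k),
                      order.end(),
                      [&dist](std::size_t a, std::size_t b) { return dist[a] < dist[b]; });
    order.resize(k);
    return order;
}
```

With k = 1 this reduces to hard assignment. For larger k, each returned node's inverted list receives the image, which is the list growth noted under "Disadvantage".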
REFERENCES
 1 David M. Chen, Sam S. Tsai, Vijay Chandrasekhar, Gabriel
Takacs, Ramakrishna Vedantham, Radek Grzeszczuk, Bernd
Girod, “Inverted Index Compression for Scalable Image
Matching,” Data Compression Conference (DCC), 2010