Group Sparse Coding

Samy Bengio, Fernando Pereira, Yoram Singer, Dennis Strelow
Google, Mountain View, CA (NIPS 2009)
Presented by Miao Liu, July 23, 2010
*Figures and formulae are copied directly from the original paper

Outline
• Introduction
• Group Coding
• Dictionary Learning
• Results and Discussion

Introduction
• Bag-of-words document representations
  – Encode a document as a vector of descriptor (word) counts
  – Widely used in text, image, and video processing
• For text documents, a suitable word dictionary is easy to determine.
• For images and videos
  – There is no simple mapping from the raw document to descriptor counts
  – Visual descriptors (color, texture, angles, and shapes) must be extracted
  – Descriptors must be measured at appropriate locations (regular grids, special interest points, multiple scales)
  – The dictionary needs more careful design

Dictionary Construction
• Unsupervised vector quantization (VQ), often k-means clustering
  – Pro: maximally sparse per descriptor occurrence
  – Cons:
    • Does not guarantee a sparse coding of the whole image
    • Not robust to descriptor variability
• Regularized optimization
  – Encode each visual descriptor as a weighted sum of dictionary elements
• Mixed-norm regularizers
  – Take into account the structure of bags of visual descriptors in images
  – Can represent sets of images from a given category

Problem Statement
• Main goal: encode groups of instances (e.g., image patches) in terms of dictionary code words (a kind of average patch)
• Notation [symbols not preserved]
  – The m'th group of instances; the subscript m is dropped when operating on a single group.
• Sub-goals
  – Encoding a group in terms of a given dictionary
  – Learning a good dictionary from a set of training groups

Group Coding
• Given a group and a dictionary, group coding is achieved by solving a regularized reconstruction problem [formula not preserved], in which
  – a trade-off parameter balances fidelity and reconstruction complexity.
• Coordinate descent is applied to solve the above problem.
• Finally, the reconstruction weights are compressed into a single vector by taking the p-norm of the weight vector associated with each dictionary word.
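The coordinate-descent encoding step and the final p-norm compression can be sketched in NumPy. This is only an illustrative sketch, not the authors' implementation: it assumes the objective has the usual mixed-norm form for p = 2, namely half the squared reconstruction error summed over the group plus lam times the sum of the 2-norms of each dictionary word's weight vector. The names `group_code`, `objective`, `compress`, `lam`, and `n_sweeps` are hypothetical, and the update used is the standard group soft-thresholding step for that assumed objective.

```python
import numpy as np

def objective(X, D, A, lam):
    """Assumed objective: 0.5 * ||X - A.T @ D||_F^2 + lam * sum_j ||A[j]||_2."""
    resid = X - A.T @ D
    return 0.5 * np.sum(resid ** 2) + lam * np.sum(np.linalg.norm(A, axis=1))

def group_code(X, D, lam, n_sweeps=50):
    """Block coordinate descent over dictionary words (assumed p = 2 case).

    X : (n, dim) array, one row per instance in the group
    D : (k, dim) array, one row per dictionary word
    Returns A : (k, n) array; A[j] holds word j's weights across the group.
    """
    X = np.array(X, dtype=float)
    D = np.array(D, dtype=float)
    n, k = X.shape[0], D.shape[0]
    A = np.zeros((k, n))
    R = X.copy()                      # residual X - A.T @ D (A starts at zero)
    sq_norms = np.sum(D * D, axis=1)  # ||d_j||^2 for every word
    for _ in range(n_sweeps):
        for j in range(k):
            if sq_norms[j] == 0.0:
                continue              # a zero word cannot reduce the error
            R += np.outer(A[j], D[j])     # remove word j from the fit
            mu = R @ D[j]                 # correlation of residual with word j
            mu_norm = np.linalg.norm(mu)
            if mu_norm <= lam:
                A[j] = 0.0                # whole word switched off for the group
            else:
                A[j] = (1.0 - lam / mu_norm) * mu / sq_norms[j]
            R -= np.outer(A[j], D[j])     # put the updated word back
    return A

def compress(A, p=2):
    """One number per dictionary word: the p-norm of its weight vector."""
    return np.linalg.norm(A, ord=p, axis=1)
```

Under these assumptions, a word whose correlation with the residual is weak across the whole group is zeroed for every instance at once, which is what makes the resulting code sparse at the group level and not only per descriptor.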
Group Coding
• Define [formula not preserved]
• Optimum for p = 1 [formula not preserved]
• Optimum for p = 2 [formula not preserved]

Dictionary Learning
• A good dictionary should balance
  – Reconstruction error
  – Reconstruction complexity
  – Overall complexity relative to the given training set
• Seek a learning method that facilitates both
  – induction of new dictionary words
  – removal of dictionary words that have low predictive power
• Applying [formula not preserved]
• Let [formula not preserved]
• Objective [formula not preserved]

Dictionary Learning
• In this paper, p = 2
• Define auxiliary variables [formula not preserved]
• Define a vector that appears in the gradient of the objective function [formula not preserved]
• By an argument similar to the one used for group coding, one can obtain [formula not preserved]

Experimental Setting
• Compare with previous sparse coding methods by measuring the impact on classification on the PASCAL VOC (Visual Object Classes) 2007 dataset
  – Images from 20 classes, including people, animals, vehicles, and indoor objects
  – Around 2,500 images each for training and validation; 5,000 images for testing
• Extract local descriptors based on Gabor wavelet responses at
  – Four orientations
  – Spatial scales and offsets (27 combinations)
• The 27 (scale, offset) pairs were chosen by optimizing a previous image recognition task, unrelated to this paper.

Results and Discussion
