Noise Resistant Graph Ranking for Improved Web Image Search Wei Liu (Columbia), Yu-Gang Jiang (Columbia), Jiebo Luo (Kodak), and Shih-Fu Chang (Columbia) http://www.ee.columbia.edu/ln/dvmm/ Goal Spectral Filter = Progressive Sparse Eigenbase Fitting Propose a novel ranking framework that can handle query samples with noisy relevance labels. Apply to web image search re-ranking. The original highest ranked images are treated as pseudo queries, which are filtered (denoised) and used to re-rank a much larger result set. Graph Ranker = Personalized PageRank or Manifold Rank Spectral filter outputs the denoised label vector . Then the graph ranker (on a graph G(X,E,W)) gives the final ranking score vector f true positive sample outlier Personalized PageRank (Jing et al. PAMI’08) Manifold Rank (Zhou et al. NIPS’04) noisy query sample set Results on Fergus dataset: ranking precision at 0.15 recall of 7 object categories: airplane, cars (rear), face, guitar, leopard, motorbike, and wrist watch using top-50 ranked images as pseudo queries. Framework spectral filter graph ranker high density region pseudo visual query set search result set The multi-region assumption is that the true positive samples in the noisy query set yq (q up to 100) reside in multiple high-density regions (clusters). Find these regions from the low-frequency spectrum of the graph Laplacian of a k-NN graph built on the working set (n up to 1k) containing the images to be reranked. Each smooth eigenvector discloses a high-density region. Compute m lowest eigenvalues and corresponding eigenvectors . Given the noisy label vector defined on q(<<n) query samples, we formulate a sparse optimization problem Sparse Eigenbase Fitting to filter out the outliers: Method Google Mean Precision at 0.15 Recall 0.43 SVM LogReg TSI-pLSA LDA PPR MR (ICCV 07) (CVPR’10) (ICCV’07) (CVPR 10) (ICCV (ICCV’05) 05) (CVPR’08) (CVPR 08) (PAMI’08) (PAMI 08) (NIPS’04) (NIPS 04) 0.54 0.56 Query Set Precision 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 02 0.2 0.1 0 Google top-50 LabelDiag SpecFilter (dense eigenbases) SpecFilter (sparse eigenbases) image graph construction spectral decomposition … graph ranker spectral filter smooth eigenbasis filtered visual query set L1 norm Progressive fitting via alternating between a and yq (j=0,1,2…): Projected Gradient Descent true positive queries Rounding reranked search result list wliu@ee.columbia.edu outliers 0.69 Fergus airplane: top-50 images from Google. SF improves query set precision from 0.5 to 1. SF+MR improves re-ranking precision of MR f from 0.39 0 39 tto 0 0.86 86 att 0 0.15 15 recall. ll 0.91 0.54 0.56 SF+ PPR SF+ MR 0.76 0.80 Results on INRIA dataset: mean average precision (MAP) of ranking over 353 text queries with visual query sets of different sizes. MAP Top-20 Images Top-50 Images Top-100 Images 0.5699 Search Engine 0.6490 LogReg (v) 0.6730 LogReg (t+v) PPR (v) 0.6931 0.6909 MR (v) 0.6980 0.6946 0.6886 0.6904 SF+PPR (v) 0.7117 0.7135 0.7135 SF+MR (v) 0.7275 0.7358 0.7376 Fergus cars_rear: top-50 images from Google. SF improves query set precision from 0.48 to 1. SF+MR improves re-ranking precision of MR f from 0.53 0 53 tto 1 att 0 0.15 15 recall. ll