Lior Wolf and Noga Levy The SVM-minus Similarity Score for Video Face Recognition 24.06.2013 Makarand Tapaswi CVPR Reading Group @ VGG 1 Same / Not Same ? 2 One liner “How similar is the face in one video sequence to the other, where the similarity is uncorrelated with pose-induced similarity” • illumination, expression, image quality, pose • classifier should – discriminate positive/negative AND – uncorrelate w.r.to additional feature set 3 Basic Notation • π1 • π2 • π΅: the background (negative) set 5 Matched Background Similarity same person B1 X1 B X2 B2 8 MBGS different persons B1 B2 X1 X2 B 9 MBGS 10 SVM-minus Classifier • Inputs – Training set {π₯π } – Privileged info. π₯π′ – Labels π¦π + + + + + - - - - - 11 SVM-minus classifier • π′ = train π ′ , π¦ • π = test(π ′ , π′) signed dist. to hyperplane • Un-correlate π with π€ π π 12 SVM-minus Classifier (2) • Split – π into ππ (π¦π = +1) and ππ (π¦π = −1) – π into ππ and ππ – necessary since classifiers are correlated • Normalization – ππ and ππ each feature-dimension to 0 mean – ππ and ππ to mean 0, and π ππ = π ππ = 1 13 SVM– loss function • Pearson correlation coefficient P π€ π ππ , ππ π€ π ππ ππ = π(π€ π ππ ) • Convexity: ignore denom. and square num. 14 Reduce to standard SVM Can be reduced to standard SVM in the dual form. In the dual πΌ, and πΌπ¦ signed by π¦ 15 Projection Matrix • ππ ≥ 0, ππ ≥ 0 thus π΄ is positive definite • π΄−1 is p.d., Cholesky decomposition π΄−1 = πΏπΏπ 16 SVM-minus Similarity same person Pan angle • Cancel influence from pose +ve scoring poses need not be same person –ve scoring poses need not be different person One-Side SVM-minus Similarity 18 SVM-minus Similarity • Use one-side SVM-minus for online tasks 19 YouTube Faces DB • DB from [36] • Video LFW – – – – 3,425 videos; 1,595 people ~2.15 videos / person min-duration: 48 frames Average-clip length: 181.3 frames • Evaluation – – – – 5000 pairs 10 fold cross-validation 250+, 250– Person exclusive splits (person appears only in one split) 20 Experimental info • Detect face, expand bbox, align, resize 100x100 • Extract features – LBP – Center-Symm. LBP – Four-Patch LBP • 3D head orientation (π′) from face.com API* • πΆ = 250 • ππ = ππ = 1 21 MBGS Results from [36] Lior Wolf, Tal Hassner and Itay Maoz. Face Recognition in Unconstrained Videos with Matched Background Similarity. CVPR 2011. 22 This paper results Results where SVM– did most better than MBGS 23 Results • MBGS > SVM– at Accuracy • but, MBGS + SVM– wins • Combination done by stacking – learning yet another SVM for the 2D scores 24 Is it really useful? • • • • Combined score “statistically significant” for [FP]LBP Use entire background set, AUC: 83.6% to 79.9% Online applications (one-side), AUC: 83.6% to 81.9% Correlations: – Within method higher, different scores – Across methods, highest for same feature (as expected) 25 Conclusion • SVM– : unlearn using additional features • MBGS : be choosy about the negative set • 3D Pose : a good “privileged” information source • They don’t talk about pose estimation accuracy • Different types of privileged info that might work • Metric learning (and relatives) not compared Thank You! 26 Some more results from other sources YouTube Faces DB Ref Method Accuracy ± SE AUC EER [1] MBGS L2 mean, LBP 76.4 ± 1.8 82.6 25.3 [2] MBGS+SVM- 78.9 ±1.9 86.9 21.2 [3] APEM-FUSION 79.1 ±1.5 86.6 21.4 [4] STFRD+PMML 79.5 ±2.5 88.6 19.9 References: [1] Lior Wolf, Tal Hassner and Itay Maoz. Face Recognition in Unconstrained Videos with Matched Background Similarity. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2011. [2] Lior Wolf and Noga Levy. The SVM-minus Similarity Score for Video Face Recognition. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2013. [3] Haoxiang Li, Gang Hua, Zhe Lin, Jonathan Brandt, Jianchao Yang. Probabilistic Elastic Matching for Pose Variant Face Verification. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2013. [4] Zhen Cui, Wen Li, Dong Xu, Shiguang Shan and Xilin Chen. Fusing Robust Face Region Descriptors via Multiple Metric Learning for Face Recognition in the Wild. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2013. 27