A face analysis exemplar: face detection, landmarking and facial expression recognition
Dr. Brais Martinez
Slides can be downloaded from braismartinez.com

Overview
Topics: model-free part-based tracking; part-based facial landmarking; face analysis
Timeline: PhD; end 2010; PostDoc
Research visits to:
• Imperial College London (Maja Pantic), 9/2007-3/2008
• Oregon State University (Sinisa Todorovic), 7/2013-10/2013

Overview
Multi-view face detection
• [Under Review] IVC - J. Orozco, B. Martinez, M. Pantic, “Empirical Analysis of Cascade Deformable Models for Multi-view Face Detection” [IF 2012: 1.96, Q1]
Facial Landmarking
• 2010 CVPR - M. Valstar, B. Martinez, X. Binefa, M. Pantic, “Facial Point Detection using Boosted Regression and Graph Models”
• 2013 TPAMI - B. Martinez, M. Valstar, X. Binefa, M. Pantic, “Local Evidence Aggregation in Regression-based Facial Point Detection”
• [Under Review] CVIU - B. Martinez, M. Pantic, “Facial landmarking for in-the-wild images with local inference based on global appearance”
Facial Action Unit Detection
• 2014 TSMCB - B. Jiang, M. Valstar, B. Martinez, M. Pantic, “A dynamic appearance descriptor approach to facial actions temporal modelling”
• [Under Review] IJCV - B. Jiang, B. Martinez, M. Valstar, M. Pantic, “Automatic Analysis of Facial Actions: A Survey”
• [Under Review] ICPR - B. Jiang, B. Martinez, M. Valstar, M. Pantic, “Decision Level Fusion of Domain Specific Regions for Facial Action Recognition”

Face detection using cascaded DPM

Part-based model
The Deformable Parts Model (DPM):
• Object composed of parts
• Current state-of-the-art model in object detection
• Weakly supervised
• Uses linear SVM (we used 35k+ training images!)
• Very efficient implementations (both training and testing)

$$\mathrm{Score}(p_0, \dots, p_n) = \sum_{i=0}^{n} F_i \cdot \phi(H, p_i) - \sum_{i=1}^{n} d_i \cdot \phi_d(dx_i, dy_i) + b$$
• $p_0$: object location; $p_i$: part locations
• First sum: convolve filter $i$ with the gradient image
• Second sum: penalise deformations

Cascaded DPM
• Non-frontal poses: mixture model
• Scale: multi-scale sliding window
• Speed: cascaded search
[Figure: root filter, part filters, part locations]
Full score:
$$\sum_{i=0}^{n} F_i \cdot \phi(H, p_i) - \sum_{i=1}^{n} d_i \cdot \phi_d(dx_i, dy_i) + b$$
Cascade:
• Score 1 part: $F_0 \cdot \phi(H, p_0) > th_0$ (if not, reject)
• Score 2 parts: $\sum_{i=0}^{1} F_i \cdot \phi(H, p_i) - \sum_{i=1}^{1} d_i \cdot \phi_d(dx_i, dy_i) > th_1$ (if not, reject)

Results: DPM face detection
[Figure: ROC curves (true positive rate vs. false positive rate) on the AFLW dataset for Proposed, Zhu & Ramanan, and Multiview V&J]
Advantages over Zhu & Ramanan:
• Only face bound annotations needed
• Better for lower resolution
• 5 parts instead of 66
• Cascade detection

Overview: Facial Landmarking
• 2010 CVPR - M. Valstar, B. Martinez, X. Binefa, M. Pantic, “Facial Point Detection using Boosted Regression and Graph Models” [81 citations]
• 2013 TPAMI - B. Martinez, M. Valstar, X. Binefa, M. Pantic, “Local Evidence Aggregation in Regression-based Facial Point Detection” [IF 2012: 4.80, Q1]
• [Under Review] CVIU - B. Martinez, M. Pantic, “Facial landmarking for in-the-wild images with local inference based on global appearance” [IF 2012: 1.23, Q3]

Part-based facial landmarking
Classical part-based: construct response maps
• Train: 1 classifier per point (e.g. logistic classifier)
• Test: construct response map (sliding window over ROI)
Do regression!
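The classical train/test recipe above (one classifier per point, evaluated with a sliding window over the ROI) can be sketched as follows. This is a minimal illustration, not the method of any cited paper: the function name `response_map`, the raw-pixel features, and the hand-set weights standing in for a trained logistic classifier are all assumptions.

```python
import numpy as np

def response_map(features, w, b, patch=3):
    """Slide a linear logistic classifier over a 2-D feature image (the ROI).

    features: (H, W) array; w: (patch * patch,) weights; b: bias.
    Returns an (H - patch + 1, W - patch + 1) map of probabilities in (0, 1).
    """
    H, W = features.shape
    out = np.empty((H - patch + 1, W - patch + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            # Flatten the window whose top-left corner is (r, c)
            x = features[r:r + patch, c:c + patch].ravel()
            # Logistic response: probability that the point sits in this window
            out[r, c] = 1.0 / (1.0 + np.exp(-(w @ x + b)))
    return out

# Toy ROI containing one bright 3x3 blob (hypothetical data)
roi = np.zeros((8, 8))
roi[2:5, 4:7] = 1.0

# Hand-set weights (assumption): a trained classifier would supply w and b
w = np.ones(9)
b = -4.5

rmap = response_map(roi, w, b)
# Top-left corner of the window with the highest response
peak = np.unravel_index(np.argmax(rmap), rmap.shape)
```

In the classical pipeline the per-point maps are then searched jointly under a shape constraint; the sketch stops at the single-point response map.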
Maximise the response constrained to a feasible shape (constrained gradient ascent)
CVPR 2010 – Facial Point Detection using Boosted Regression and Graph Models
[Figure panels: Constrained gradient ascent; Regression for Localisation]

Regression for Localisation
• $L$: current estimate
• $T$: ground truth
• Regression: $f: \mathbb{R}^n \to \mathbb{R}$, evaluated on features $x$ (HOG)
$$T = L + (f_{\Delta x}(x), f_{\Delta y}(x)) = L + (\Delta x, \Delta y)$$
Multiple regression methodologies: Least Squares, SVR, GP, random forests…

BoRMaN algorithm
[Flowchart: Face detection; Obtain prior location (starting point); loop over iterations (it): Eval. regressors (new location hypotheses), Correct hypothesis (shape restrictions); Output]

MRF-based shape model
• Detect bad estimations
• Propose an alternative
Shape model: relations are rotation and scale independent
• Angle $\alpha(S_{ij}, S_{kl})$ between segments
• Ratio $\rho(S_{ij}, S_{kl})$ between segment lengths

Regression-based landmarking
Established a trend: best performing nowadays!
Facial landmarking using regression:
• 2010: CVPR
• 2012: CVPR (Microsoft Res.), CVPR (ETH, Van Gool), ECCV (Manchester Univ. – Cootes)
• 2013: TPAMI (iBug), CVPR (CMU), CVPR (iBug), ICCV (Microsoft Res.), ICCV (QMUL)
Major improvements:
• Prediction accumulation/voting (2013 TPAMI – Martinez, Valstar, Binefa, Pantic)
• Cascaded regression …

Regression: Vote aggregation
What if we are too far from the target? What if we have bad predictions?
• Errors ≈ uniformly distributed do NOT accumulate
• Errors ≈ Gaussian distributed DO accumulate
Base of the algorithm: accumulate predictions, each prediction being a small Gaussian
LEAR algorithm

Overview: Facial Action Unit Detection
• 2014 TSMCB - B. Jiang, M. Valstar, B. Martinez, M. Pantic, “A dynamic appearance descriptor approach to facial actions temporal modelling” [IF 2012: 3.24, Q1]
• [Under Review] IJCV - B. Jiang, B. Martinez, M. Valstar, M. Pantic, “Automatic Analysis of Facial Actions: A Survey” [IF 2012: 3.62, Q1]
• [Under Review] ICPR - B. Jiang, B. Martinez, M. Valstar, M. Pantic, “Decision Level Fusion of Domain Specific Regions for Facial Action Recognition”

Action Unit detection – what is it about?
Facial expression recognition
Message judgment: directly decode the meaning of the expression
• 6 universal expressions: happiness, anger, sadness, fear, surprise, disgust (constant message-to-sign relation)
• Pre-segmented episodes
Sign judgment: study the physical signals composing the expression
• An AU relates to the activation of a facial muscle
• “Agnostic” (not concerned with “knowing” the message)
• Can represent any expression
• Reasoning on top of the AUs is needed to understand the message
• Frame-based labelling
The Facial Action Coding System is the most common sign judgment approach.
[Figure: example face, captioned “Happiness? Pain?”]

Action Unit analysis: what and why
Research problems within the field:
• AU detection (per-frame)
• AU intensity estimation
• AU temporal segment detection
• AU correlations (for structured prediction)
• Semantics of AUs
What do they allow (that normal facial expression analysis does not):
• Pain detection
• Deceit detection
• Detection of social signals (conflict, agreement/disagreement, …)

How Action Unit detection is done
[Diagram: processing pipeline]
• Pre-processing: face detection, facial landmark detection, registration (T: non-ref. affine)
• Feature extraction: appearance, geometric, dynamic
• Machine analysis: SVM, ANN, Boosting…
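The registration step and the geometric features named in the pipeline can be illustrated with a small sketch. It assumes that "non-ref. affine" denotes a non-reflective similarity transform (rotation, uniform scale, translation) fitted to landmarks by least squares, and that geometric features are pairwise landmark distances; both readings, and all function names, are assumptions rather than the presenter's implementation.

```python
import numpy as np

def fit_similarity(src, dst):
    """Least-squares non-reflective similarity mapping src landmarks onto dst.

    Model: x' = a*x - b*y + tx,  y' = b*x + a*y + ty.
    Returns the 2x2 matrix M = [[a, -b], [b, a]] and translation t.
    """
    x, y = src[:, 0], src[:, 1]
    ones, zeros = np.ones_like(x), np.zeros_like(x)
    A = np.vstack([
        np.column_stack([x, -y, ones, zeros]),  # rows for the x' equations
        np.column_stack([y, x, zeros, ones]),   # rows for the y' equations
    ])
    rhs = np.concatenate([dst[:, 0], dst[:, 1]])
    a, b, tx, ty = np.linalg.lstsq(A, rhs, rcond=None)[0]
    return np.array([[a, -b], [b, a]]), np.array([tx, ty])

def register(points, M, t):
    """Apply the fitted transform to an (n, 2) array of landmarks."""
    return points @ M.T + t

def pairwise_distances(points):
    """Geometric features: distances between every pair of landmarks."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist[np.triu_indices(len(points), k=1)]

# Hypothetical reference shape and a rotated, scaled, shifted detection of it
ref = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
theta, scale = 0.3, 1.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
detected = scale * ref @ R.T + np.array([5.0, -3.0])

M, t = fit_similarity(detected, ref)
registered = register(detected, M, t)
d_before = pairwise_distances(detected)   # depends on face scale
d_after = pairwise_distances(registered)  # comparable across faces
```

Registering before measuring distances removes the effect of head rotation, scale and position, so the remaining variation in the features comes from the face itself.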
• Graph models (label consistency)
[Diagram labels: $d_1$, $d_2$; $d_1'$, $d_2'$]
[Under Review] IJCV - Jiang, Martinez, Valstar, Pantic, “Automatic Analysis of Facial Actions: A Survey”

TOP features
Representing the face
• Three orthogonal planes (TOP): extension of histogram features to spatio-temporal volumes
• Markov model over temporal segments (neutral, onset, apex, offset)
• Allows: analysis of AU temporal segments
2014 TSMC-B - “A dynamic appearance descriptor approach to facial actions temporal modelling”

Publications
[Under Review]
• B. Jiang, B. Martinez, M. Valstar, M. Pantic, “Automatic Analysis of Facial Actions: A Survey”, International Journal of Computer Vision [IF 2012: 3.62, Q1]
• J. Orozco, B. Martinez, M. Pantic, “Empirical Analysis of Cascade Deformable Models for Multi-view Face Detection”, Image and Vision Computing [IF 2012: 1.96, Q1]
• B. Martinez, M. Pantic, “Facial landmarking for in-the-wild images with local inference based on global appearance”, Computer Vision and Image Understanding [IF 2012: 1.23, Q3]
• B. Jiang, B. Martinez, M. Valstar, M. Pantic, “Decision Level Fusion of Domain Specific Regions for Facial Action Recognition”, Int. Conf. on Pattern Recognition, 2014
[Journals]
• 2014 B. Jiang, M. Valstar, B. Martinez, M. Pantic, “A dynamic appearance descriptor approach to facial actions temporal modelling”, IEEE Trans. on Systems, Man and Cybernetics – Part B [IF 2012: 3.24, Q1]
• 2013 B. Martinez, M. Valstar, X. Binefa, M. Pantic, “Local Evidence Aggregation in Regression-based Facial Point Detection”, IEEE Trans. on Pattern Analysis and Machine Intelligence [IF 2012: 4.80, Q1]
• 2013 S. Petridis, B. Martinez, M. Pantic, “The MAHNOB Laughter Database”, Image and Vision Computing Journal [IF 2012: 1.96, Q1]
• 2011 M. Vivet, B. Martinez, X. Binefa, “DLIG: Direct Local Indirect Global Alignment for Video Mosaicing”, IEEE Trans. on Circuits and Systems for Video Technology [IF 1.65, Q2]
• 2008 B. Martinez, X. Binefa, “Piecewise affine kernel tracking for non-planar targets”, Pattern Recognition [IF: 3.28, Q1]
[Conferences]
• 2010 M. Valstar, B. Martinez, X. Binefa, M. Pantic, “Facial Point Detection using Boosted Regression and Graph Models”, IEEE Int'l Conf. on Computer Vision and Pattern Recognition [27% acceptance rate, 81 citations]
• 2010 B. Martinez, X. Binefa, M. Pantic, “Facial Component Detection in Thermal Imagery”, IEEE Int'l Conf. on Computer Vision and Pattern Recognition - Workshops

Thanks!