Hallucinating Faces: TensorPatch Super-Resolution and Coupled Residue Compensation
CVPR 2005
Wei Liu, Dahua Lin, and Xiaoou Tang
Dept. of Information Engineering, The Chinese University of Hong Kong

Outline
- What is face hallucination
- Related works
- Our framework
  - Two-stage architecture: inference and compensation
  - TensorPatches
  - Coupled residue compensation
- Experiment results

Definition
Super-Resolution (SR): any image, low resolution → high resolution.
Face hallucination: SR applied to faces; also called hallucinating faces or face super-resolution (face SR). It must meet additional constraints:
- Sanity constraint: the result is close to the input image when down-sampled.
- Global constraint: the result has the common properties of a face, e.g. eyes, mouth, nose, symmetry, etc.
- Local constraint: the result has the specific characteristics of the face image, with photorealistic local features.

SR Examples
[Figure: input low-resolution, original high-resolution, and hallucinated high-resolution face images.]

Approaches
Interpolation-based approaches
- Bilinear/bicubic/B-spline interpolation.
- Do not use prior information.
- Incur serious blurring and fail to recover the details of the image.
- Generic methods.
Learning-based approaches
- Learn the prior information from training samples.
- Capable of restoring image details, with higher quality.
- Can be tailored to a specific domain, such as face images.

Learning-Based Hallucination
[Diagram: high-resolution training images are smoothed and down-sampled to form low/high-resolution pairs; training on these samples yields a hallucination model carrying the prior information; hallucination then maps a low-resolution image (less information) to a high-resolution image (more information).]

Generic Image SR: Representative Works
- W. Freeman et al., "Learning Low-Level Vision", IJCV 2000.
- J. Sun, N. Zheng, H. Tao, and H. Shum, "Image Hallucination with Primal Sketch Priors", CVPR 2003.
Limitations
- Require a large patch database to be applicable to a wide range of images.
- Complicated statistical formulation (Markov network) with an expensive optimization procedure (belief propagation).

Domain-Specific SR
For SR in a certain domain, domain-specific methods are preferable:
- They capture the domain-specific priors effectively.
- They require a much smaller training set.
- Computationally efficient methods are feasible.
- Higher quality can be achieved with models tailored to the domain.
Face hallucination is exactly such a domain-specific problem.

Representative Related Works
S. Baker and T. Kanade, "Hallucinating Faces", FG 2000.
- The pioneering work in face hallucination.
- A framework based on a Bayesian MAP formulation:
  - conditional probability term: observation model with a Gaussian-noise assumption;
  - prior term: gradient prior prediction.
- Gradient descent is used to optimize the objective function.
Limitations
- The gradient pyramid-based prediction, as a heuristic method, cannot model the priors well.
- Pixels are predicted individually, which may cause discontinuity and noise.
- Gradient descent optimization is required.

Representative Related Works
C. Liu, H. Shum, and C. Zhang, "A Two-step Approach to Hallucinating Faces: Global Parametric Model and Local Nonparametric Model", CVPR 2001.
- Two-stage framework (global and local) based on a unified Bayesian formulation.
- Global model: linear parametric inference.
- Local model: patch-based nonparametric Markov network.
Limitations
- The linear global model with a Gaussian assumption tends to over-simplify the problem.
- The Markov network involves a time-consuming optimization by belief propagation.

Our Approach
Targets
- Recover the details with high fidelity.
- Preserve the continuity and smoothness of the whole image.
- Adapt to different personalities and to the different statistical characteristics of different locations on an image.
- High efficiency and robustness.
Basic framework: a two-stage architecture, TensorPatch inference + residue compensation.

Patch-Based Processing
Each image is divided into overlapping patches; both learning and inference are based on patches. Why?
- Different components of faces take on different statistical characteristics.
- Local models work better for restoring local details.
- The overlapping enhances the inter-patch continuity.
- The lower-dimensional patch space makes learning and inference more robust and efficient.

Our Framework
[Diagram: the input low-resolution patch goes through TensorPatch inference to produce an initial result; the down-sampled version of the initial result is compared with the input to construct the low-resolution residue; coupled PCA maps the low-resolution residue to a high-resolution residue, which is added to the initial result to give the final result.]

TensorPatches
- Basic motivations
- Theoretical foundation: multilinear algebra
- Reconstruction-based analysis
- TensorPatches SR algorithm

Motivations of TensorPatches
- Individual patch appearance reflects the compound effect of diverse factors.
- Two key ingredients determine what a patch looks like: personality and patch location.
- The two factors interact with each other in a complex way.
- We need to model the interaction; multilinear (tensor) algebra provides the tool.

Multilinear Analysis
Why do we use multilinear analysis?
- It unifies multiple factors in one framework.
- It explicitly models the interaction between factors.

Theoretical Foundation: Tensor Algebra
(A NumPy sketch of these operations follows the patch synthesis slides below.)
Tensor: a multidimensional array $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_n}$.
Mode-$k$ tensor product:
$$(\mathcal{A} \times_k U)_{i_1 \cdots i_{k-1}\, j_k\, i_{k+1} \cdots i_n} = \sum_{i_k=1}^{I_k} (\mathcal{A})_{i_1 \cdots i_{k-1}\, i_k\, i_{k+1} \cdots i_n} (U)_{j_k i_k}$$
High-Order Singular Value Decomposition (HOSVD):
$$\mathcal{A} = \mathcal{C} \times_1 U_1 \times_2 U_2 \times_3 \cdots \times_n U_n$$

How Multilinear Analysis Works: Ensemble Representation
Arrange the samples by factors into an ensemble tensor:
$$\mathcal{D} = \mathcal{C} \times_1 U_1 \times_2 U_2 \times_3 \cdots \times_{n-1} U_{n-1} \times_n U_n$$
- Core tensor $\mathcal{C}$: controls the interaction between factors.
- Mode matrices $U_1, \ldots, U_{n-1}$: capture the variation patterns of each factor.
- Prototype matrix $U_n$: the basis spanning the space of all major variations.

How Multilinear Analysis Works: Individual Sample Representation
$$\mathbf{x} = \mathcal{C} \times_1 \mathbf{u}_1^T \times_2 \mathbf{u}_2^T \times_3 \cdots \times_{n-1} \mathbf{u}_{n-1}^T \times_n U_n$$
The vectors $\mathbf{u}_k$ are the representations of the factors; contracting them with the core tensor yields the coefficients of the basis, which combine the prototype vectors into the sample.

Formulation of TensorPatches
Formulation of the patches ensemble:
$$\mathcal{D} = \mathcal{C} \times_1 U_{person} \times_2 U_{location} \times_3 U_{pixels}$$
- $\mathcal{D}$: the sample ensemble.
- $\mathcal{C}$: the core tensor coordinating the interaction between the factors.
- $U_{person}$: vector representations encoding person-related information.
- $U_{location}$: vector representations encoding location-related information.
- $U_{pixels}$: the basis spanning the whole variation subspace of patches.
The training can be done by HOSVD.

Patch Synthesis: Basic Procedure
[Diagram: the input low-resolution patch is analyzed into a person factor $\mathbf{v}^{(l)}_{person}$ and a location factor $\mathbf{v}^{(l)}_{location}$ with weights $\mathbf{w}_1, \mathbf{w}_2$; the same weights synthesize the output high-resolution patch.]
Local model in sample space: each patch should be reconstructed from nearby samples,
$$\mathbf{v}_{person} = U_{person} \mathbf{w}_1, \qquad \mathbf{v}_{location} = U_{location} \mathbf{w}_2.$$

Illustration of Patch Synthesis
[Figure: analysis recovers the local structure in person factor space and in location factor space from the low-resolution patch vector; synthesis reuses that structure to build the high-resolution patch vector.]

Mathematics of Patch Synthesis
With $\mathcal{S} = \mathcal{C} \times_3 U^{(l)}_{pixels}$ and $\mathcal{T}^{(l)} = \mathcal{S} \times_1 U^{(l)T}_{person} \times_2 U^{(l)T}_{location}$:
$$\begin{aligned}
\mathbf{x}^{(l)} &= \mathcal{C} \times_1 \mathbf{v}^{(l)T}_{person} \times_2 \mathbf{v}^{(l)T}_{location} \times_3 U^{(l)}_{pixels} \\
&= \big(\mathcal{C} \times_3 U^{(l)}_{pixels}\big) \times_1 \mathbf{w}_1^T U^{(l)T}_{person} \times_2 \mathbf{w}_2^T U^{(l)T}_{location} \\
&= \big(\mathcal{S} \times_1 U^{(l)T}_{person} \times_2 U^{(l)T}_{location}\big) \times_1 \mathbf{w}_1^T \times_2 \mathbf{w}_2^T \\
&= \mathcal{T}^{(l)} \times_1 \mathbf{w}_1^T \times_2 \mathbf{w}_2^T, \\
\mathbf{x}^{(h)} &= \mathcal{T}^{(h)} \times_1 \mathbf{w}_1^T \times_2 \mathbf{w}_2^T,
\end{aligned}$$
where $\mathcal{T}^{(h)}$ is built in the same way in the high-resolution patch space. The weights reflect the local structure shared by both patch spaces, which is the basis for the inference.
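To make the mode-$k$ product, HOSVD, and the weight-transfer synthesis above concrete, here is a minimal NumPy sketch. It is an illustration, not the paper's implementation: all tensor shapes are made up; the weights $\mathbf{w}_1, \mathbf{w}_2$ are estimated by a few alternating least-squares steps, whereas the slides derive them from a local reconstruction over nearby training samples; and, as a simplification, the weights act directly on the raw training ensembles rather than on $\mathcal{T}^{(l)}, \mathcal{T}^{(h)}$.

```python
import numpy as np

def mode_product(A, U, k):
    """Mode-k tensor product A x_k U: contract mode k of A (size I_k)
    with a matrix U of shape (J, I_k); the result has mode-k size J."""
    A = np.moveaxis(A, k, 0)
    out = (U @ A.reshape(A.shape[0], -1)).reshape((U.shape[0],) + A.shape[1:])
    return np.moveaxis(out, 0, k)

def hosvd(A):
    """High-Order SVD: A = C x_1 U_1 x_2 ... x_n U_n, where U_k holds the
    left singular vectors of the mode-k unfolding of A."""
    Us = [np.linalg.svd(np.moveaxis(A, k, 0).reshape(A.shape[k], -1),
                        full_matrices=False)[0] for k in range(A.ndim)]
    C = A
    for k, U in enumerate(Us):
        C = mode_product(C, U.T, k)  # U_k is orthogonal, so U_k^T inverts it
    return C, Us

# Toy patch ensembles, persons x locations x pixels (all sizes made up).
rng = np.random.default_rng(0)
D_lo = rng.standard_normal((30, 64, 9))    # low-resolution 3x3 patches
D_hi = rng.standard_normal((30, 64, 36))   # high-resolution 6x6 patches

# "Training": D = C x_1 U_person x_2 U_location x_3 U_pixels, by HOSVD.
C, (U_person, U_location, U_pixels) = hosvd(D_lo)
D_rec = mode_product(mode_product(mode_product(
    C, U_person, 0), U_location, 1), U_pixels, 2)
assert np.allclose(D_rec, D_lo)            # exact: no truncation here

def synthesize(T_lo, T_hi, x_lo, n_iter=20):
    """Fit x_lo ~ T_lo x_1 w1^T x_2 w2^T by alternating least squares,
    then transfer the weights: x_hi = T_hi x_1 w1^T x_2 w2^T."""
    w2 = np.full(T_lo.shape[1], 1.0 / T_lo.shape[1])   # flat start
    for _ in range(n_iter):
        M1 = np.einsum('plq,l->pq', T_lo, w2)   # fix w2, solve for w1
        w1 = np.linalg.lstsq(M1.T, x_lo, rcond=None)[0]
        M2 = np.einsum('plq,p->lq', T_lo, w1)   # fix w1, solve for w2
        w2 = np.linalg.lstsq(M2.T, x_lo, rcond=None)[0]
    return np.einsum('plq,p,l->q', T_hi, w1, w2)

# Infer an HR patch from a (synthetic) LR patch: estimate the weights in
# the low-resolution space, reuse them in the high-resolution space.
x_lo = rng.standard_normal(9)
x_hi = synthesize(D_lo, D_hi, x_lo)        # shape (36,)
```

On real data, D_lo and D_hi would hold the vectorized training patches indexed by person and patch location, the HOSVD factors would be truncated to the leading components, and the synthesis would run per overlapping patch with the results blended in the overlap regions.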
Coupled Residue Compensation
- Basic motivations
- Coupled PCA

Basic Motivations
- After TensorPatches, there are still some high-frequency components that are not modeled.
- Observe that there are differences between the reconstructed and the original LR patches.
- These differences correspond to high-frequency components; they can thus be utilized to enhance the restoration of the high-frequency components in the target HR image.

Coupled PCA
The source vector $\mathbf{x}$ (the low-resolution residue) and the target vector $\mathbf{y}$ (the high-resolution residue) share a hidden vector $\mathbf{h}$:
$$\mathbf{x} = B_X \mathbf{h}, \qquad \mathbf{y} = B_Y \mathbf{h} \quad\Longrightarrow\quad \mathbf{y} = B_Y B_X^T \mathbf{x}.$$
Rank constraint: $d_h < d_x$, $d_h < d_y$, and $\operatorname{rank}(B_Y B_X^T) = d_h$. The rank constraint enhances robustness by reducing the interference of irrelevant information.
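A minimal NumPy sketch of this residue mapping, under stated assumptions: the coupled bases are fit here by PCA on the concatenated (LR, HR) residue pairs, one simple way to share a hidden vector across the two spaces; the paper's exact fitting procedure may differ. Because $B_X$ obtained this way is not orthonormal, the pseudo-inverse replaces $B_X^T$ from the slide (the two coincide when $B_X$ has orthonormal columns). All names and shapes are illustrative.

```python
import numpy as np

def fit_coupled_pca(X, Y, d_h):
    """Fit coupled bases B_X, B_Y with a shared d_h-dimensional hidden
    vector h, so that x ~ B_X h + mu_x and y ~ B_Y h + mu_y.
    Sketch: PCA on concatenated residue pairs, split into x/y parts."""
    Z = np.hstack([X, Y])                      # (n_samples, d_x + d_y)
    mu = Z.mean(axis=0)
    _, _, Vt = np.linalg.svd(Z - mu, full_matrices=False)
    B = Vt[:d_h].T                             # joint basis, (d_x + d_y, d_h)
    d_x = X.shape[1]
    return B[:d_x], B[d_x:], mu[:d_x], mu[d_x:]

def map_residue(B_X, B_Y, mu_x, mu_y, x):
    """Infer h from the LR residue, then decode the HR residue:
    h = pinv(B_X) (x - mu_x), y = B_Y h + mu_y; the map has rank d_h."""
    h = np.linalg.pinv(B_X) @ (x - mu_x)
    return B_Y @ h + mu_y

# Synthetic residue pairs generated from a true 5-dim hidden vector.
rng = np.random.default_rng(1)
H = rng.standard_normal((500, 5))              # hidden vectors, d_h = 5
X = H @ rng.standard_normal((5, 9))            # LR residues,  d_x = 9
Y = H @ rng.standard_normal((5, 36))           # HR residues,  d_y = 36

B_X, B_Y, mu_x, mu_y = fit_coupled_pca(X, Y, d_h=5)
y_hat = map_residue(B_X, B_Y, mu_x, mu_y, X[0])
print(np.allclose(y_hat, Y[0]))                # True: noise-free toy data
```

In the two-stage framework, this map would be applied per patch: the low-resolution residue between the input and the down-sampled initial result is pushed through map_residue, and the predicted high-resolution residue is added back to the TensorPatch output.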
Experiment Results
[Figures: comparative hallucination results.]

Conclusions
- TensorPatches: a theoretically well-founded, robust, and efficient multilinear model for inference.
- Coupled residue compensation: enhances the quality in a robust way.
- Comparative experiments: encouraging results displaying the effectiveness of our framework.

Future Plan
- High-zoom face hallucination
  - Global linear model
  - Local multilinear model
  - TensorPatches model: unifying the two models
- Aerial image hallucination
  - Learn domain-specific priors
  - Level-set methods
- Video super-resolution
  - Manifold correspondence
  - Conditional Random Fields (CRFs)

References
[1] W. Freeman, E. Pasztor, and O. Carmichael, "Learning Low-Level Vision", IJCV, 2000.
[2] S. Baker and T. Kanade, "Hallucinating Faces", in Proc. FG, 2000.
[3] S. Baker and T. Kanade, "Limits on Super-Resolution and How to Break Them", PAMI, 2002.
[4] C. Liu, H. Shum, and C. Zhang, "A Two-step Approach to Hallucinating Faces: Global Parametric Model and Local Nonparametric Model", in Proc. CVPR, 2001.
[5] J. Sun, N. Zheng, H. Tao, and H. Shum, "Image Hallucination with Primal Sketch Priors", in Proc. CVPR, 2003.

Thanks!
June 2005
For any questions on this paper, please feel free to contact me (wliu5@ie.cuhk.edu.hk). Visit http://mmlab.ie.cuhk.edu.hk/~face/ for more information about my other works.