3D Face Modeling and Animation
CMSC 3D Character Design & Animation

Contents
– Motivation
– 3D facial geometry modeling
– 3D facial deformation modeling
– 3D facial animation
– The iFace system
– Speech-driven talking heads
– Text-driven talking heads
– A glimpse at MP5

Motivation
– Avatar-based Human-Computer Interaction
– Animation
– Teleconference
– Recognition: face recognition, soft biometrics

3D Facial Geometry Modeling
Polygonal mesh
– Vertices
– Triangles, quadrangles, etc.
– Normals
– Texture
Each mesh vertex is a sample point of the human facial surface.
How do we acquire the positions of these sample points?

3D Facial Geometry Acquisition
– Artist's designs
– 3D scanners (active)
– 3D reconstruction from 2D image(s) (passive)

Laser Scans

Active Acquisition
Time of flight
– Examples: DepthSense, PMD
Structured light
– Example: Kinect

Passive Stereo Photogrammetry
Photo courtesy: Dimensional Imaging
Beeler et al., SIGGRAPH 2010
http://www.youtube.com/watch?v=JX5stsU6xfE

3D Face Reconstruction Framework
[Framework diagram: starting from a 2D alignment of a neutral frontal face, the system estimates the 3D shape and pose, P(s_3D, t_x, t_y, f, θ_x, θ_y, θ_z | S_2D), together with texture, illumination, and expression, using MPEG-4 FAT models and a classifier to produce the new face.]

3D Facial Deformation Modeling
– Free-form deformation models
– Muscle-based deformation models

Free-form Deformation Model
– The coordinates of the mesh vertices are deformed in a free-form manner by changing the positions of a set of control points
– Control points may or may not coincide with mesh vertices
– Example: Piecewise Bezier Volume Deformation Model (Tao and Huang, 1998); see the deformation code sketches at the end of this part

Muscle-based Deformation Model
Muscles of the face

Muscle-based Deformation Model
– Build a simplified mathematical model that simulates muscle actions on the facial skin
– Linear muscle models

Muscle-Based Animation
– Uses a mass-and-spring model to simulate facial muscles
– Muscles are of two types: linear muscles that pull and elliptic muscles that squeeze
– Muscle parameters: muscle vector and zone of muscle effect

Modeling the Primary Facial Expressions
– Basic facial expressions considered generic to the human face: Happiness, Anger, Fear, Surprise, Disgust, and Sadness
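To make the polygonal-mesh representation above concrete, here is a minimal Python/NumPy sketch that computes per-vertex normals from vertex positions and triangle indices. The function name and array layout are illustrative choices, not part of any particular system described in these slides.

```python
import numpy as np

def vertex_normals(vertices, triangles):
    """Per-vertex normals for a triangle mesh.

    vertices  : (V, 3) float array of vertex positions
    triangles : (T, 3) int array of vertex indices per triangle
    """
    v0, v1, v2 = (vertices[triangles[:, i]] for i in range(3))
    face_n = np.cross(v1 - v0, v2 - v0)        # area-weighted face normals
    normals = np.zeros_like(vertices)
    for i in range(3):                          # accumulate onto each corner vertex
        np.add.at(normals, triangles[:, i], face_n)
    lengths = np.linalg.norm(normals, axis=1, keepdims=True)
    return normals / np.clip(lengths, 1e-12, None)
```

The free-form deformation idea can be sketched similarly. The following is a single (non-piecewise) Bézier-volume deformation: each mesh vertex receives trilinearly Bernstein-weighted contributions from a lattice of control points. It is only a simplified illustration in the spirit of Tao and Huang's piecewise Bézier volume model, not their implementation.

```python
import numpy as np
from math import comb

def bernstein(n, i, t):
    """Bernstein basis polynomial B_{i,n}(t)."""
    return comb(n, i) * (t ** i) * ((1.0 - t) ** (n - i))

def bezier_volume_deform(vertices, control_points, bbox_min, bbox_max):
    """Deform mesh vertices with a single Bezier volume (illustrative).

    vertices       : (V, 3) mesh vertex positions
    control_points : (l+1, m+1, n+1, 3) displaced control lattice
    bbox_min/max   : (3,) corners of the undeformed lattice box
    """
    l, m, n = (np.array(control_points.shape[:3]) - 1)
    # Local (s, t, u) coordinates of each vertex inside the box, in [0, 1]
    stu = (vertices - bbox_min) / (bbox_max - bbox_min)
    deformed = np.zeros_like(vertices)
    for i in range(l + 1):
        Bi = bernstein(l, i, stu[:, 0])
        for j in range(m + 1):
            Bj = bernstein(m, j, stu[:, 1])
            for k in range(n + 1):
                Bk = bernstein(n, k, stu[:, 2])
                w = (Bi * Bj * Bk)[:, None]     # per-vertex trilinear weight
                deformed += w * control_points[i, j, k]
    return deformed
```

Finally, a toy version of a linear (pulling) muscle in the spirit of Waters' model: vertices inside a cone-shaped zone of influence around the muscle vector are pulled toward the bony attachment point, with angular and radial falloff. The falloff shape, the constants, and the displacement scale are assumptions made for illustration only.

```python
import numpy as np

def linear_muscle_pull(vertices, head, tail, contraction,
                       zone_angle_deg=40.0, fall_start=0.5, fall_end=1.0):
    """Displace skin vertices under one linear (pulling) muscle (toy sketch).

    head        : (3,) bony attachment point of the muscle
    tail        : (3,) skin insertion point
    contraction : scalar in [0, 1], muscle activation level
    """
    axis = tail - head
    length = np.linalg.norm(axis)
    axis = axis / length

    to_v = vertices - head                       # vectors head -> vertex
    dist = np.linalg.norm(to_v, axis=1) + 1e-9
    cos_a = (to_v @ axis) / dist                 # cosine of angle to muscle axis
    max_cos = np.cos(np.radians(zone_angle_deg))

    # Angular falloff: 1 on the axis, 0 at the edge of the influence cone
    ang = np.clip((cos_a - max_cos) / (1.0 - max_cos), 0.0, 1.0)
    # Radial falloff: full effect near the head, fading out toward fall_end
    r = dist / length
    rad = np.clip((fall_end - r) / (fall_end - fall_start), 0.0, 1.0)

    weight = contraction * ang * rad
    pull_dir = -to_v / dist[:, None]             # unit vectors toward the head
    # 0.1 * length is an arbitrary illustrative maximum displacement
    return vertices + weight[:, None] * pull_dir * (0.1 * length)
```

An elliptic (squeezing) muscle would be handled analogously, with the zone of influence defined around a center point rather than along a muscle vector.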
Synthesized Facial Expressions
Waters, SIGGRAPH '87
Neutral face, Anger, Happiness, Surprise, Fear, Disgust

Facial Action Coding System (FACS)
– The system was developed by Ekman and Friesen in 1978
– FACS describes facial deformations in terms of "Action Units" (AUs)
– Some AUs correspond directly to actions of facial muscles; others involve things like the movement of the tongue or air filling the cheeks
– AUs may be combined to describe any facial expression

Facial Action Coding System (FACS)

MPEG-4 Facial Animation Parameters (FAPs)
– MPEG-4 defines 68 FAPs, categorized into 10 groups

Motion Units
– Learn the basic facial deformations from motion capture data (Hong, Wen, and Huang, 2001)

The Anatomical Model
The face can be modeled by two layers and three surfaces:
– Layers: dermal-fatty layer, muscle layer
– Surfaces: epidermal surface, fascia surface, skull surface

The Volume Preservation Forces
– Human skin is incompressible
– A volume preservation force is needed to simulate wrinkles
– The force pushes a node outward in proportion to the decrease in its local volume

Geometry models for other head components
– Teeth, eyes, and neck are modeled separately
– These data are difficult to capture with a scanner

Muscle-based Animation
– Estimate the muscle activations from the motion capture data
– First, a precise anatomical model is produced from the Visible Human dataset
– Next, the muscles are activated so that the simulated locations of the markers overlap with their real locations

3D Facial Animation
Key-frame interpolation method
– Place particular facial deformations at particular time instants (key-frames)
– Facial deformations in between key-frames are obtained by an interpolation scheme
Two basic key-frame types
– Visemes
– Expressions

Visemes
– A representational unit used to classify speech sounds in the visual domain
– It was introduced as the visual counterpart of the phoneme, the basic unit of speech in the acoustic/auditory domain
– But... viseme ≠ phoneme

Visemes
– A viseme describes the particular facial and oral positions and movements that occur alongside the voicing of phonemes
– The analogous term for the acoustic reflection of a phoneme would be "audieme", but this term is not in use

Visemes
– Phonemes and visemes do not always share a one-to-one correspondence
– Often, several phonemes share the same viseme

Visemes
– Conversely, some sounds that are hard to distinguish acoustically are clearly distinguished by the face
– For example, English /l/ and /r/ can be acoustically quite similar (especially in clusters, such as 'grass' vs. 'glass'), yet visual information shows a clear contrast
– This is demonstrated by the more frequent mishearing of words on the telephone than in person

Visemes

Visemes

Facial expressions

Speech-driven talking heads

Text-driven talking heads
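The Motion Unit idea above, learning basic facial deformations from motion capture, can be sketched as a subspace decomposition of marker trajectories. PCA is used here only as a common stand-in; the exact learning procedure of Hong, Wen, and Huang (2001) is not reproduced, and the function names are illustrative.

```python
import numpy as np

def learn_motion_units(frames, n_units=7):
    """Learn 'Motion Unit'-style basis deformations from motion capture.

    frames  : (F, M, 3) marker positions over F frames for M markers
    returns : mean shape (M, 3) and basis deformations (n_units, M, 3)
    """
    F, M, _ = frames.shape
    X = frames.reshape(F, M * 3)
    mean = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    units = Vt[:n_units].reshape(n_units, M, 3)
    return mean.reshape(M, 3), units

def apply_motion_units(mean, units, weights):
    """Synthesize a deformation as mean + sum_i w_i * MU_i."""
    return mean + np.tensordot(weights, units, axes=1)
```

Estimating muscle activations so that simulated marker positions overlap with the captured ones is, at its core, an inverse problem. The toy sketch below assumes markers respond linearly to activations, replacing the physics-based simulation with a precomputed response basis; a real system would solve a constrained, nonlinear fit against the full simulation.

```python
import numpy as np

def estimate_activations(neutral_markers, captured_markers, response_basis):
    """Solve for activations a >= 0 with neutral + sum_i a_i * basis_i ~ captured.

    neutral_markers  : (M, 3) simulated marker positions at rest
    captured_markers : (M, 3) motion-capture marker positions
    response_basis   : (K, M, 3) marker displacement per unit activation of
                       each of K muscles (a linearized stand-in for simulation)
    """
    K = response_basis.shape[0]
    A = response_basis.reshape(K, -1).T               # (3M, K) linear system
    b = (captured_markers - neutral_markers).ravel()  # (3M,) target displacement
    a, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Crude projection to non-negative activations; a proper solver would use NNLS
    return np.clip(a, 0.0, None)
```

Key-frame interpolation between viseme and expression targets can be illustrated with plain linear blending of vertex arrays, and the many-to-one phoneme-to-viseme relationship with a small lookup table. Both the grouping in the table and the linear interpolation scheme are illustrative choices, not a standard mapping.

```python
import numpy as np

# Illustrative (non-standard) many-to-one phoneme -> viseme grouping
PHONEME_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "s": "alveolar", "z": "alveolar",
    "aa": "open", "ae": "open",
}

def interpolate_keyframes(key_times, key_shapes, t):
    """Linearly interpolate facial key-frames at time t.

    key_times  : sorted list of key-frame times
    key_shapes : list of (V, 3) vertex arrays (viseme or expression targets)
    """
    if t <= key_times[0]:
        return key_shapes[0]
    if t >= key_times[-1]:
        return key_shapes[-1]
    i = np.searchsorted(key_times, t) - 1
    t0, t1 = key_times[i], key_times[i + 1]
    alpha = (t - t0) / (t1 - t0)
    return (1 - alpha) * key_shapes[i] + alpha * key_shapes[i + 1]
```

In practice, smoother interpolation (splines or coarticulation models) is typically preferred over straight linear blending so that mouth shapes transition naturally across adjacent visemes.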