Modeling 3D Deformable and Articulated Shapes
Yu Chen, Tae-Kyun Kim, Roberto Cipolla
Department of Engineering, University of Cambridge

Roadmap: Brief Introduction – Our Framework – Experimental Results – Summary

Motivation
Task:
– To recover deformable 3D shapes from a single image taken from an arbitrary camera viewpoint.
– 3D shapes + 2D images, with uncertainty measurements.

Previous Work
– Rigid shapes [Prasad'05, Rother'09, Yu'09, etc.]
  Problem: cannot handle self-deformation or articulation.
– Category-specific articulated shapes, e.g., human bodies [Anguelov'05, Balan'07, etc.]
  Problems:
  o Require strong shape or anatomical knowledge of the category, such as skeletons and joint angles;
  o Too many parameters to estimate;
  o Hard to generalise to other object categories.

Roadmap: Brief Introduction – Our Framework – Experimental Results – Summary

Our Contribution
A probabilistic framework for:
– Modelling different shape variations of general categories;
– Synthesizing new shapes of the category from limited training data;
– Inferring dense 3D shapes of deformable or articulated objects from a single silhouette.

Explanation of the Graphical Model
(Diagram slide: the model comprises a pose generator, a shape generator, a shape-synthesis stage, and a silhouette-matching stage, tied together by the joint distribution.)

Generating Shapes
Target: simultaneously modelling two types of shape variation:
– Phenotype variation: fat vs. thin, tall vs. short, ...
– Pose variation: articulation, self-deformation, ...
Training two GPLVMs:
– Shape generator (MS) for phenotype variation;
– Pose generator (MA) for pose variation.

Generating Shapes: Shape Generator (MS)
– Training set: shapes in the canonical pose.
– Pre-processing:
  o Automatically register each instance with a common 3D template;
  o 3D shape context matching and thin-plate spline interpolation;
  o Perform PCA on all registered 3D shapes.
– Input: PCA coefficients of all the data.

Generating Shapes: Pose Generator (MA)
– Training set: synthetic 3D pose sequences.
– Pre-processing: perform PCA on both the spatial positions of the vertices and all vertex-wise Jacobian matrices.
– Input: PCA coefficients of all the data.

Shape Synthesis
(Diagram slide: the pose generator MA produces the pose-varied shape VA, the shape generator MS produces the phenotype-varied shape VS, and both are combined with the zero shape V0 to synthesize the full shape V.)

Shape Synthesis
– Modelling the local shape transfer: computing Jacobian matrices Ji on the zero shape, vertex by vertex.

Shape Synthesis
– Synthesizing the fully-varied shape V from the phenotype-varied shape VS and the pose-varied shape VA (see the code sketch at the end of this section).
– Probabilistic formulation: a Gaussian approximation.

Matching Silhouettes
A two-stage process:
o Projecting the 3D shape onto the image plane;
o Chamfer matching of silhouettes.
Maximizing the likelihood over the latent coordinates xA, xS and the camera parameters γk:
o Optimizing the closed-form lower bound;
o Adaptive line search with multiple initialisations.

Roadmap: Brief Introduction – Our Framework – Experimental Results – Summary

Experiments on Shape Synthesis
Task:
– To synthesize shapes in different phenotypes and poses with the mean shape μV.
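A minimal numerical sketch of the shape-synthesis step described above (not the authors' implementation): it assumes hypothetical NumPy arrays V0 (zero shape), VS (phenotype-varied shape), VA (pose-varied shape) and vertex-wise Jacobians J computed on the zero shape, and composes only the mean shape deterministically via V = VA + J·(VS − V0), a rule consistent with the slides but stated here as an assumption; the talk's probabilistic formulation additionally wraps the composition in a Gaussian approximation.

import numpy as np

def synthesize_shape(V0, VS, VA, J):
    # V0: (N, 3) zero shape (canonical pose, neutral phenotype)
    # VS: (N, 3) phenotype-varied shape in the canonical pose
    # VA: (N, 3) pose-varied shape of the zero phenotype
    # J:  (N, 3, 3) vertex-wise Jacobians of the pose deformation,
    #     computed on the zero shape (all four are hypothetical inputs)
    dS = VS - V0                                # phenotype offsets in the canonical pose
    return VA + np.einsum('nij,nj->ni', J, dS)  # transfer each offset through its Jacobian

# Toy usage: 100 random vertices with identity Jacobians
rng = np.random.default_rng(0)
V0 = rng.normal(size=(100, 3))
VS = V0 + 0.1 * rng.normal(size=(100, 3))
VA = V0 + 0.2 * rng.normal(size=(100, 3))
J = np.tile(np.eye(3), (100, 1, 1))
V = synthesize_shape(V0, VS, VA, J)
print(V.shape)   # (100, 3)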
Shape Synthesis: Demo
(Animation slides: shape generator and pose generator running; the original deck repeats this slide over ten frames.)

Experiments on Single View Reconstruction
Training datasets:
– Shark data:
  o MS: 11 3D models of different shark species.
  o MA: an 11-frame tail-waving sequence from an animatable 3D MEX model.
– Human data:
  o MS: the CAESAR dataset.
  o MA: animations of different 3D poses of Sydney in Poser 7.
Testing:
– Internet images (22 sharks and 20 humans in different poses and camera viewpoints).
– Segmentation: GrabCut [Rother'04].

Experiments on Single View Reconstruction
– Sharks (result figures).

Experiments on Single View Reconstruction
– Humans (result figures).

Experiments on Single View Reconstruction
– Examples of multi-modality (result figures).

Experiments on Single View Reconstruction
Quantitative results: precision-recall ratios (see the evaluation sketch at the end of these notes)
– SF: foreground regions;
– SR: image projection of our result.
A very good approximation to the results given by parametric models.

Roadmap: Brief Introduction – Our Framework – Experimental Results – Summary

Pros and Cons
Advantages:
– Fully data-driven;
– Requires no strong class-specific prior knowledge, e.g., skeletons or joint angles;
– Capable of modelling general categories;
– Compact shape representation with much lower dimensionality for efficient optimization;
– Uncertainty measurements provided.
Disadvantages:
– Inaccurate at fine parts, e.g., hands;
– Lower descriptive power on poses than parametric models when training instances are insufficient;
– Training data are sometimes difficult to obtain.

Future Work
– A compatible framework which allows incorporating category knowledge;
– Incorporating more cues: internal edges, texture, and colour;
– Multiple-view settings and video sequences;
– 3D object recognition and action recognition tasks.

Thanks!
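Evaluation sketch referenced from the quantitative-results slide: a minimal Python example of silhouette precision and recall between the segmented foreground SF and the projected reconstruction SR. The overlap-ratio definitions used here are the standard ones and are an assumption; they may differ in detail from the exact measure reported in the talk.

import numpy as np

def silhouette_precision_recall(S_F, S_R):
    # S_F: binary foreground mask from segmentation
    # S_R: binary mask of the projected reconstruction
    # (both hypothetical boolean arrays of the same size)
    S_F = np.asarray(S_F, dtype=bool)
    S_R = np.asarray(S_R, dtype=bool)
    overlap = np.logical_and(S_F, S_R).sum()
    precision = overlap / max(S_R.sum(), 1)   # fraction of the projection that is foreground
    recall = overlap / max(S_F.sum(), 1)      # fraction of the foreground that is covered
    return precision, recall

# Toy usage: two overlapping rectangles
S_F = np.zeros((100, 100), dtype=bool); S_F[20:80, 20:80] = True
S_R = np.zeros((100, 100), dtype=bool); S_R[30:90, 30:90] = True
print(silhouette_precision_recall(S_F, S_R))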