Projective Analysis for 3D Shape Segmentation

Yunhai Wang (1)   Minglun Gong (1,2)   Tianhua Wang (1,3)   Hao (Richard) Zhang (4)   Daniel Cohen-Or (5)   Baoquan Chen (1,6)
1 Shenzhen Institutes of Advanced Technology   2 Memorial University of Newfoundland   3 Jilin University
4 Simon Fraser University   5 Tel-Aviv University   6 Shandong University

- One of the most fundamental tasks in shape analysis
- Low-level cues (minima rule; convexity) alone are insufficient
2/40

- Learning segmentation [Kalogerakis et al. 2010]
- Unsupervised co-analysis [Sidi et al. 2011]
- Joint segmentation [Huang et al. 2011]
- Active co-analysis [Wang et al. 2012]
- Key to success: amount & quality of labeled or unlabeled 3D data
3/40

- How many 3D models of strollers, golf carts, gazebos, ...? Not enough 3D models = insufficient knowledge
- Labeling 3D shapes is also a non-trivial task: 380 labeled meshes over 19 object categories
4/40

- About 14 million images across almost 22,000 object categories
- Labeling images is quite a bit easier than labeling 3D shapes
5/40

- Real-world 3D models (e.g., those from the Trimble 3D Warehouse) are often imperfect: self-intersecting, non-manifold, incomplete
6/40

- Treat a 3D shape as a set of projected binary images
- Label these images by learning from a vast amount of image data
- Then propagate the image labels back to the 3D shape
- Alleviates various data artifacts in 3D, e.g., self-intersections
7/40

- Joint image-shape analysis via projective analysis for semantic 3D segmentation
- Utilizes the vast amount of available image data
- Allows us to analyze imperfect 3D shapes
8/40

- Bi-class symmetric Hausdorff distance (BiSH)
- Designed for matching 1D binary images
- More sensitive to topology changes (holes)
- Caters to our needs: part-aware label transfer
9/40

- Many works on 2D-3D fusion, e.g., for reconstruction [Li et al. 2011]
- Image-guided 3D modeling [Xu et al. 2011]
10/40

- Image-space simplification error [Lindstrom and Turk 2000]
- Light field descriptor for 3D shape retrieval [Chen et al. 2003]
- We deal with the higher-level and more delicate task of semantic 3D segmentation
11/40

Outline:
- PSA for 3D shape segmentation
- Region-based binary shape matching
- Results and conclusion
12/40

- Labeling the image database involves GrabCut and some user assistance
13/40

- Assume all objects are upright oriented; they mostly are!
- Project an input 3D shape from multiple pre-set viewpoints
14/40

- For each projection of the input 3D shape, retrieve the top matches from the set of labeled images
15/40

- Select the top (non-adjacent) projections with the smallest average matching costs for label transfer
16/40

- Label transfer is done per corresponding horizontal slab (slabs are defined later ...)
- Pixel correspondence is straightforward
17/40

- Label transfer is weighted by a confidence value per pixel
- Three terms based on image-level, slab-level, and pixel-level similarity: more similar = higher confidence
18/40

- Probabilistic map over the input 3D shape: computed by integrating per-pixel confidence values over each shape primitive (a code sketch of this aggregation follows below)
- One primitive projects to multiple pixels in multiple images
- Per-pixel confidence is gathered over multiple retrieved images
19/40

- Final labeling of the 3D shape: multi-label alpha-expansion graph cuts based on the probabilistic map
20/40
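A minimal sketch of the aggregation step just described, under assumed, hypothetical inputs: each vote is a (primitive, label, confidence) triple gathered over the selected projections and their retrieved images. The function name and data layout are illustrative, not the paper's implementation; the final multi-label alpha-expansion graph cut is not shown.

```python
import numpy as np

def build_probabilistic_map(votes, num_primitives, num_labels):
    """Integrate per-pixel, confidence-weighted label votes into a
    probabilistic map over shape primitives (e.g., triangles).

    votes: iterable of (primitive_id, label_id, confidence) triples,
           gathered over all selected projections and all retrieved
           images (one primitive projects to many pixels).
    Returns P with shape (num_primitives, num_labels), where P[f, l]
    approximates the probability that primitive f carries label l.
    """
    scores = np.zeros((num_primitives, num_labels), dtype=np.float64)
    for prim, label, conf in votes:
        scores[prim, label] += conf  # accumulate weighted votes

    # Normalize per primitive; primitives that received no votes get a
    # uniform distribution, so the later graph cut is driven there by
    # its smoothness term alone.
    totals = scores.sum(axis=1, keepdims=True)
    uniform = np.full_like(scores, 1.0 / num_labels)
    with np.errstate(invalid="ignore", divide="ignore"):
        probs = np.where(totals > 0, scores / totals, uniform)
    return probs

# Tiny example: 3 primitives, 2 labels, a handful of votes.
votes = [(0, 0, 0.9), (0, 1, 0.1), (1, 1, 0.7), (1, 1, 0.4)]
P = build_probabilistic_map(votes, num_primitives=3, num_labels=2)
# -log(P) would then serve as the data term of the multi-label
# alpha-expansion graph cut that produces the final labeling.
```

The confidence value here stands in for the combined image-level, slab-level, and pixel-level similarity terms mentioned on the previous slides; how those terms are actually combined is specified in the paper and not reproduced in this sketch.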
Outline:
- PSA for 3D shape segmentation
- Region-based binary shape matching
- Results and conclusion
21/40

- Inputs: projections of the input 3D shape ... a database of (labeled) images
- Goal: find the shapes most suitable for label transfer, and FAST!
- Not a global visual-similarity-based retrieval: we want part-aware label transfer but cannot reliably segment
- Characteristics of the data to be matched: possibly complex topology (lots of holes), just a contour
- Classical descriptors, e.g., shape context, interior distance shape context (IDSC), GIST, Zernike moments, Fourier descriptors, etc., do not quite fulfill our needs
22/40

- All upright oriented: to be exploited
- Takes advantage of the upright orientation
23/40

- Cluster scan-lines into a smaller number of slabs, for efficiency!
- Hierarchical clustering by a distance between adjacent slabs
- Classical choice for the distance: symmetric Hausdorff (SH)
- But SH is not sensitive to topology changes; not part-aware
24/40

- Example (figure): SH(C, B) = 2 and SH(C^c, B^c) = 2, while SH(A, B) = 2 and SH(A^c, B^c) = 10
- SH on only one class (foreground or background) may not be topology-sensitive
- A bi-class SH distance is!
25/40

- BiSH takes the larger of the two per-class distances: BiSH(C, B) = max{SH(C, B), SH(C^c, B^c)} = 2, while BiSH(A, B) = max{SH(A, B), SH(A^c, B^c)} = 10
- (a small code sketch of BiSH is included at the end of this deck)
26/40

- BiSH is more part-aware: new slabs appear near part boundaries (comparison figure: BiSH vs. SH)
27/40

- Slabs are scaled/warped vertically for better alignment: another measure to encourage part-aware label transfer
- Warp: slabs of the labeled image are warped to better align with the slabs of the projected image
- Recolor: slabs are recolored, so many-to-one slab matching is possible
28/40

- Dissimilarity between slabs: BiSH scaled by slab height
- Slab matching allows a linear warp, optimized by a dynamic time warping (DTW) algorithm
- Dissimilarity between images: sum of slab dissimilarities after warped slab matching
29/40

Outline:
- PSA for 3D shape segmentation
- Region-based binary shape matching
- Results and conclusion
30/40

- Same inputs, training data (we project), and experimental setting as [Kalogerakis et al. 2010]
- Models in [Kalogerakis et al. 2010]: manifold, complete, no self-intersections
- PSA allows us to handle any category and imperfect shapes
31/40

- 11 object categories; about 2,600 labeled images
- All input 3D shapes tested have self-intersections as well as other data artifacts
32/40

- Results: Pavilion (465 pieces), Bicycle (704 pieces)
33/40

34/40

- Matching two images (512 x 512) takes 0.06 seconds
- Label transfer (2D-to-2D, then to 3D): about 1 minute for a 20K-triangle mesh
- Number of selected projections: 5 to 10
- Number of retrieved images per projection: 2
35/40

- Projective shape analysis (PSA): semantic 3D segmentation by learning from labeled 2D images
- Demonstrated potential in labeling 3D models that are imperfect, have complex topology, and come from any category
36/40

- Utilizes the rich availability and ease of processing of photos for 3D shape analysis
- No strong requirements on the quality of the 3D model
37/40

- Inherent limitation of 2D projections: they do not fully capture 3D information
- Inherent to data-driven methods: the knowledge has to be in the data
- Relies on spatial, not feature-space, analysis
- Assumes upright orientation; not designed for articulated shapes
38/40

- Labeling 2D images is still tedious: unsupervised projective analysis
- Additional cues from images and projections, e.g., color, depth, etc.
- Apply PSA to other knowledge-driven analyses
39/40

- More results and data can be found at http://web.siat.ac.cn/~yunhai/psa.html
40/40
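The BiSH code sketch promised above: a minimal illustration that treats a scan-line (or slab row) as a 1D binary array, computes the symmetric Hausdorff distance separately on the foreground and on the background (the complement), and keeps the larger of the two, consistent with the 2-versus-10 example on the BiSH slides. Function names and the toy data are illustrative, not from the authors' code; the slab clustering and DTW matching built on top of this distance are not shown.

```python
import numpy as np

def symmetric_hausdorff_1d(a, b):
    """Symmetric Hausdorff distance between two 1D point sets given as
    index arrays (positions of 'on' pixels along a scan-line)."""
    if len(a) == 0 or len(b) == 0:
        return np.inf  # one set is empty: treat the distance as infinite
    # Directed Hausdorff in each direction, then take the larger one.
    d_ab = np.abs(a[:, None] - b[None, :]).min(axis=1).max()
    d_ba = np.abs(b[:, None] - a[None, :]).min(axis=1).max()
    return max(d_ab, d_ba)

def bish_1d(row_a, row_b):
    """Bi-class symmetric Hausdorff (BiSH) between two binary rows:
    the larger of the SH distance on the foreground pixels and the
    SH distance on the background (complement) pixels."""
    row_a = np.asarray(row_a, dtype=bool)
    row_b = np.asarray(row_b, dtype=bool)
    fg = symmetric_hausdorff_1d(np.flatnonzero(row_a), np.flatnonzero(row_b))
    bg = symmetric_hausdorff_1d(np.flatnonzero(~row_a), np.flatnonzero(~row_b))
    return max(fg, bg)

# Toy example: two rows with the same foreground extent, but one has a hole.
a = [0, 1, 1, 1, 1, 1, 1, 1, 1, 0]
b = [0, 1, 1, 1, 0, 0, 1, 1, 1, 0]  # same outline, hole in the middle
print(bish_1d(a, b))
```

On this toy example the foreground SH is only 1 while the background SH is 4, so BiSH reports 4: the hole is visible to the bi-class distance but nearly invisible to the single-class one, which is exactly the topology sensitivity the slides argue for.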