On the Efficiency of Image Metrics for Evaluating the Visual Quality of 3D Models
Guillaume Lavoué, Université de Lyon, LIRIS (UMR 5205)
Mohamed Chaker Larabi, Université de Poitiers, XLIM-SIC
Libor Vasa, University of West Bohemia

An illustration
[Figure: the original model and five distorted versions — watermarking (Wang et al. 2011), smoothing (Taubin 2000), watermarking (Cho et al. 2006), simplification (Lindstrom and Turk 2000) and noise addition — all sharing the same Max Root Mean Square Error (1.05 × 10^-3), yet with very different perceived quality (scores ranging from 0.14 to 0.84).]

Quality metrics for static meshes (our previous works)
o MSDM [Lavoué et al. 2006], MSDM2 [Lavoué 2011], [Torkhani et al. 2012]
[Diagram: the original and distorted models are matched, local curvature statistics are computed on both, local differences of these statistics form a local distortion map, and spatial pooling turns this map into a global distortion score.]

Why not using Image Quality Metrics?
Such an image-based approach has already been used for driving simplification [Lindstrom, Turk 2000] [Qu, Meyer 2008].

Our study
o Determine the best set of parameters for such an image-based quality assessment approach.
o Compare this approach to the best-performing model-based metrics.

Many parameters
Which 2D metric to use? How many views, and which views? How to combine the 2D scores? Which rendering, which lighting?
In our study, we consider:
o 6 image metrics
o 2 rendering algorithms
o 9 lighting conditions
o 5 ways of combining image metric results
o 4 databases to evaluate the results
In total, around 100,000 images.

Image Quality Metrics
Simple:
o PSNR and Root Mean Square Error
State-of-the-art algorithms:
o MSSIM (multi-scale SSIM) [Wang et al. 2003]
o VIF (visual information fidelity) [Sheikh and Bovik 2006]
o IWSSIM (information content weighted SSIM) [Wang and Li 2011]
o FSIM (feature similarity index) [Zhang et al. 2011]

Generation of 2D views and lighting conditions
o 42 cameras placed uniformly around the object (see the placement sketch below).
o Rendering uses a single white directional light source.
o The light is fixed either with respect to the camera or with respect to the object, in 3 positions (front, top, top-right), hence 3 × 2 = 6 lighting conditions.
o We also consider the averages over the object-light conditions, over the camera-light conditions, and over all conditions, for a total of 9 conditions.

Image Rendering Protocols
We consider 2 ways of computing the normals: with or without averaging over the neighborhood (see the normals sketch below).

Pooling algorithms
How to combine the per-image quality scores into a single one? The Minkowski norm is popular:
$Q = \left(\frac{1}{N}\sum_{i=1}^{N} q_i^{\,p}\right)^{1/p}$
where $q_i$ is the score of the i-th view. We also consider image importance weights [Secord et al. 2011], based on a perceptual model of viewpoint preference and on surface visibility. (A pooling sketch follows.)
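The study places 42 cameras uniformly around the object; since the exact layout is not given here, the sketch below uses a Fibonacci sphere, a standard way to distribute points near-uniformly on a sphere. This is an assumption for illustration (the name fibonacci_sphere is ours), not necessarily the layout used in the paper.

```python
import numpy as np

def fibonacci_sphere(n=42, radius=1.0):
    """Return n points distributed near-uniformly on a sphere,
    usable as camera positions looking at the object's center."""
    i = np.arange(n)
    golden_angle = np.pi * (3.0 - np.sqrt(5.0))
    y = 1.0 - 2.0 * (i + 0.5) / n          # heights in (-1, 1)
    r = np.sqrt(1.0 - y * y)               # ring radius at each height
    theta = golden_angle * i               # spiral around the vertical axis
    pts = np.stack([r * np.cos(theta), y, r * np.sin(theta)], axis=1)
    return radius * pts

cameras = fibonacci_sphere(42)
print(cameras.shape)  # (42, 3)
```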
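The two normal-computation protocols can be sketched as follows: per-face normals (no averaging, i.e. flat shading) versus per-vertex normals obtained by averaging the normals of the neighboring faces. A minimal NumPy sketch, assuming a triangle mesh stored as vertex and face index arrays; it is not the paper's implementation.

```python
import numpy as np

def face_normals(V, F):
    """Per-face unit normals (no averaging: flat shading)."""
    n = np.cross(V[F[:, 1]] - V[F[:, 0]], V[F[:, 2]] - V[F[:, 0]])
    return n / np.linalg.norm(n, axis=1, keepdims=True)

def vertex_normals(V, F):
    """Per-vertex normals averaged over the neighboring faces
    (area-weighted, since the raw cross product scales with face area)."""
    fn = np.cross(V[F[:, 1]] - V[F[:, 0]], V[F[:, 2]] - V[F[:, 0]])
    vn = np.zeros_like(V)
    for corner in range(3):                # accumulate each face normal
        np.add.at(vn, F[:, corner], fn)    # onto its three vertices
    return vn / np.linalg.norm(vn, axis=1, keepdims=True)
```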
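To make the pooling step concrete, here is a minimal sketch of Minkowski pooling over per-view scores, with optional importance weights. The function name and the weighted variant are illustrative assumptions, not the paper's code.

```python
import numpy as np

def minkowski_pool(scores, p=2.0, weights=None):
    """Combine per-view quality scores q_i into a single score.

    scores  : 1-D array of per-image scores (one per rendered view)
    p       : Minkowski exponent; p = 1 gives the (weighted) mean
    weights : optional per-view importance weights, e.g. from a
              viewpoint-preference model [Secord et al. 2011]
    """
    q = np.asarray(scores, dtype=float)
    w = np.ones_like(q) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()                        # normalise the weights
    return np.sum(w * q ** p) ** (1.0 / p)

# Example: pool 42 per-view scores with p = 1 (mean) and p = 4
rng = np.random.default_rng(1)
scores = rng.random(42)
print(minkowski_pool(scores, p=1.0))
print(minkowski_pool(scores, p=4.0))
```

With uniform weights this reduces to the formula above; increasing p gives more influence to the views with the largest scores.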
The MOS databases
o The LIRIS/EPFL General-Purpose Database: 88 models (from 40K to 50K vertices) built from 4 reference objects; non-uniform noise addition and smoothing.
o The LIRIS Masking Database: 26 models (from 9K to 40K vertices) built from 4 reference objects; noise addition on smooth or rough regions.
o The IEETA Simplification Database: 30 models (from 2K to 25K vertices) built from 5 reference objects; three simplification algorithms.
o The UWB Compression Database: 68 models built from 5 reference objects; different kinds of artefacts from compression.

Results and analysis
We basically have a full factorial experiment, a design heavily used in statistics to study the effect of different factors on a response variable. We consider 4 factors:
o the metric (6 possible values)
o the lighting (9 possible values)
o the pooling (5 possible values)
o the rendering (2 possible values)
This gives 6 × 9 × 5 × 2 = 540 possible combinations. We consider two response variables:
o the Spearman correlation over all the objects
o the Spearman correlation averaged per object

Results and analysis (continued)
For a given factor with n possible values, we have n sets of paired Spearman coefficients. To estimate the effect of a factor on the objective metric performance, we conduct pairwise comparisons of each of its values against the others (i.e. n(n-1)/2 comparisons). Since the values are paired, we can do better than a simple comparison of the means: we run a statistical significance test (the Wilcoxon signed-rank test rather than Student's t-test, since normality of the paired differences cannot be assumed), and we study the median of the paired differences as well as the 25th and 75th percentiles (see the comparison sketch at the end).

Influence of the metrics
o IWSSIM provides the best results.
o FSIM and MSSIM are second best, and significantly better than MSE and PSNR.
o VIF provides unstable results (see the percentiles).

Influence of the lighting
o Indirect illumination provides better results.
o The light has to be linked to the camera.
o Object-front is not so bad, but its performance is not stable.

Influence of the pooling
o Low values of p are better.
o Weights do not bring significant improvements.

Comparisons with 3D metrics
o For easy scenarios, 2D metrics are excellent.
o However, when the task becomes more difficult, 3D metrics are better.
o Still, simple image-based metrics are better than simple geometric ones.
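As a closing sketch of the statistical procedure described above: for two values of a factor (e.g. two image metrics), we gather their paired Spearman coefficients over all combinations of the remaining factors, run the Wilcoxon signed-rank test, and report the median and the 25th/75th percentiles of the paired differences. This is an illustrative reconstruction, not the authors' code; the data below are synthetic.

```python
import numpy as np
from scipy.stats import wilcoxon

def compare_factor_values(coeffs_a, coeffs_b):
    """Compare two values of a factor (e.g. two image metrics).

    coeffs_a, coeffs_b : paired Spearman coefficients obtained for the
    same combinations of the remaining factors (e.g. 9 lighting x
    5 pooling x 2 rendering = 90 pairs when comparing two metrics).
    """
    a = np.asarray(coeffs_a, dtype=float)
    b = np.asarray(coeffs_b, dtype=float)
    diff = a - b
    # Wilcoxon signed-rank test: unlike Student's paired t-test, it does
    # not assume the paired differences are normally distributed
    stat, p_value = wilcoxon(a, b)
    return {
        "median_diff": np.median(diff),
        "p25": np.percentile(diff, 25),
        "p75": np.percentile(diff, 75),
        "p_value": p_value,
    }

# Example with two synthetic sets of 90 paired coefficients
rng = np.random.default_rng(0)
iwssim = 0.85 + 0.05 * rng.standard_normal(90)
psnr = 0.70 + 0.05 * rng.standard_normal(90)
print(compare_factor_values(iwssim, psnr))
```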