Parallel Poisson Disk Sampling
Li-Yi Wei
Microsoft

Parallelism
- Processors are becoming parallel: Intel Larrabee, NVIDIA, AMD/ATI, IBM/Sony Cell, etc.
- So are programming interfaces: BSGP, CUDA, CAL, Ct, DX, OpenGL, etc.
- As well as applications, to take advantage of the parallel environment

Parallelization
- Traditional parallelization methods: sequential consistency [Lamport]; sorting, FFT, matrix, etc.
- Not all algorithms need to be sequentially consistent: graphics, computer vision, image/video, statistics
- Approximate solutions might suffice
- Opportunities for new parallelization methods

First pick: Poisson disk sampling
- A set of samples that are as random as possible, yet remain a minimum distance r away from each other
- Why pick this problem? An important algorithm that is seemingly non-parallelizable

Importance of Poisson disk sampling
- Best quality for N samples [Cook 1986]
- Natural object distribution (retina cells, ecology)
- Blue noise spectrum: void in low frequencies, noise in high frequencies
- Applications in rendering, imaging, geometry processing, etc.

Optimal spectrum (given # samples)
- All with 1600 samples
- Blue noise: aliasing → noise
- [Figure: samples and spectrum for regular grid, jittered grid, Poisson disk]

Spatial sampling
- Test function: $\sin(x^2 + y^2)$ (zone plate)
- [Figure: zone plate sampled with regular grid, jittered grid, Poisson disk; labels: aliasing, noisy]

Methods: dart throwing [Cook 1986]
- Loop:
    random sample from the entire domain
    accept sample if not in conflict with existing ones
- Pros: high quality, ground truth
- Cons: slow speed, inherently sequential

Speed improvement
- Computation on the fly (sequential)
    Scalloped regions [Dunbar & Humphreys 2006]
    Onion layers [Bridson 2007]
    Hierarchical dart throwing [White et al. 2007]
- Pre-computed data set (parallel access)
    Penrose tiling [Ostromoukhov et al. 2004]
    Wang tiles [Cohen et al. 2003; Lagae & Dutre 2005; Kopf et al. 2006]
    Polyominoes [Ostromoukhov 2007]
    Con: potentially large data set, plus quality issues

Features of our approach
- Parallel computation
- Entirely on the fly (no pre-computed data)
- Good spectrum quality, like dart throwing
- Plus: adaptive sampling
- Plus: any dimension

Parallel GPU run time (in slow motion)
- Multi-resolution synthesis

Our basic idea
- Samples from a grid: 1 sample per grid cell
- Sample grid cells far apart in parallel
- Watch out for bias! Tricks to avoid bias

Algorithm in gradual steps
- Uniform sampling, sequential
- Uniform sampling, parallel
- Adaptive sampling

Sequential sampling: basic data structure
- Choose grid cell size d so that each cell has at most one sample: $d = r/\sqrt{n}$
- r = minimum spacing, n = dimension
- Inspired by [Bridson 2007] and texture synthesis

Sequential sampling: scanline order + single resolution
- Bias!
- [Figure: scanline order, grid sampling]

Sequential sampling: random order + single resolution
- Removes scanline bias
- But still grid-cell biased
- [Figure: scanline vs. random]

Sequential sampling: random order + multi-resolution
- Removes both biases
- [Figure: random; scanline; scanline, grid; 1 level, 3 levels, 5 levels]

Sequential sampling: summary for bias removal
- Two sources of bias
- Grid sampling: fixed by multi-resolution
- Traversal order: fixed by random order
- [Figure: random vs. scanline; 1 level, 3 levels, 5 levels]

Parallel sampling: key insight
- Sample cells sufficiently far away in parallel
- 2D example: $d = r/\sqrt{2}$; cells 2d apart cannot conflict with each other
- Split cells → phase groups

Phase group partition
- Grid partition, scanline order: easy to compute, but biased (scanline)
- Random partition: good quality, but hard to compute (sequential)
- Grid partition, random order: easy to compute and good quality
- [Figure: grid cells labeled with their phase group numbers for each of the three partitions]

Parallel sampling: summary
  for each level, low to high
    for each phase group p
      parallel: for each cell in p
        if cell contains no sample
          draw one sample randomly from the cell domain
          add the sample if not conflicting with existing ones

Adaptive sampling
- Slightly more involved than uniform sampling
- Parallelizable as well

Results
- [Figure: our method vs. dart throwing]

Spectrum comparison - 2D
- [Figure: power spectrum (10 runs), radial mean, radial variance]

Sampling in higher dimensions
- Algorithm applicable to 2+ dimensions
- [Figure: 3D samples, power spectrum, radial mean, radial variance]

Performance
# samples per second. O: on the fly, P: pre-computed data set, X: not available.

  Method                                            2D       3D      4D      5D      6D
  O  Our method (NVIDIA 8800 GTX)                   4.06 M   555 K   42.9 K  2.43 K  179
  O  Boundary sampling [Dunbar & Humphreys 2006]    0.20 M   X       X       X       X
  O  Hierarchical dart throwing [White et al. 2007] 0.21 M   X       X       X       X
  P  Wang tiling [Kopf et al. 2006]                 1-3 M    X       X       X       X
  P  Polyominoes [Ostromoukhov 2007]                >1 M     X       X       X       X

[Figure: Wang tiling [Kopf et al. 2006], corner tiling [Lagae & Dutre 2006], P-pentominoes [Ostromoukhov 2007], our method]

Limitations
- Only empirical; no theoretical proof yet
- Slow in high dimensions and for adaptive sampling
- Hard to control the exact # of samples
- No fine-grain sample ranking, e.g. progressive zoom-in [Kopf et al. 2006]
- Euclidean space only (no manifold surface)

Future work for parallel algorithm
- Sequential consistency [Lamport] is too strict for some applications
- A looser sense of consistency?
    parallel texture synthesis [Lefebvre & Hoppe 2005]
    random number generation [Tzeng & Wei 2008]

Acknowledgements
Ares Lagae, Stanley Tzeng, Johannes Kopf, Eric Stollnitz, Victor Ostromoukhov, Brandon Lloyd, Eric Andres, Dwight Daniels, Zhouchen Lin, Jianwei Han, Ting Zhang, Baining Guo, Kun Zhou, Harry Shum, Xin Tong, Jian Sun, Reviewers
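
Supplementary note on cell size and phase-group spacing. The slides choose the grid cell size so that each cell holds at most one sample, and state that in 2D cells 2d apart cannot conflict. A short derivation may make both rules easier to check. It assumes axis-aligned n-dimensional cubic cells of side d and that two samples conflict when they are closer than r; the stride symbol s is introduced here for illustration and does not appear in the slides.

```latex
% Two samples in the same cell are at most one cell diagonal apart,
% so a diagonal no longer than r rules out two valid samples per cell.
\[
  d\sqrt{n} \le r \quad\Longleftrightarrow\quad d \le \frac{r}{\sqrt{n}}
\]
% Cells whose indices differ by s along some axis are separated by a gap of (s-1)d.
% They can be sampled in parallel when that gap is at least r.
\[
  (s-1)\,d \ge r \quad\Longleftrightarrow\quad s \ge \frac{r}{d} + 1
\]
% 2D example with d = r/\sqrt{2}: s \ge \sqrt{2} + 1, so s = 3,
% the gap is 2d = \sqrt{2}\,r > r, and there are 3 \times 3 = 9 phase groups.
```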
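
Supplementary sketch of dart throwing. The loop on the Methods slide (draw a random sample from the entire domain, accept it if it does not conflict with existing samples) can be written in a few lines. This is a minimal illustration, not the reference implementation from [Cook 1986]: the unit-square domain, the function name `dart_throwing`, the `max_attempts` cutoff, and the brute-force conflict test are all assumptions made here.

```python
import math
import random


def dart_throwing(r, max_attempts=10000, seed=0):
    """Sequential dart throwing on the unit square [0, 1)^2 (illustrative sketch)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(max_attempts):
        # Random sample from the entire domain.
        p = (rng.random(), rng.random())
        # Accept only if it is at least r away from every existing sample.
        if all(math.dist(p, q) >= r for q in samples):
            samples.append(p)
    return samples
```

Each accepted sample depends on all earlier ones, which is why the slides describe the method as inherently sequential.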
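
Supplementary sketch of the parallel sampling summary. The level / phase group / cell loop from the slides can be simulated on a CPU to see why cells in one phase group never conflict. This is a hedged, single-threaded approximation rather than the GPU implementation. It assumes a 2D unit square, power-of-two grids per level, one trial per empty cell, and a stride of `ceil(r / cell) + 1` between cells of the same phase group. The name `parallel_poisson_disk` and all helper names are hypothetical. The "parallel" step is imitated by testing every candidate of a phase group against only the samples committed before that group started.

```python
import math
import random


def parallel_poisson_disk(r, seed=0):
    """Simulated phase-group sampling on [0, 1)^2 (illustrative sketch)."""
    rng = random.Random(seed)
    # Finest level: smallest power-of-two grid with cell size <= r / sqrt(2),
    # so each finest cell can hold at most one sample.
    finest = max(0, math.ceil(math.log2(math.sqrt(2) / r)))
    n_fine = 2 ** finest
    reach = math.ceil(r * n_fine)  # finest cells a radius-r disk can span
    grid = {}          # finest cell (i, j) -> committed sample
    occupied = set()   # (level, i, j) for every cell that contains a sample

    def conflicts(p):
        ci, cj = int(p[0] * n_fine), int(p[1] * n_fine)
        for i in range(ci - reach, ci + reach + 1):
            for j in range(cj - reach, cj + reach + 1):
                q = grid.get((i, j))
                if q is not None and math.dist(p, q) < r:
                    return True
        return False

    for level in range(finest + 1):            # for each level, low to high
        n = 2 ** level
        cell = 1.0 / n
        # Same-phase cells are separated by a gap of (stride - 1) * cell >= r.
        stride = math.ceil(r / cell) + 1
        phases = [(a, b) for a in range(stride) for b in range(stride)]
        rng.shuffle(phases)                     # grid partition, random order
        for a, b in phases:                     # for each phase group
            accepted = []
            for i in range(a, n, stride):       # conceptually parallel
                for j in range(b, n, stride):
                    if (level, i, j) in occupied:
                        continue                # cell already contains a sample
                    p = ((i + rng.random()) * cell, (j + rng.random()) * cell)
                    if not conflicts(p):        # checked against earlier phases only
                        accepted.append(p)
            for p in accepted:                  # commit after the whole phase
                grid[(int(p[0] * n_fine), int(p[1] * n_fine))] = p
                for lv in range(finest + 1):
                    occupied.add((lv, int(p[0] * 2 ** lv), int(p[1] * 2 ** lv)))
    return list(grid.values())
```

Because candidates within a phase group are never compared with each other, the minimum spacing of the result relies entirely on the phase-group stride. A call such as `parallel_poisson_disk(0.05)` returns a list of `(x, y)` tuples whose pairwise distances can be checked directly.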