Hardware-Accelerated Adaptive EWA Volume Splatting
Wei Chen (ZJU), Liu Ren (CMU), Matthias Zwicker (MIT), Hanspeter Pfister (MERL)

Volume Splatting
• Object-order method
• 3D reconstruction kernel centered at each voxel (elliptical Gaussian)
• Voxel contribution = 2D footprint (color, opacity)
• Weighted footprints accumulated into image
• 2D footprints = splats
[Figure: voxel kernels projected onto the screen.]

Related Work
[Figure: chart of splatting methods by quality and speed. Labels: Westover 1989, Crawfis 1993, EWA, Swan 1997, Mueller 1999, image-aligned, Huang 2000, our work, Zwicker 2001, Swan, Xue 2003, axis-aligned, software, texture splats, fast splats, OpenGL ex.]

Outline
• EWA volume splatting
• Adaptive EWA splatting
• GPU implementation
• Results and conclusions

EWA Volume Splatting
• Compensates for aliasing artifacts due to perspective projection
• EWA filter = low-pass filter ⊗ warped reconstruction filter
[Figure: the reconstruction kernel of a voxel in the volume is projected to the screen and convolved with the low-pass filter, giving the EWA volume resampling filter.]

EWA Volume Splatting (512x512x3)
• Reconstruction filter only: 6.25 fps
• Low-pass filter only: 6.14 fps
• EWA filter: 4.97 fps
• EWA filter: 3.79 fps

Analysis of EWA Filter
[Figure: warped reconstruction kernel, low-pass filter, and resulting resampling filter, shown for minification and for magnification.]

Analysis of EWA Filter
• The shape of the EWA splat depends on the distance from the view plane
• The EWA splat is an ellipse with radii r_0 and r_1:

$$ r_0 = \sqrt{\frac{r_k^2}{u_2^2} + r_h^2}, \qquad r_1 = \sqrt{\frac{r_k^2 \, (1 + x_0^2 + x_1^2)}{u_2^2} + r_h^2} $$

  • r_h: low-pass filter radius
  • r_k: reconstruction filter radius
  • u_2: distance to the view plane
• Note that $1 + x_0^2 + x_1^2 \in (1.0,\ 1.0 + \text{Constant})$

Adaptive EWA Filtering
[Figure: warped reconstruction kernel, low-pass filter, resampling filter.]
• If u_2 > B: use the low-pass filter
• If A < u_2 < B: use the EWA filter
• If u_2 < A: use the reconstruction filter

Patch Processing
• Process an 8 x 8 patch of voxels at a time
• Filter selection based on the four corners of each patch (choose smallest)
[Figure: traversal order and patch distance.]

Adaptive EWA Volume Splatting (512x512x3)
• Adaptive EWA filter: 6.88 fps
• Adaptive EWA filter: 1.84 fps
• EWA filter: 4.83 fps
• EWA filter: 1.75 fps

Outline
• EWA volume splatting
• Adaptive EWA splatting
• GPU implementation
• Results and conclusions

Object-Space EWA Splatting
• Object-space EWA splatting with texture mapping [Ren et al. Eurographics 2002]
[Figure: a unit quad with corners (0,0), (1,0), (0,1), (1,1) carries a unit Gaussian texture; projection turns it into a textured quad that forms the EWA splat (elliptical Gaussian).]

Proxy Geometry Template
• Rectilinear volumes: use one proxy geometry template for all slices in each direction
• Store vertex indices in AGP memory
[Figure: regularity of the voxel geometry, proxy geometry template, quad geometry.]

Vertex Compression
• Compress each vertex to 32 bits
• Decompression on the fly in programmable hardware
• To store the vertex information of a 256x256x256 volume in video memory:
  • Without compression: 2,048 MBytes
  • With compression: 12 MBytes
• Retained-mode hardware acceleration becomes feasible

Retained vs. Immediate Mode

Data      #Total Splats   #Rendered Splats   Immediate Mode   Retained Mode
Bonsai    8388608         274866             0.53 fps         7.53 fps
Engine    7208960         247577             1.40 fps         10.28 fps
Lobster   5461344         555976             1.19 fps         10.60 fps
Head      13631488        2955242            0.12 fps         2.86 fps

Factor of ~10 improvement

Interactive Classification: Opacity Culling
• Hardware-accelerated list-based traversal
• For each slice:
  • For each 32 x 32 patch of voxels (smaller indices)
  • Indices of proxy geometry organized into iso-value lists using bucket sort; CPU merges lists on-line
  • Render only iso-value lists with visible voxels
[Figure: a patch with iso-value lists labeled 0, 128, and 256.]

Interactive Classification: Opacity Culling
Includes changes to the transfer function (TF) every frame

Data      List-based opacity culling   Standard opacity culling
Head      2.80 fps                     0.3 fps
Engine    10.18 fps                    0.8 fps
Bonsai    7.23 fps                     0.8 fps
Lobster   10.30 fps                    1.1 fps

Factor of ~10 improvement

Deferred Shading
• Volume texture access is only possible in fragment programs*
• However, per-fragment shading is expensive
• Solution: deferred shading in two passes
* Newer GPUs allow texture access in vertex programs

Deferred Shading
• Pass one: 3D texture access, classification and illumination in vertex shader, render one pixel per voxel
• Pass two: reuse the pixel data from the first pass to shade the 2D footprint
• Performance gain: 5%-10% speedup
[Figure: pass one, pass two, final result.]

Experiments
• P4 2.4 GHz
• ATI 9800 Pro with 256 MB RAM
• Direct3D 9.0b with VS 2.0 and PS 2.0

Vertex shader instructions:

Type          EWA   Reconstruction Only   Low-pass Only
Regular       70    45                    26
Rectilinear   74    45                    26

Sheet-buffer Composition
• Axis-aligned traversal, addition in sheet buffers, then blending front-to-back
[Figure: three renderings at 0.80 fps, 3.00 fps, and 3.45 fps.]

UNC Head: 208x256x225
• #Rendered splats: 2,955,242
• 2.86 fps
• 8.5M splats / sec

Bonsai: 256x256x128
• #Rendered splats: 274,866
• 7.53 fps
• 2M splats / sec

Engine: 256x256x110
• #Rendered splats: 247,577
• 10.28 fps
• 2.5M splats / sec

Lobster: 301x324x56
• #Rendered splats: 555,976
• 10.60 fps
• 5.9M splats / sec

Video

Our Contributions
• Adaptive EWA computation
• Volume data compression
• Retained-mode hardware acceleration
• Interactive opacity culling
• Deferred two-pass shading

Future Work
• Image-aligned EWA volume splatting
• Irregular volume splatting
• Point sprites in OpenGL
• Floating point textures
• Vertex texture for classification

Acknowledgements
• Jessica Hodgins (CMU)
• Markus Gross (ETH)
http://graphics.cs.cmu.edu/projects/adpewa/index.html
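
The adaptive filter selection from the "Analysis of EWA Filter" and "Adaptive EWA Filtering" slides can be sketched in a few lines of Python. This is a minimal illustration, not the authors' GPU implementation. The function names are hypothetical. The thresholds `a` and `b` are left as parameters because the slides do not say how A and B are chosen. The threshold ordering assumes that the reconstruction filter applies to near voxels (u2 < A) and the low-pass filter to far voxels (u2 > B). The radii used for the two single-filter cases are also an assumption: they are what remains of the slide formulas when the other term is dropped.

```python
import math


def ewa_radii(u2, x0, x1, r_k, r_h):
    """Radii (r0, r1) of the full EWA splat, following the slide formulas.

    u2  : distance of the voxel to the view plane (must be > 0)
    x0, x1 : screen-space position of the voxel
    r_k : reconstruction filter radius
    r_h : low-pass filter radius
    """
    scale = (r_k / u2) ** 2
    r0 = math.sqrt(scale + r_h ** 2)
    r1 = math.sqrt(scale * (1.0 + x0 ** 2 + x1 ** 2) + r_h ** 2)
    return r0, r1


def select_filter(u2, a, b):
    """Pick a filter from the distance u2 and thresholds a < b.

    Assumption: near voxels (magnification) use the reconstruction
    filter, far voxels (minification) use the low-pass filter, and
    everything in between uses the full EWA filter.
    """
    if u2 < a:
        return "reconstruction"
    if u2 > b:
        return "low-pass"
    return "ewa"


def adaptive_radii(u2, x0, x1, r_k, r_h, a, b):
    """Return (filter_name, r0, r1) for one voxel.

    Assumption: the single-filter cases use circular splats,
    r_k / u2 for reconstruction only and r_h for low-pass only.
    """
    name = select_filter(u2, a, b)
    if name == "reconstruction":
        r = r_k / u2
        return name, r, r
    if name == "low-pass":
        return name, r_h, r_h
    r0, r1 = ewa_radii(u2, x0, x1, r_k, r_h)
    return name, r0, r1


if __name__ == "__main__":
    # Hypothetical thresholds and radii, chosen only for illustration.
    for depth in (0.5, 2.0, 10.0):
        print(depth, adaptive_radii(depth, 0.3, 0.2, 2.0, 1.0, 1.0, 4.0))
```

The point of the selection is that only voxels in the middle range pay for the full elliptical computation; the two cheaper cases need a single radius each, which matches the lower vertex shader instruction counts reported for the reconstruction-only and low-pass-only paths.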