Interactive Deformation and Visualization of Level-Set Surfaces Using Graphics Hardware
Aaron Lefohn, Joe Kniss, Charles Hansen, Ross Whitaker
Scientific Computing and Imaging Institute, University of Utah

Problem Statement
Goal
• Interactive system for manipulating level-set, deformable surfaces
Level-Set Challenges
• Computationally expensive
• Difficult to control
Solution
• New streaming narrow-band algorithm
• Unified computation and visualization

Overview
Motivation and Introduction
A Streaming Narrow-Band Solution
1. Virtual memory model
2. Substreams for static branch resolution
3. Efficient GPU-to-CPU message passing
4. Direct volume rendering of compressed/sparse data
Application and Demo
Conclusions

Level-Set Method
Deformable, implicit surfaces
• Surface deformation via partial differential equation
• General, flexible model
[Figure panels: Segmentation; Surface Processing; Physical Simulation. Sources: Tasdizen et al., IEEE Visualization 2002; Premoze et al., Eurographics 2003]

Level-Set Method
Implicit surface
• Distance transform
• Sign denotes inside/outside
Surface motion
• F = Signed speed in direction of normal

Level-Set Acceleration
Narrow-Band/Sparse-Grid
• Compute PDE only near the isosurface
  – Adalsteinsson et al. 1995
  – Whitaker et al. 1998
  – Peng et al. 1999
• Time-dependent, sparse-grid solver
[Figure labels: Initialize Domain; Compute; Update Domain]

Level-Set Acceleration
Graphics Hardware (GPU) Implementations
• Strzodka et al. 2001
  – 2D level-set solver on NVIDIA GeForce 2
• Lefohn et al. 2002
  – 3D level-set solver on ATI Radeon 8500
  – 1x – 2x faster than CPU, but 10x more computations
  – Unpublished work

Scientific Computing on GPU
GPUs
• Inexpensive, fast, data-parallel, streaming architecture
• Parallel “For-Each” call over data elements
• Combination of computation and visualization
[Figure labels: Texture Data; Vertex & Texture Coordinates; Vertex Processor; Rasterizer; Fragment Processor; Frame/Pixel Buffer(s)]

GPU Computational Capabilities
• 2D computational domain
• Restricted, data-parallel programming model
• Slow GPU-to-CPU communication
• Limited (high-bandwidth) memory on GPU
[Figure labels: CPU; Texture Data; Vertex & Texture Coordinates; Vertex Processor; Rasterizer; Fragment Processor; Frame/Pixel Buffer(s)]

A Streaming Narrow-Band Algorithm
Time-Dependent, Sparse Solver
1. 2D computational domain
  – Multi-dimensional virtual memory model
2. Restricted, data-parallel programming model
  – Substream resolution of fragment-level conditionals
3. Slow GPU-to-CPU communication
  – Efficient message passing algorithm
4. Limited, high-bandwidth memory on GPU
  – Direct volume rendering of level-set solution on GPU

1. Multi-Dimensional Virtual Memory
Virtual Memory
• 3D virtual memory -- Level-set computation
• 2D physical memory -- GPU optimizations
• 16 x 16 pixel memory pages -- Locality / Memory usage
[Figure labels: Virtual Memory Space; Physical Memory Space; Unused Pages; Inside; Outside; Active Pages]

1. Multi-Dimensional Virtual Memory
Cooperation between CPU and GPU
• CPU
  – Memory manager
  – Page table
• GPU
  – Performs level-set computation
  – Issues memory requests
[Figure labels: CPU; GPU; Physical Addresses for Active Memory Pages; PDE Computation, 15-250 passes; Memory Requests]

2. Static Resolution of Conditionals
Problem
• Neighbor lookups across page boundaries
• Branching slow on GPU
Solution
• Substreams
  – Create homogeneous data streams
  – Resolve conditionals with geometry: Points, Lines, Quads
  – Optimizes cache and pre-fetch performance

3. Efficient Message Passing
Problem: Time-Dependent Narrow Band
• GPU memory request mechanism
• Low bandwidth GPU-to-CPU communication
Solution
• Compress GPU memory request
• Use GPU computation to save GPU-to-CPU bandwidth
[Figure labels: Mipmapping; s, +x, -x, +y, -y, +z, -z, f]

4. Direct Volume Rendering of Level Set
Render from 2D physical memory
• Reconstruct 2D slice of virtual memory space
• On-the-fly on GPU
• Use 2D geometry and texture coordinates

4. Direct Volume Rendering of Level Set
Fully general volume rendering of compressed data
• Tri-linear interpolation
• 2D slice-based volume rendering
• Full transfer function and lighting capabilities
• No data duplication

Segmentation Application
Extract feature from volume
Two speed functions, F_D and F_H
• Data-based speed, F_D
  [Plot: F_D(I) versus I (Intensity), with the level F_D = 0 marked]
• Mean-curvature speed, F_H
  – Smooth noisy solutions
  – Prevent “leaks”

Demo
Segmentation of MRI volumes
• 128^3 scalar volume
Details
• ATI Radeon 9800 Pro
• ARB_fragment_program
• ARB_vertex_program
• 2.6 GHz Intel Xeon with 1 GB RAM

Region-of-Interest Volume Rendering
Limit extent of volume rendering
• Use level-set segmentation to specify region
• Add level-set value to transfer function

GPU Narrow-Band: Performance
Performance
• 10x – 15x faster than optimized CPU version
• Linear dependence on size of narrow band
Bottlenecks
• Fragment processor
• Conservative time step
  – Need for global accumulation register (min, max, sum, etc.)

Summary
Interactive 3D Level-Set Computation/Visualization
• Integrated segmentation and volume rendering
• Intuitive parameter setting
• Quantified effectiveness, user study (MICCAI 2003)
Streaming Narrow-Band Solution
1. Virtual memory model
2. Substreams for static branch resolution
3. Efficient GPU-to-CPU message passing
4. Direct volume rendering of compressed/sparse data

Future Directions
Other level-set applications
User interface
Depth culling within active pages
• Sherbondy et al. talk at 3:15pm today
• “Fast Volume Segmentation With Simultaneous Visualization Using Programmable Graphics Hardware”
N-D GPU virtual memory system
• Separate memory layout from computation

Acknowledgements
Gordon Kindlmann -- “Teem” raster-data toolkit
Milan Ikits -- “Glew” OpenGL extension wrangler
SCI faculty, students, and staff
John Owens at UC Davis
Evan Hart, Mark Segal, Arcot Preetham, Jeff Royle, and Jason Mitchell at ATI Technologies, Inc.
Brigham and Women’s Hospital
CIVM at Duke University
Office of Naval Research grant #N000140110033
National Science Foundation grant #ACI008915 and #CCR0092065
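
Supplement: Virtual Memory Address Translation (sketch)
The virtual memory model (step 1) pairs a 3D virtual address space with a 2D physical one, using 16 x 16 pixel pages and a CPU-side page table. The sketch below illustrates one way such a translation could work. It is not the authors' implementation: the class name PageTable, its methods, the first-free allocation policy, and the assumption that each page is a 16 x 16 tile of a single z-slice are all hypothetical choices made for illustration.

```python
PAGE = 16  # page edge length in pixels (the 16 x 16 page size from the slides)


class PageTable:
    """Hypothetical CPU-side page table: 3D virtual pages -> 2D physical slots."""

    def __init__(self, phys_pages_x, phys_pages_y):
        # Every slot of the 2D physical texture starts out unused.
        self.free = [(px, py)
                     for py in range(phys_pages_y)
                     for px in range(phys_pages_x)]
        self.table = {}  # (virtual page x, virtual page y, slice z) -> (px, py)

    def activate(self, vpage):
        """Map a virtual page to the first free physical slot (assumed policy).

        Raises IndexError if physical memory is exhausted.
        """
        if vpage not in self.table:
            self.table[vpage] = self.free.pop(0)
        return self.table[vpage]

    def release(self, vpage):
        """Return a page's physical slot to the free list."""
        self.free.append(self.table.pop(vpage))

    def translate(self, x, y, z):
        """Translate a 3D voxel address to a 2D physical pixel address.

        Returns None for voxels on inactive pages (outside the narrow band).
        """
        slot = self.table.get((x // PAGE, y // PAGE, z))
        if slot is None:
            return None
        px, py = slot
        return (px * PAGE + x % PAGE, py * PAGE + y % PAGE)
```

Usage: after pt = PageTable(4, 4) and pt.activate((1, 2, 5)), the call pt.translate(17, 35, 5) resolves to a pixel inside the first physical page, while voxels on pages that were never activated translate to None and cost no physical memory.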
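
Supplement: Compressed Memory Requests (sketch)
The message-passing step (step 3) compresses the GPU's memory request so that little data crosses the slow GPU-to-CPU path. The slides name mipmapping and the six directions +x, -x, +y, -y, +z, -z but do not spell out the encoding, so the sketch below makes its own assumptions: per-pixel flags are reduced to one code per page with a bitwise OR (a stand-in for the GPU-side reduction), bit 6 is a hypothetical "page still active" flag, and the function names are invented for illustration.

```python
# Neighbor-page offsets, one bit each, in the order +x, -x, +y, -y, +z, -z.
OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
SELF_BIT = 1 << 6  # hypothetical flag: this page still contains active pixels


def encode_page(pixel_flags):
    """Reduce the per-pixel flags of one page to a single request code.

    Stand-in for the GPU reduction: a bit is set in the result if any
    pixel in the page set it.
    """
    code = 0
    for flags in pixel_flags:
        code |= flags
    return code


def decode_request(page, code):
    """CPU side: turn one page's code into memory-manager actions.

    Returns (keep_page, neighbor pages that must be activated).
    """
    neighbors = [tuple(p + o for p, o in zip(page, offset))
                 for bit, offset in enumerate(OFFSETS)
                 if code & (1 << bit)]
    return bool(code & SELF_BIT), neighbors
```

Under these assumptions the CPU reads back one small code per active page rather than all 256 pixels of that page, which is the kind of bandwidth saving the slide describes.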