Particle Systems GPU Graphics Sample Particle System Water Fire and Smoke What is a particle system? One of the original uses was in the movie Star Trek II William Reeves (implementor) Each Particle Had: • Position • Velocity • Color • Lifetime • Age • Shape • Size • Transparency Particle system from Star Trek II: Wrath of Khan Types of Particle Systems Stateless Particle System – A particle data is computed from birth to death by a closed form function defined by a set of start values and a current time. (does not react to dynamic environment) State Preserving Particle System – Uses numerical iterative integration methods to compute particle data from previous values and changing environmental descriptions. Stateless Particle Systems on GPU Stateless Simulation – Computed particle data by closed form functions No reaction on dynamically changing environment! No Storage of varying data (sim. in vertex prog) Particle Birth Rendering Upload time of birth & initial Values to dynamic vertex buffer Set global function parameter as vertex program constants Render point sprites/triangles/quads With particle system vertex program Particle Life Cycle Generation – Particles are generated randomly within a predetermined location Particle Dynamics – The attributes of a particle may vary over time. Based upon equations depending on attribute. Extinction Age – Time the particle has been alive Lifetime – Maximum amount of time the particle can live. Premature Extinction: Running out of bounds Hitting an object (ground) Attribute reaches a threshold (particle becomes transparent) Rendering Rendered as a graphics primitive (blob) Particles that map to the same pixels are additive Sum the colors together No hidden surface removal Motion blur is rendered by streaking based on the particles position and velocity Issues with Particle Systems on the CPU CPU based particle systems are limited to about 10,000 particles per frame in most game applications Limiting Factor is CPU-GPU communication PCI Express transfer rate is about 3 gigabytes/sec State Preserving Algorithm: Rendering Passes: Process Birth and Deaths Update Velocities Update Positions Sort Particles (optional, takes multiple passes) Transfer particle positions from pixel to vertex memory Render particles Data Storage: Two Textures (position and velocity) Use texture pair and double buffering to compute new data from previous data Total number of textures needed ? Storage Types Each holds an x,y,z component Conceptually a 1d array Stored in a 2d texture (why) Velocity can be stored using 16bit floats Size, Color, Opacity, etc. Simple attributes, can be added later, usually computed using the stateless method Birth and Death Birth = allocation of a particle Associate new data with an available index in the attributes textures Serial process – offloaded to CPU Initial particle data determined on CPU also Death = deallocation of a particle Must be processed on CPU and GPU CPU – frees the index associated with particle GPU – extra pass to move any dead particles to unseen areas (i.e. infinity, or behind the camera) In practice particles fade out or fall out of view (Clean-up rarely needs to be done) Allocation on CPU Stack Heap 5 2 30 10 15 26 11 19 12 6 33 2 Easier! 10 5 11 15 6 12 26 19 33 Optimize heap to always return smallest available index Why? Better! 30 Update Velocities Velocity Operations Global Forces Local Forces Wind Gravity Attraction Repulsion Velocity Damping Collision Detection F = Σ f0 ... fn F = ma a = F/m If m = 1 F=a Local Force – Flow Field Stokes Law of drag force on a sphere Fd = 6Πηr(v-vfl) η = viscosity r = radius of sphere C = 6Πηr (constant) v = particle velocity vfl = flow velocity Sample Flow Field Other velocity operations Damping Imitates viscous materials or air resistance Implement by downward scaling velocity Un-damping Self-propelled objects (bee swarms) Implement by upward scaling velocity Collisions Collisions against simple objects Walls Bounding Spheres Collision against complex objects Terrain Complex objects Terrain is usually modeled as a texture-based height field Collision Reaction Vn vn = (vbc * n)vbc vt = vbc – vn V Vt vbc = velocity before collision vn = normal component of velocity vt = tangental component of velocity V = (1-μ)vt – εvn μ = dynamic friction (affects tangent velocity) ε = resilience (affects normal velocity) Update Positions Euler Integration p = pprev + v * Δt For simple velocity updates Verlet pi+1 = pi + (pi– pi-1) + a * Δt2 Sorting for Alpha Blending Odd-even merge sort Good for GPU because it always runs in constant time for a given data size Guarantees that with each iteration sortedness never decreases Inefficient to check whether sort is done on GPU Full sort can be distributed over 20-50 frames so it doesn’t slow down system too much (acceptable visual errors) Sorting will be covered in lecture 8 Questions?