Introduction to Geometry Shaders
Patrick Cozzi
Analytical Graphics, Inc.

Overview
  • Geometry Shaders in the Pipeline
  • Primitive Types
  • Applications
  • Performance

Bird's-Eye View
  • Create or destroy primitives on the GPU
  • Requires DirectX 10 or GL_ARB_geometry_shader4
  [Diagram: Geometry Shader]

Geometry Shaders in the Pipeline
  [Diagram, pipeline without a geometry shader: Vertices in world coordinates -> Vertex Shader -> (clip coordinates) -> Perspective Divide and Viewport Transformation -> (window coordinates) -> Fragment Shader]

Geometry Shaders in the Pipeline
  [Diagram, pipeline with a geometry shader: Vertex Shader -> (clip coordinates) -> Primitive Assembly -> Geometry Shader -> (clip coordinates) -> Clipping -> Perspective Divide (PD) and Viewport Transformation (VT) -> (window coordinates) -> Fragment Shader]

Primitive Types
  [Diagram: Geometry Shader]
  • Output primitives can be disconnected

Primitive Types
  Input Primitives
    • GL_POINTS
    • GL_LINES
    • GL_TRIANGLES
    • Adjacency
  Output Primitives
    • GL_POINTS
    • GL_LINE_STRIP
    • GL_TRIANGLE_STRIP

Primitive Types
  • Input primitive type doesn’t have to equal output primitive type
  blogs.agi.com/insight3d/index.php/2008/10/23/geometry-shader-for-debugging-normals/

Applications
  • Implement glPolygonMode: Triangles -> Points or Line Strips
  • Emulate GL_ARB_point_sprite: Points -> Triangle Strips

Applications
  • Displacement Mapping
  • Generate a cube map in a single pass
  • Extrusions
    • Shadow volumes
    • Fins along silhouettes for fur rendering

Applications: Fur in Lost Planet
  • Render surface, write buffers for
    • Fur Color
    • Angle
    • Length
  • GS turns each pixel into a translucent polyline
  • Automatic LOD

Applications: Fur in Lost Planet
  [Images: color, angle, length]
  Images from meshula.net/wordpress/?p=124

Performance
  • Duplicates per-vertex operations for vertices shared by primitives
  [Diagram: Vertex Shader, 5 vertices processed; Geometry Shader, 9 vertices processed]

Performance
  • Must guarantee order in == order out
  [Diagram: Geometry Shader]

Performance
  • Order guarantee affects parallelism
  [Diagram: several Geometry Shader units -> Reorder Buffer -> Clipping]

Performance
  • Buffer size needs to support a number of threads running in parallel
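The 5-versus-9 vertex counts on the first Performance slide can be reproduced with a little arithmetic. The sketch below assumes the figure shows a five-vertex triangle strip and that the vertex shader runs once per strip vertex; both are assumptions, since the figure itself is not reproduced here. The function names are hypothetical and exist only for illustration.

```python
def vertex_shader_vertices(strip_vertex_count: int) -> int:
    """Vertices processed by the vertex shader.

    Assumes ideal reuse: each vertex of the strip is shaded exactly once.
    """
    return strip_vertex_count


def triangles_in_strip(strip_vertex_count: int) -> int:
    """A triangle strip with n >= 3 vertices forms n - 2 triangles."""
    return max(strip_vertex_count - 2, 0)


def geometry_shader_vertices(strip_vertex_count: int) -> int:
    """Vertices processed by a geometry shader taking triangles as input.

    The geometry shader runs once per triangle and receives all three
    vertices each time, so vertices shared between triangles are
    processed more than once.
    """
    return 3 * triangles_in_strip(strip_vertex_count)


if __name__ == "__main__":
    n = 5  # assumed strip length matching the slide's numbers
    print(vertex_shader_vertices(n), geometry_shader_vertices(n))
```

Under these assumptions, any per-vertex work moved from the vertex shader into the geometry shader is repeated for every triangle that shares the vertex, which is the duplication the slide warns about.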
Performance
  • Maximum number of vertices a GS will output: GEOMETRY_VERTICES_OUT_ARB
  • Determines the speed of GS execution
  • Make this and vertex sizes as small as possible

Performance
  GeForce 8, 9, and GTX2xx
  • Output size = vertex size * GEOMETRY_VERTICES_OUT_ARB
  • Maximum output size: 1,024 scalars
  • Performance is inversely proportional to output size
  • Not a continuous function:
    • 1-20 scalars: Peak Performance
    • 27-40 scalars: 50% Performance
    • On GeForce 8800 GTX

Performance
  Benefits
  • Reduces vertex buffer memory usage
    • Compute in GS, e.g. normals
    • Create more geometry
    • No need to duplicate (e.g. compared to equivalent VS implementation)
  • Less memory == less bus traffic
  • Reduces vertex attribute setup cost

Summary
  • Modify incoming primitive or make a limited number of copies
  • Not for
    • Large scale amplification
    • Instancing

Resources
  • Introduction to Direct3D 10, SIGGRAPH 2007 Course Notes
    www.microsoft.com/downloads/details.aspx?FamilyId=96CD28D5-4C15-475E-A2DC-1D37F67FA6CD&displaylang=en
  • GL_ARB_geometry_shader4
    www.opengl.org/registry/specs/ARB/geometry_shader4.txt
  • Section 3.5
    www.realtimerendering.com
  • Section 4.6
    developer.nvidia.com/object/gpu_programming_guide.html
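The output-size rule quoted on the GeForce 8, 9, and GTX2xx Performance slide (output size = vertex size * GEOMETRY_VERTICES_OUT_ARB, at most 1,024 scalars) is easy to check before writing a shader. The following is a minimal sketch; the helper names and the example vertex layout (a vec4 position plus a vec4 color, 8 scalars) are assumptions chosen for illustration, not values from the slides.

```python
MAX_OUTPUT_SCALARS = 1024  # limit quoted in the slides for GeForce 8, 9, and GTX2xx


def gs_output_scalars(scalars_per_vertex: int, max_vertices_out: int) -> int:
    """Output size = vertex size * GEOMETRY_VERTICES_OUT_ARB, in scalars."""
    return scalars_per_vertex * max_vertices_out


def fits_output_limit(scalars_per_vertex: int, max_vertices_out: int) -> bool:
    """True if the declared output stays within the quoted 1,024-scalar limit."""
    return gs_output_scalars(scalars_per_vertex, max_vertices_out) <= MAX_OUTPUT_SCALARS


def largest_vertices_out(scalars_per_vertex: int) -> int:
    """Largest GEOMETRY_VERTICES_OUT_ARB that still fits the limit."""
    return MAX_OUTPUT_SCALARS // scalars_per_vertex


if __name__ == "__main__":
    vertex_size = 8  # assumed layout: vec4 position + vec4 color
    for vertices_out in (2, 4, 128, 129):
        print(vertices_out,
              gs_output_scalars(vertex_size, vertices_out),
              fits_output_limit(vertex_size, vertices_out))
```

With the assumed 8-scalar vertex, declaring 2 output vertices gives 16 scalars, inside the 1-20 range the slides associate with peak performance on a GeForce 8800 GTX, while 4 output vertices gives 32 scalars, inside the 27-40 range listed at 50%.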