Programmable Graphics Hardware
David Luebke 1 7/27/2016

Recap: How are current GPUs different from CPUs?
1. GPU is a stream processor
    ■ Multiple programmable processing units
    ■ Connected by data flows
[Figure: pipeline diagram with stages Application, Vertex Processor, Assembly & Rasterization, Fragment Processor, Framebuffer Operations, and Framebuffer, plus Textures]

Cg uses separate vertex and fragment programs
[Figure: the same pipeline diagram, with two Program labels, one each for the vertex and fragment programs]

Cg programs have two kinds of inputs
● Varying inputs (streaming data)
    ■ e.g. normal vector – comes with each vertex
    ■ This is the default kind of input
● Uniform inputs (a.k.a. graphics state)
    ■ e.g. modelview matrix
● Note: Outputs are always varying

    vout MyVertexProgram(float4 normal,
                         uniform float4x4 modelview)
    {
    …

Two ways to bind VP outputs to FP inputs
a) Let compiler do it
    ■ Define a single structure
    ■ Use it for vertex-program output
    ■ Use it for fragment-program input

    struct vout {
        float4 color;
        float4 texcoord;
        …
    };

Two ways to bind VP outputs to FP inputs
b) Do it yourself
    ■ Specify register bindings for VP outputs
    ■ Specify register bindings for FP inputs
    ■ May introduce HW dependence
    ■ Necessary for mixing Cg with assembly

    struct vout {
        float4 color : TEX3;
        float4 texcoord : TEX5;
        …
    };

Some inputs and outputs are special
● e.g. the position output from vert prog
    ■ This output drives the rasterizer
    ■ It must be marked

    struct vout {
        float4 color;
        float4 texcoord;
        float4 position : HPOS;
    };

How are current GPUs different from CPUs?
2.
Greater variation in basic capabilities
    ■ Most processors don’t yet support branching
    ■ Vertex processors don’t support texture mapping
    ■ Some processors support additional data types
• Compiler can’t hide these differences
• Least-common-denominator is too restrictive
• We expose differences via language profiles (list of capabilities and data types)
• Over time, profiles will converge

How are current GPUs different from CPUs?
3. Optimized for 4-vector arithmetic
    ■ Useful for graphics – colors, vectors, texcoords
    ■ Easy way to get high performance/cost
• C philosophy says: expose these HW data types
• Cg has vector data types and operations, e.g. float2, float3, float4
• Makes it obvious how to get high performance
• Cg also has matrix data types, e.g. float3x3, float3x4, float4x4

Some vector operations

    //
    // Clamp components of 3-vector to [minval,maxval] range
    //
    float3 clamp(float3 a, float minval, float maxval) {
        a = (a < minval.xxx) ? minval.xxx : a;
        a = (a > maxval.xxx) ? maxval.xxx : a;
        return a;
    }

● ? : is per-component for vectors
● Swizzle – replicate and/or rearrange components
● Comparisons between vectors are per-component, and produce a vector result

Cg has arrays too
● Declared just as in C
● But, arrays are distinct from built-in vector types: float4 != float[4]
● Language profiles may restrict array usage

    vout MyVertexProgram(float3 lightcolor[10], …)
    {
    …

How are current GPUs different from CPUs?
4. No support for pointers
    ■ Arrays are first-class data types in Cg
5. No integer data type
    ■ Cg adds “bool” data type for boolean operations
    ■ This change isn’t obvious except when declaring vars

Cg basic data types
● All profiles:
    ■ float
    ■ bool
● All profiles with texture lookups:
    ■ sampler1D, sampler2D, sampler3D, samplerCUBE
● NV_fragment_program profile:
    ■ half -- half-precision float
    ■ fixed -- fixed point [-2,2)

Other Cg capabilities
● Function overloading
● Function parameters are value/result
    ■ Use “out” modifier to declare return value

    void foo (float a, out float b)
    {
        b = a;
    }

● “discard” statement – fragment kill

    if (a > b) discard;

Cg Built-in functions
● Texture mapping (in fragment profiles)
● Math
    ■ Dot product
    ■ Matrix multiply
    ■ Sin/cos/etc.
    ■ Normalize
● Misc
    ■ Partial derivative (when supported)
● See spec for more details

New vector operators
● Swizzle – replicate/rearrange elements

    a = b.xxyy;

● Write mask – selectively over-write

    a.w = 1.0;

● Vector constructor builds vector

    a = float4(1.0, 0.0, 0.0, 1.0);

Change to constant-typing mechanism
● In C, it’s easy to accidentally use high precision

    half x, y;
    x = y * 2.0; // Double-precision multiply!
● Not in Cg

    x = y * 2.0; // Half-precision multiply

● Unless you want to

    x = y * 2.0f; // Float-precision multiply

Dot product, Matrix multiply
● Dot product
    ■ dot(v1,v2); // returns a scalar
● Matrix multiplications:
    ■ matrix-vector: mul(M, v); // returns a vector
    ■ vector-matrix: mul(v, M); // returns a vector
    ■ matrix-matrix: mul(M, N); // returns a matrix

Demos and Examples
● Show Assn 4a skeleton code
● Show Cg effects browser (
    ■ Fresnel
    ■ Simple lighting

Cg runtime API helps applications use Cg
● Compile a program
● Select active programs for rendering
● Pass “uniform” parameters to program
● Pass “varying” (per-vertex) parameters
● Load vertex-program constants
● Other housekeeping

Runtime is split into three libraries
● API-independent layer – cg.lib
    ■ Compilation
    ■ Query information about object code
● API-dependent layer – cgGL.lib and cgD3D.lib
    ■ Bind to compiled program
    ■ Specify parameter values
    ■ etc.
● NB: New API introduced since the following slide
    ■ See user’s manual for details
    ■ Pay attention to the basic idea here

Runtime API for OpenGL

    // Create cgContext to hold vertex-profile code
    VertexContext = cgCreateContext();

    // Add vertex-program source text to vertex-profile context
    // This is where compilation currently occurs
    cgAddProgram(VertexContext, CGVertProg, cgVertexProfile, NULL);

    // Get handle to 'main' vertex program
    VertexProgramIter = cgProgramByName(VertexContext, "main");
    cgGLLoadProgram(VertexProgramIter, ProgId);

    VertKdBind = cgGetBindByName(VertexProgramIter, "Kd");
    TestColorBind = cgGetBindByName(VertexProgramIter, "I.TestColor");
    texcoordBind = cgGetBindByName(VertexProgramIter, "I.texcoord");

Runtime API for OpenGL

    //
    // Bind uniform parameters
    //
    cgGLBindUniform4f(VertexProgramIter, VertKdBind, 1.0, 1.0, 0.0, 1.0);
    …
    // Prepare to render
    cgGLEnableProgramType(cgVertexProfile);
    cgGLEnableProgramType(cgFragmentProfile);
    …
    // Immediate-mode vertex
    glNormal3fv(&CubeNormals[i][0]);
    cgGLBindVarying2f(VertexProgramIter, texcoordBind, 0.0, 0.0);
    cgGLBindVarying3f(VertexProgramIter, TestColorBind, 1.0, 0.0, 0.0);
    glVertex3fv(&CubeVertices[CubeFaces[i][0]][0]);

Cg Summary
● C-like language
● With capabilities for GPUs
● Compatible with Microsoft’s HLSL
● Use with OpenGL or DirectX
● NV20/DX8 and beyond
● NV30 + Cg = You control the graphics pipeline

More Information
● NVIDIA’s “Learn About Cg” page:
    ■ http://developer.nvidia.com/view.asp?IO=cg_about
● For information, inspiration, and examples, a web forum on writing Cg shaders:
    ■ http://www.cgshaders.org
● The Cg user’s manual:
    ■ http://developer.nvidia.com/view.asp?IO=cg_users_manual
● Cg downloads (compiler, plug-ins, etc.):
    ■ http://developer.nvidia.com/view.asp?IO=cg_toolkit
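
Appendix sketch: vector semantics on the CPU
For readers without a Cg toolchain at hand, the per-component behavior described on the "Some vector operations", "New vector operators", and "Dot product" slides can be emulated in plain C. This is only a rough illustration, not Cg and not part of the Cg runtime: the vec3 and vec4 structs and the names clamp3, swizzle_xxyy, write_w, and dot4 are hypothetical helpers invented here.

```c
#include <assert.h>

/* Hypothetical CPU-side stand-ins for Cg's float3 and float4 (not a Cg API). */
typedef struct { float x, y, z; } vec3;
typedef struct { float x, y, z, w; } vec4;

/* Emulates the slide's clamp(): the comparison and the ?: select are
 * applied to each component independently, against the replicated
 * scalar (minval.xxx / maxval.xxx in Cg). */
vec3 clamp3(vec3 a, float minval, float maxval)
{
    assert(minval <= maxval);

    /* a = (a < minval.xxx) ? minval.xxx : a; */
    a.x = (a.x < minval) ? minval : a.x;
    a.y = (a.y < minval) ? minval : a.y;
    a.z = (a.z < minval) ? minval : a.z;

    /* a = (a > maxval.xxx) ? maxval.xxx : a; */
    a.x = (a.x > maxval) ? maxval : a.x;
    a.y = (a.y > maxval) ? maxval : a.y;
    a.z = (a.z > maxval) ? maxval : a.z;

    return a;
}

/* Emulates the swizzle a = b.xxyy; (replicate and rearrange components). */
vec4 swizzle_xxyy(vec4 b)
{
    vec4 r = { b.x, b.x, b.y, b.y };
    return r;
}

/* Emulates the write mask a.w = w; (x, y, z are left untouched). */
vec4 write_w(vec4 a, float w)
{
    a.w = w;
    return a;
}

/* Emulates dot(v1, v2) for 4-vectors: returns a scalar. */
float dot4(vec4 a, vec4 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}
```

In Cg each of these is a single vector expression; the C version spells out the component-by-component work that the vector syntax hides.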