GPU Shading and Rendering: OpenGL Shading Language
Marc Olano
UMBC

OpenGL Shading
• High-level language
  – OpenGL Shading Language = GLslang = GLSL
• Integrated into OpenGL API (no extra run-time)

Organization
• API
• Vertex Shading
• Fragment Shading
• Lots of demos
  – 2-year-old Apple PowerBook G4/1.5GHz
  – ATI Mobility Radeon 9700

API-integrated
• Compiler built into driver
  – Presumably they know your card best
  – IHVs must produce (good) compilers
• Use built-in parameters (glColor, glNormal, …)
  – Add your own
• Other options can still produce low-level code
  – Cg, ASHLI, RapidMind, …
  – With loss of integration

Using High-level Code
• Create shader object
    S = glCreateShader(GL_VERTEX_SHADER)
    S = glCreateShaderObjectARB(GL_VERTEX_SHADER_ARB)
  – Vertex or Fragment
• Load shader into object
    glShaderSource(S, n, shaderArray, lenArray)
    glShaderSourceARB(S, n, shaderArray, lenArray)
  – Array of strings
• Compile object
    glCompileShader(S)
    glCompileShaderARB(S)

Loading Shaders
• glShaderSource(S, n, shaderArray, lenArray)
  – One string containing entire mmap'd file
  – Strings as #includes
    • Varying variables between vertex and fragment
  – Strings as lines
• Strings are null-terminated if lenArray is NULL or a length is negative (e.g. -1)

Using High-level Code (2)
• Create program object
    P = glCreateProgram()
    P = glCreateProgramObjectARB()
• Attach all shader objects
    glAttachShader(P, S)
    glAttachObjectARB(P, S)
  – Vertex, Fragment or both
• Link together
    glLinkProgram(P)
    glLinkProgramARB(P)
• Use
    glUseProgram(P)
    glUseProgramObjectARB(P)

Using High-level Code (3)
• Where are my attribute/uniform parameters?
    i = glGetAttribLocation(P, "myAttrib")
    i = glGetUniformLocation(P, "myUniform")
• Set them
    glVertexAttrib1f(i, value)
    glVertexAttribPointer(i, …)
    glUniform1f(i, value)

Using Low-level Code
• Load shader
    glProgramStringARB(GL_VERTEX_PROGRAM_ARB, GL_PROGRAM_FORMAT_ASCII_ARB, length, shader)
  – Vertex or fragment
  – Single string (vs. array)
• Enable
    glEnable(GL_VERTEX_PROGRAM_ARB)

Useful Tools
• Shader debugger
  – Immediate updates
  – Choose model/texture
  – Tweak parameters
  – Examine/dump frames
• Several available
  – Not hard to build
• OpenGL debugger
  – Trace of calls made
  – Examine resources
  – Breakpoints/actions
  – Graph performance
• A couple of choices

gDEBugger: A Professional OpenGL Debugger and Profiler
• Provides graphic pipeline information needed to find bugs and to optimize application performance:
  – Shortens debugging and profiling time
  – Improves application quality
  – Optimizes application performance

Free gDEBugger License for Academic Users!
• OpenGL ARB and Graphic Remedy Academic Program:
  – Annual program for all OpenGL Academic users
  – License of the full feature version for one year
  – Includes all software updates
  – A limited number of free licenses available for non-commercial developers who are not in academia
• More details: http://academic.gremedy.com

Non-Windows OS
• Linux
  – gDEBugger in progress
• Apple OpenGL Profiler and Driver Monitor
  – Free part of OS / Developer tools

Vertex Demo: Blend Positions

High-level Code
    void main() {
        float Kin = gl_Color.r;   // key input

        // screen position from vertex and texture
        vec4 Vp = ftransform();
        vec4 Tp = vec4(gl_MultiTexCoord0.xy*1.8-.9, 0.,1.);

        // interpolate between Vp and Tp
        gl_Position = mix(Tp,Vp,pow(1.-Kin,8.));

        // copy to output
        gl_TexCoord[0] = gl_MultiTexCoord0;
        gl_TexCoord[1] = Vp;
        gl_TexCoord[3] = vec4(Kin);
    }

The next six slides repeat this shader, each one highlighting a different feature:
• Main Function: void main()
• Use Standard OpenGL State: gl_Color, gl_MultiTexCoord0, gl_Position, gl_TexCoord
• Built-in Types: float, vec4
• Swizzle / Channel Selection: gl_Color.r, gl_MultiTexCoord0.xy
• Vector Construction: vec4(gl_MultiTexCoord0.xy*1.8-.9, 0.,1.), vec4(Kin)
• Built-in Functions: ftransform(), mix(), pow()

Vertex + Fragment Demo: Fresnel Environment Map

Trick #1: Where is the Eye
• Object Space → (ModelView Matrix) → Eye Space → (Projection Matrix) → Clip Space
• Where is the Eye in Eye Space?
  – (0,0,0)? Not necessarily!
• Know where it is in Clip Space
  – (0,0,-1,0), looking in the (0,0,1,0) direction
    • Assuming GL_LESS depth test
  – Invert projection to find the eye!
  – Works for any eye position, or even parallel projection

Trick #2: Subtract Homogeneous Points
• Homogeneous point: vec4(V.xyz, V.w)
  – 3D equivalent: V.xyz/V.w
  – Defers division, makes perspective, translation, and many things happy
• Vector subtraction: V - E
  – = V.xyz/V.w - E.xyz/E.w
  – = (V.xyz*E.w - E.xyz*V.w)/(V.w*E.w)

Trick #3: Skip Division for Normalize
• normalize(V.xyz/V.w) = normalize(V.xyz)
  – If V.w is positive
• Put it all together:
  – normalize(V-E)
  – = normalize(V.xyz*E.w - E.xyz*V.w)

OpenGL State Demo: Vertex Lighting

Lighting Vectors in Eye Space
    void main() {
        // convert shading-related vectors to eye space
        vec4 P = gl_ModelViewMatrix*gl_Vertex;
        vec4 E = gl_ProjectionMatrixInverse*vec4(0,0,-1,0);
        vec3 V = normalize(E.xyz*P.w-P.xyz*E.w);
        vec3 N = normalize(gl_NormalMatrix*gl_Normal);
        …

Accumulate Each Light
        …
        // accumulate contribution from each light
        gl_FrontColor = vec4(0);
        for(int i=0; i<gl_MaxLights; i++) {
            // light direction via homogeneous subtraction (Tricks #2 and #3)
            vec3 L = normalize(gl_LightSource[i].position.xyz*P.w -
                               P.xyz*gl_LightSource[i].position.w);
            vec3 H = normalize(L+V);
            float diff = dot(N,L);

            gl_FrontColor += gl_LightSource[i].ambient;
            if (diff > 0.) {
                gl_FrontColor += gl_LightSource[i].diffuse * diff;
                // clamp before pow(): pow() is undefined for a negative base
                gl_FrontColor += gl_LightSource[i].specular *
                    pow(max(dot(N,H),0.), gl_FrontMaterial.shininess);
            }
        }
        …

Standard Vertex Shader Stuff
        …
        // standard texture coordinate and position stuff
        gl_TexCoord[0] = gl_TextureMatrix[0]*gl_MultiTexCoord0;
        gl_Position = ftransform();
    }

Noise
• Controlled, repeatable randomness
  – Still spotty implementation
  – Can use texture or compute

Noise Characteristics
• Repeatable
• Locally continuous, but distant points uncorrelated
• Values in [-1,1], average 0
• 1/2 to 1 cycle per unit
• Versions for n-D input

Noise Subtleties
• Many noise functions based on a lattice
  – Like a spline between integer coordinates
  – Hash of integer coordinates gives control points
• Interpolating values easy but poor
  – Even with higher-order interpolation
• Perlin's noise
  – Passes through 0 at each integer
  – Hash gives gradient

Modified Noise [Olano 2005]
• Three relatively independent modifications
  – New computable hash
  – Change gradient computation
  – Reorder computation
• Variety of computation/texture options
  – Can just store in a texture
  – Can compute with some texture accesses
  – Can compute with no texture accesses

Computable Hash
• Normal hash chains access to permutation texture
• Want totally computable hash
  – mod(k*x^2, m)
  – Still chain for higher-D
    • hash(floor(P.x) + hash(floor(P.y)))
  – Not quite as good, but cheap & computable
    • Noise usually not used alone

Gradient
• 3D Gradient = (±fract(P.x), ±fract(P.y))
  – Each sign from one bit of hash
    • Made slightly more difficult without bitwise ops
  – Allows noise(x) = noise(x,0)
    • If 2D noise is stored in a texture
    • Can share the same texture for 1D noise as well
  – Not normally true!
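
The Computable Hash slide can be sketched on the CPU in C, the language of the API calls earlier in the deck. This is only an illustration of the mod(k*x^2, m) form and of chaining for 2D; the constants HASH_K and HASH_M and the helper names wrap, hash1 and hash2 are assumptions made for this sketch and do not come from the slides.

```c
#include <assert.h>

/* Illustrative constants only; the slides do not give values for k or m. */
enum { HASH_K = 3, HASH_M = 61 };

/* Wrap any integer lattice coordinate into [0, HASH_M). */
static int wrap(int x)
{
    int r = x % HASH_M;              /* C's % may return a negative remainder */
    return r < 0 ? r + HASH_M : r;
}

/* 1D computable hash: mod(k*x^2, m). No permutation table is read. */
int hash1(int x)
{
    int w = wrap(x);                 /* keeps k*w*w well inside int range */
    int h = (HASH_K * w * w) % HASH_M;
    assert(h >= 0 && h < HASH_M);
    return h;
}

/* 2D hash by chaining, as on the slide: hash(x + hash(y)). */
int hash2(int x, int y)
{
    return hash1(wrap(x) + hash1(y));
}
```

With these constants, hash1(5) is 14 and hash2(1, 2) is 19. In a shader, x and y would be floor(P.x) and floor(P.y), and the integer remainder would be replaced by mod() on floats.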
Reordered Computation
• Refactor to be able to build n-D noise from two shifted calls to (n-1)-D noise
  – If 2D noise is stored in a texture
  – Can build 3D noise from 2 texture accesses
  – Can build 4D noise from 4 texture accesses

Shader Design Strategies
• Learn and adapt from RenderMan
  – Noise
  – Layers
• Multiple Passes
• Baked computation
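
Appendix sketch: checking Tricks #2 and #3
The homogeneous subtraction from Tricks #2 and #3 can be checked numerically on the CPU. The sketch below, in C, compares the divided difference V.xyz/V.w - E.xyz/E.w with the division-free form V.xyz*E.w - E.xyz*V.w and tests whether they point the same way, which is exactly when normalize() of the two would agree. The types Vec3 and Vec4 and the functions diff_divided, diff_deferred and same_direction are hypothetical names for this sketch, not part of OpenGL or of the slides.

```c
#include <assert.h>

typedef struct { double x, y, z; } Vec3;
typedef struct { double x, y, z, w; } Vec4;

/* Reference form: divide each point by its w, then subtract (V - E). */
Vec3 diff_divided(Vec4 v, Vec4 e)
{
    assert(v.w != 0.0 && e.w != 0.0);   /* division needs finite points */
    Vec3 d = { v.x / v.w - e.x / e.w,
               v.y / v.w - e.y / e.w,
               v.z / v.w - e.z / e.w };
    return d;
}

/* Trick form: V.xyz*E.w - E.xyz*V.w, with no division at all. */
Vec3 diff_deferred(Vec4 v, Vec4 e)
{
    Vec3 d = { v.x * e.w - e.x * v.w,
               v.y * e.w - e.y * v.w,
               v.z * e.w - e.z * v.w };
    return d;
}

/* 1 if a and b are parallel and point the same way, so that
   normalizing either one gives the same unit vector. */
int same_direction(Vec3 a, Vec3 b)
{
    Vec3 c = { a.y * b.z - a.z * b.y,
               a.z * b.x - a.x * b.z,
               a.x * b.y - a.y * b.x };
    double cross2 = c.x * c.x + c.y * c.y + c.z * c.z;
    double dot = a.x * b.x + a.y * b.y + a.z * b.z;
    return cross2 < 1e-18 && dot > 0.0;
}
```

For V = (2,4,6,2) and E = (0,0,1,1) the two forms are (1,2,2) and (2,4,4), so they agree after normalization. If V.w is negative the deferred form flips sign, which is the caveat on the Trick #3 slide. With E.w = 0 (an eye at infinity, as in a parallel projection) only diff_deferred is usable, since the divided form would divide by zero.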