CS179: GPU Programming
Lecture 9: Lab 4 Recitation

Today
  3D Textures
  PBOs
  Fractals
  Raytracing
  Lighting/Phong Shading
  Memory Coalescing

3D Textures
  Recall advantages of textures:
    Not global memory, faster accesses
    Still available to all threads/blocks
    Larger in size
    Better caching
    Filtering, clamping, sampling, etc.

3D Textures
  3D textures store volume data
    Could be used for rendering smoke, particles, fractals, etc.
  Allocate a 3D cudaArray to make a 3D texture
    cudaMalloc3DArray gives correct pitch
  Declare texture on device
    texture<type, 3, mode> tex
  Access on device with texture sampling
    tex3D(tex, x, y, z)

3D Textures
  Some texture properties you can set:
    tex.normalized: unnormalized or normalized ([0, 1]) texture coordinates
    tex.filterMode: linear or point filtering
    tex.addressMode[x]: wrap or clamp (for each dimension)
  Bind texture to array
    cudaBindTextureToArray
    Unbinding is typical, but probably not necessary
  All of this is done for you in lab 4!

PBOs
  Pixel Buffer Objects (PBOs)
    Store pixel data
    Used to easily render in OpenGL
  Recall lab 3
    VBOs stored vertex data
    Vertex data remained on GPU, so minimal transfer to/from CPU
    Rendered via OpenGL on GPU
  Same story here
    Pixels instead of verts, but same idea

PBOs
  Initialize: glGenBuffersARB(n, &pbo)
    n is the number of buffer names to generate
  Bind to OpenGL: glBindBufferARB(target, pbo)
    target is the buffer binding point
  Assign data: glBufferDataARB(target, size, data, usage)
    data is a pointer to the data, usage hints at how often we read/write
  Map to CUDA: cudaGLMapBufferObject/cudaGLUnmapBufferObject

Fractals
  Fractals: infinite complexity given by simple instructions
    “Self-similar, recursive”
    Difficult for us to process (but nice for a computer!)
    Many different kinds (we’ll look at the Julia Set)
  How to render on CUDA:
    Calculate fractal volume or area
    Copy into texture
    Volume render as PBO

What is a Julia Set?
Mandelbrot Set
  “Father” of the Julia Set

Mandelbrot Set
  Simpler than it looks
  Recursively defined by z_{n+1} = z_n^2 + c
    c is a complex constant
    z_0 = 0
  Three possible results based on c:
    Converges to 0 (black space)
    Stays in a finite orbit (boundary)
    Escapes to infinity (blue area)

Mandelbrot Set
  Computed by iteratively computing z_n
  Assume after some point, it escapes and we can stop checking…
    ||z_n|| > 2, for example
  Coloring is oftentimes based on rate of escape
    Don’t need more than a few dozen iterations to see behavior
  Demo?

Julia Set
  Each pixel on the Mandelbrot set has a corresponding Julia Set

Julia Set
  Idea: instead of starting with z_0 = 0, let z_0 = c_0
  Changing c_0 will change the Julia Set dramatically!

Julia Sets
  Why are they useful?
    Nothing really practical yet
    But they look cool!
    Can teach us about chaos, model weather, coastlines, etc.
    Parallelizable problem, so good for us!

Julia Sets
  Lab 4 is even more exciting than Julia Sets…
  4D Julia Sets!

Julia Sets
  4D: using quaternions instead of complex numbers
  Quaternions: extension of complex numbers with three imaginary units i, j, k
    i^2 = j^2 = k^2 = ijk = -1
    ij = k = -ji, jk = i = -kj, ki = j = -ik
  Many uses in graphics
    Rotations
    Kinematics
    Visualizations
    Etc.
  We give you some nice quaternion functions (sqr_quat, mul_quat, etc.)

Julia Sets
  How do we render a 4D object?
  Projection: taking nD slices of an (n+1)D object
    Ex.: MRI scan - 2D images of a 3D volume
  For 4D Julia set, render volume slices of the 4D object
    Think of it as a time-evolving object
    Slice is one frame in time
  Now we have 3 parameters:
    z_0 - starting point for Julia set
    c - constant for Mandelbrot set
    z_p - slicing plane for projection

Julia Sets
  How to render:
    Transform each coordinate in volume texture to a quaternion
      q = (pos.x, pos.y, pos.z, dot((pos, 1), plane))
      Implemented for you as pos_to_quat
    Store escape speed or convergence in volume texture
    Volume render - raytracing

Raytracing
  Kind of what it sounds like: tracing rays
    Start at some origin ray.o
    Step in direction ray.d
    If we collide with something, render it!
    To check shadows, raytrace back toward light - if an object is hit, then in shadow
  Raytracing is used for super high-def images
  Can also be used to calculate lighting, volumes, etc.

Raytracing
  Might not work great for fractals
    Fractals are infinitely thin, so we might skip over many details
  Use a distance function estimator
    Gives a lower bound for distance to set from any point in space
    Let z'_n also be iteratively computed as
      z'_{n+1} = 2 z_n z'_n, z'_0 = (1, 0, 0, 0)
      d(z) = |z_n| log|z_n| / (2 |z'_n|)

Raytracing
  Rendering this distance function isosurface is okay
  Usage:
    Iterate z_n and z'_n until we escape or reach maximum iterations
    Return the distance from the previous slide
    Render all pixels “close enough” to set in volume

Raytracing
  Better idea: use a bit of raytracing
    Load volume data with distances to set
    Store in volume texture
    Raytrace along a ray through texture
    Stop once we see distance is very low, under some epsilon
  Each ray handled by one thread, so pretty quick

Raytracing
  Better raytracing:
    Current model: step along ray by step * ray.d
      step = some small constant, e.g. 0.005
    What if we are 0.5 units away?
      Don’t need to step by 0.005
    Use adaptive sampling: step = factor * dist
      factor = 0.01-0.5 works well
    No need to worry about thread divergence

Raytracing
  Calculating ray:
    Inverse matrix needed to calculate where we are looking
      invViewMatrix given to you, calculated for you
      Pass it into constant memory c_invViewMatrix on GPU
    ray.o = invViewMat * (0, 0, 0, 1)
    ray.d = invViewMat * (u, v, -2.0)
      u, v are screen coordinates: calculate these based on 2D thread index

Lighting
  Once we hit the fractal, render it!
  What color?
    Depends on lighting, shading model, material properties…
  You get to color it however you like
    Something with some complexity would be good
    We suggest Phong shading

Phong Shading
  3 components: ambient, diffuse, specular

Phong Shading
  Ambient: just a flat color
    amb = amb_color;

Phong Shading
  Diffuse: adds soft shadows and highlights based on normal
    diff = diff_color * cos(a)
    a is the angle between light direction L and surface normal N
    Remember to use normalized vectors!

Phong Shading
  Specular: adds in bright highlights
    spec = spec_color * dot(R, eye)^S
    R is L reflected across N
    eye = vector to eye
    S = shininess (note: higher S gives a smaller, tighter highlight)

Phong Shading
  Final output color is just the sum of components:
    out = amb + diff + spec
  Main info we need to know:
    Light direction (up to you, just hardcode)
    Normal (must compute)
    Eye vector (this is just -ray.d)

Phong Shading
  Calculating normal via gradient: sample volume texture
    For each component (x, y, z):
      Sample texture at component + some offset (x + 0.01)
      Sample texture at component - some offset (x - 0.01)
      Calculate difference per component
    Resulting differences are normal components!
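The Phong terms and the gradient-based normal described above can be sketched roughly as follows. This is a CPU-side C++ sketch, not the lab's kernel code: `Vec3`, `gradientNormal`, and `phong` are hypothetical names, the `field` argument stands in for whatever returns the stored distance at a point (a volume texture fetch or a direct distance evaluation), and zeroing the specular term when the light is behind the surface is an added assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical minimal vector type; device code would use float3 instead.
struct Vec3 { float x, y, z; };
static Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 normalize(Vec3 v) {
    float len = std::sqrt(dot(v, v));
    return len > 0.0f ? v * (1.0f / len) : v;  // leave zero vectors untouched
}

// Normal from central differences: sample the field at +h and -h along each axis.
template <class Field>
Vec3 gradientNormal(Field field, Vec3 p, float h = 0.01f) {
    assert(h > 0.0f);
    Vec3 g = {
        field(Vec3{p.x + h, p.y, p.z}) - field(Vec3{p.x - h, p.y, p.z}),
        field(Vec3{p.x, p.y + h, p.z}) - field(Vec3{p.x, p.y - h, p.z}),
        field(Vec3{p.x, p.y, p.z + h}) - field(Vec3{p.x, p.y, p.z - h})
    };
    return normalize(g);
}

// out = amb + diff + spec, with L pointing from the surface toward the light
// and eye pointing from the surface toward the camera (-ray.d).
Vec3 phong(Vec3 N, Vec3 L, Vec3 eye,
           Vec3 ambColor, Vec3 diffColor, Vec3 specColor, float S) {
    N = normalize(N); L = normalize(L); eye = normalize(eye);
    float nl = dot(N, L);                    // cos(a)
    Vec3 R = N * (2.0f * nl) - L;            // L reflected across N
    float diff = std::max(nl, 0.0f);
    // Assumption: no specular highlight when the light is behind the surface.
    float spec = nl > 0.0f ? std::pow(std::max(dot(R, eye), 0.0f), S) : 0.0f;
    return ambColor + diffColor * diff + specColor * spec;
}
```

On the device, the same logic would live in `__device__` functions, with `field` replaced by the lab's own texture lookup or distance function.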
    We can also directly sample d_juliaDist
      This can be pretty slow, but normals will be smoother
      Up to you, if you’d like

Coalesced Memory
  Recap: coalesced memory gets better access rates
    Must be in-order, aligned, and together
  Comes into play with thread indexing
    index = threadIdx.x + blockDim.x * (blockIdx.x + gridDim.x * blockIdx.y);
    index = threadIdx.x + blockDim.y * (blockIdx.y + gridDim.y * blockIdx.x);

Your Task
  Some prelab questions will be available
  All TODO code is in frac_kernel.cu
  Host code:
    Copy necessary memory into GPU
    Map/Unmap buffers
    Run kernels (2 this time, one to compute fractal, one to render)
    Use timer events to record time of recalculation
  Device code:
    d_setfractal: loads volume data with data from d_juliaDist
    d_juliaDist: returns estimated distance to Julia set at given point
    d_juliaNormal: samples texture to calculate normal
    d_render: raytraces to render isosurface, uses shading model to color fractal

Your Task
  GPU architecture:
    Indexing is made easiest with 1D block, 2D grid
    Defined for you, see globally defined consts and dim3s
  Space is bounded by region [-2, 2]^3
    You’ll need to convert back and forth between this space and texture array indices
  Feel free to play with any architecture/setup
  In general, feel free to play with anything!
    Coloring can be really cool…
    Try other functions (z^3 + c, for example)

Your Task
  Extra Credit:
    Raytracing: use raytracing to render shadows (10pts)
      Once we hit the surface, trace back toward the light source
      If we hit the surface again, the original surface pixel was in shadow, so make it slightly darker
    Adaptive Detailing: higher detail when we’re zoomed in (5pts)
      Allows us to see the infiniteness of the fractal
      Essentially, just adjust epsilon based on distance to camera
      epsilon: how close we must be to the fractal to be considered a “hit”
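To make the pieces of the task concrete, here is a rough CPU-side C++ sketch of the distance estimate, adaptive stepping, distance-scaled epsilon, and shadow ray. It is not the lab solution: `Quat`, `Vec3`, `mul`, `juliaDist`, `march`, and `inShadow` are hypothetical names (the lab provides its own float4 helpers such as sqr_quat and mul_quat), the fourth quaternion component is fixed to 0 instead of coming from the slicing plane, the distance is recomputed at every step instead of being read from the volume texture, and `eps = baseEps * t` is only one possible way to scale epsilon with distance to the camera.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical CPU-side types; the lab code uses float4/float3 instead.
struct Quat { float w, x, y, z; };  // w is the real part
struct Vec3 { float x, y, z; };

static Quat mul(Quat a, Quat b) {   // Hamilton product a * b
    return { a.w*b.w - a.x*b.x - a.y*b.y - a.z*b.z,
             a.w*b.x + a.x*b.w + a.y*b.z - a.z*b.y,
             a.w*b.y - a.x*b.z + a.y*b.w + a.z*b.x,
             a.w*b.z + a.x*b.y - a.y*b.x + a.z*b.w };
}
static Quat add(Quat a, Quat b) { return { a.w+b.w, a.x+b.x, a.y+b.y, a.z+b.z }; }
static Quat scale(Quat a, float s) { return { a.w*s, a.x*s, a.y*s, a.z*s }; }
static float norm(Quat q) { return std::sqrt(q.w*q.w + q.x*q.x + q.y*q.y + q.z*q.z); }

// Distance estimate d = |z_n| log|z_n| / (2 |z'_n|), iterating
// z_{n+1} = z_n^2 + c and z'_{n+1} = 2 z_n z'_n with z'_0 = (1, 0, 0, 0).
// Returns 0 when the orbit never escapes (point treated as inside the set).
float juliaDist(Quat z, Quat c, int maxIter = 16, float escape = 16.0f) {
    Quat dz = { 1.0f, 0.0f, 0.0f, 0.0f };
    for (int i = 0; i < maxIter; ++i) {
        float r = norm(z);
        if (r > escape) return r * std::log(r) / (2.0f * norm(dz));
        dz = scale(mul(z, dz), 2.0f);   // uses the old z, so update dz first
        z = add(mul(z, z), c);
    }
    return 0.0f;
}

// March from origin o along unit direction d with step = factor * dist.
// Returns the distance travelled to the hit, or -1 on a miss.
float march(Vec3 o, Vec3 d, Quat c, float factor = 0.5f,
            float baseEps = 1e-3f, float tMax = 8.0f) {
    assert(factor > 0.0f && factor <= 1.0f);
    float t = 0.0f;
    for (int i = 0; i < 512 && t < tMax; ++i) {
        // Assumption: fourth component is 0 rather than derived from the plane.
        Quat q = { o.x + t*d.x, o.y + t*d.y, o.z + t*d.z, 0.0f };
        float dist = juliaDist(q, c);
        float eps = baseEps * t;        // tighter threshold close to the camera
        if (dist <= eps) return t;
        t += factor * dist;             // adaptive step
    }
    return -1.0f;
}

// Shadow test: trace from the hit point toward the light; any hit means shadow.
bool inShadow(Vec3 hit, Vec3 toLight, Quat c, float bias = 0.01f) {
    Vec3 start = { hit.x + bias*toLight.x, hit.y + bias*toLight.y, hit.z + bias*toLight.z };
    return march(start, toLight, c) >= 0.0f;
}
```

With c = 0 the set reduces to the unit sphere, which gives an easy sanity check: a ray from (0, 0, 3) looking down -z should report a hit just short of t = 2.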