CS179: GPU Programming
Lecture 9: Lab 4 Recitation
 3D Textures
 PBOs
 Fractals
 Raytracing
 Lighting/Phong Shading
 Memory Coalescing
3D Textures
 Recall advantages of textures:
Cached reads, so faster than plain global memory accesses
Still available to all threads/blocks
Larger in size than shared or constant memory
Better caching
Filtering, clamping, sampling, etc.
3D Textures
 3D textures store volume data
 Could be used for rendering smoke, particles, fractals, etc.
 Allocate a 3D cudaArray to make 3D texture
 cudaMalloc3DArray gives correct pitch
 Declare texture in device
 texture<type, 3, mode> tex
 Access on device with texture sampling
 tex3D(tex, x, y, z)
3D Textures
 Some texture properties you can set:
 tex.normalized: whether texture coordinates are normalized to [0, 1) or given in texels
 tex.filterMode: linear or point filtering
 tex.addressMode[x]: wrap or clamp (for each dimension)
 Bind texture to array
 cudaBindTextureToArray
 Unbinding is typical, but probably not necessary
 All of this is done for you in lab 4!
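To make the wrap and clamp address modes concrete, here is a minimal CPU-side sketch of how a normalized coordinate might map to a texel under point filtering. The names `AddressMode` and `texel_index` are hypothetical and are not part of CUDA or the lab code; the hardware's exact rounding may differ.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper, for illustration only: maps a normalized texture
// coordinate u to a texel index in [0, n) as point sampling roughly would.
enum class AddressMode { Wrap, Clamp };

int texel_index(float u, int n, AddressMode mode) {
    if (mode == AddressMode::Wrap) {
        u = u - std::floor(u);                 // keep the fractional part: [0, 1)
    }
    int i = static_cast<int>(std::floor(u * n));
    return std::min(std::max(i, 0), n - 1);    // out-of-range input sticks to the edge
}
```

With 8 texels, a coordinate of 1.25 lands on texel 2 under wrap and on texel 7 under clamp.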
 Pixel Buffer Objects (PBOs)
 Store pixel data
 Used to easily render in OpenGL
 Recall lab 3
 VBOs stored vertex data
 Vertex data remained on GPU -- minimal transfer to/from CPU
 Rendered via OpenGL on GPU
 Same story here
 Pixels instead of verts, but same idea
 Initialize:
 glGenBuffersARB(n, &pbo)
 n is the number of buffer objects to generate (1 here)
 Bind to OpenGL:
 glBindBufferARB(target, pbo)
 target is the binding point, e.g. GL_PIXEL_UNPACK_BUFFER_ARB
 Assign Data:
 glBufferDataARB(target, size, data, usage)
 data is a pointer to the data, usage tells us how often we expect to update and use it
 Map to CUDA:
 cudaGLMapBufferObject/cudaGLUnmapBufferObject
 Fractals: infinite complexity given by simple instructions
 “Self-similar, recursive”
 Difficult for us to process (but nice for a computer!)
 Many different kinds (we’ll look at Julia Set)
 How to render on CUDA:
 Calculate fractal volume or area
 Copy into texture
 Volume render as PBO
 What is a Julia Set?
Mandelbrot Set
 “Father” of Julia Set
Mandelbrot Set
 Simpler than it looks
 Recursively defined by $z_{n+1} = z_n^2 + c$
 $c$ is a complex constant
 $z_0 = 0$
 Three possible results based on $c$:
 Converges (black space)
 Stays in a finite orbit (boundary)
 Escapes to infinity (blue area)
Mandelbrot Set
 Computed by iteratively computing $z_n$
 Assume after some point, it escapes and we can stop checking…
 $|z_n| > 2$, for example
 Coloring is oftentimes based on rate of escape
 Don’t need more than a few dozen iterations to see behavior
 Demo?
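The escape test above can be sketched on the CPU as follows. This is only an illustration; `mandelbrot_escape_time` is a made-up name, and on the GPU each pixel would run the same loop in its own thread.

```cpp
#include <complex>

// Returns how many iterations of z_{n+1} = z_n^2 + c (starting at z_0 = 0)
// run before |z_n| exceeds 2, or max_iter if the orbit never escapes.
int mandelbrot_escape_time(std::complex<float> c, int max_iter) {
    std::complex<float> z(0.0f, 0.0f);
    for (int n = 0; n < max_iter; ++n) {
        if (std::norm(z) > 4.0f) return n;   // std::norm is |z|^2, so compare with 2^2
        z = z * z + c;
    }
    return max_iter;
}
```

The returned count is the "rate of escape" that a coloring scheme would map to a palette.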
Julia Set
 Each pixel on the Mandelbrot set has a corresponding Julia Set
Julia Set
 Idea: instead of starting with $z_0 = 0$, let $z_0 = c_0$
 Changing $c_0$ will change the Julia Set dramatically!
Julia Sets
 Why are they useful?
Nothing really practical yet
But they look cool!
Can teach us about chaos, model weather, coastlines, etc.
Parallelizable problem, so good for us!
Julia Sets
 Lab 4 is even more exciting than Julia Sets…
 4D Julia Sets!
Julia Sets
 4D: Using quaternions instead of complex numbers
 Quaternions: 4D extension of complex numbers
 $i^2 = j^2 = k^2 = ijk = -1$
 $ij = k = -ji$, $jk = i = -kj$, $ki = j = -ik$
 Many uses in graphics
 We give you some nice quaternion functions (sqr_quat, mul_quat, etc.)
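For reference, the multiplication rules above expand to the following product. This is a sketch with hypothetical names (`Quat`, `quat_mul`, `quat_sqr`); the lab's own sqr_quat and mul_quat may use a different type or component order.

```cpp
// Hypothetical stand-ins for the lab's quaternion helpers.
struct Quat { float w, x, y, z; };   // represents w + x*i + y*j + z*k

// Hamilton product, derived from i^2 = j^2 = k^2 = ijk = -1.
Quat quat_mul(Quat a, Quat b) {
    return {
        a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
        a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
        a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
        a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w
    };
}

Quat quat_sqr(Quat q) { return quat_mul(q, q); }
```

Note that the product is not commutative: swapping the arguments flips the sign of the cross terms.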
Julia Sets
 How do we render 4D object?
 Projection: taking nD slices of an (n+1)D object
 Ex.: MRI Scan - 2D images of a 3D volume
 For 4D Julia set, render volume slices of 4D object
 Think of it as time evolving object
 Slice is one frame in time
 Now we have 3 parameters:
 $z_0$ - starting point for Julia set
 $c$ - constant for Mandelbrot set
 $z_p$ - slicing plane for projection
Julia Sets
 How to render:
 Transform each coordinate in volume texture to quaternion
 q = (pos.x, pos.y, pos.z, dot((pos, 1), plane))
 Implemented for you as pos_to_quat
 Store escape speed or convergence in volume texture
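The mapping from a volume position to a quaternion can be read as follows. This is only a sketch of the formula above with made-up type and function names; the lab's pos_to_quat is the authoritative version.

```cpp
struct Float3 { float x, y, z; };
struct Float4 { float x, y, z, w; };

// Hypothetical version of q = (pos.x, pos.y, pos.z, dot((pos, 1), plane)).
// The fourth component is where the slicing plane places this 3D point in 4D.
Float4 pos_to_quat_sketch(Float3 pos, Float4 plane) {
    float w = pos.x * plane.x + pos.y * plane.y + pos.z * plane.z + 1.0f * plane.w;
    return {pos.x, pos.y, pos.z, w};
}
```

Changing only `plane.w` shifts the slice along the fourth axis, which is what makes the object appear to evolve over time.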
 Volume render - raytracing
 Kind of what it sounds like: tracing rays
Start at some origin ray.o
Step in direction ray.d
If we collide with something, render it!
To check shadows, raytrace back toward the light: if an object is hit, the point is in shadow
 Raytracing is used for very high-quality rendering
 Can also be used to calculate lighting, volumes, etc.
 Might not work great for fractals
 Fractals are infinitely thin, so we might skip over many details
 Use distance function estimator
 Gives lower bound for distance to set from any point in space
 Let $z'_n$ also be iteratively computed as $z'_{n+1} = 2 z_n z'_n$, with $z'_0 = (1, 0, 0, 0)$
 $d(z) = \frac{|z_n| \log |z_n|}{2 |z'_n|}$
 Rendering this distance function isosurface is okay
 Usage:
 Iterate $z_n$ and $z'_n$ until we escape or reach maximum iterations
 Return the distance estimate $d(z)$
 Render all pixels “close enough” to set in volume
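A CPU sketch of this procedure might look like the following. The names (`Quat`, `qmul`, `qnorm`, `julia_distance`) and the choice to return 0 for points that never escape are assumptions made for illustration; this is not the lab's d_juliaDist.

```cpp
#include <cmath>

struct Quat { float w, x, y, z; };

static Quat qmul(Quat a, Quat b) {
    return {a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
            a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
            a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
            a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w};
}

static float qnorm(Quat q) {
    return std::sqrt(q.w * q.w + q.x * q.x + q.y * q.y + q.z * q.z);
}

// Iterates z_{n+1} = z_n^2 + c and z'_{n+1} = 2 z_n z'_n, then returns
// |z_n| log|z_n| / (2 |z'_n|). Points that never escape are treated as
// inside the set and get distance 0 (an assumption of this sketch).
float julia_distance(Quat z, Quat c, int max_iter, float escape_radius) {
    Quat dz = {1.0f, 0.0f, 0.0f, 0.0f};               // z'_0 = (1, 0, 0, 0)
    for (int n = 0; n < max_iter; ++n) {
        if (qnorm(z) > escape_radius) break;
        Quat zdz = qmul(z, dz);                        // uses z_n, so update z' first
        dz = {2.0f * zdz.w, 2.0f * zdz.x, 2.0f * zdz.y, 2.0f * zdz.z};
        Quat zz = qmul(z, z);
        z = {zz.w + c.w, zz.x + c.x, zz.y + c.y, zz.z + c.z};
    }
    float r = qnorm(z);
    if (r <= escape_radius) return 0.0f;               // never escaped
    return r * std::log(r) / (2.0f * qnorm(dz));
}
```

For c = 0 the set is the unit sphere, so a start point at distance 2 from the origin is exactly 1 away from it; the estimate comes out near 0.69, consistent with being a lower bound.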
 Better idea: use a bit of raytracing
 Load volume data with distances to set
 Store in volume texture
 Raytrace along a ray through texture
 Stop once we see distance is very low, under some epsilon
 Each ray handled by one thread, so pretty quick
 Better raytracing:
 Current model: step along ray by step * ray.d
 step = some small constant, e.g. 0.005
 What if we are 0.5 units away?
 Don’t need to step by 0.005
 Use adaptive sampling:
 step = factor * dist
 factor = 0.01-0.5 works well
 No need to worry about thread divergence
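The adaptive stepping rule can be sketched as below. A unit sphere stands in for the fractal's distance volume, and all names here (`Vec3`, `sphere_distance`, `march`) are hypothetical; in the lab the distance would come from sampling the volume texture.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Stand-in distance function for illustration: distance to a unit sphere
// centered at the origin.
float sphere_distance(Vec3 p) {
    return std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z) - 1.0f;
}

// Marches from origin o along unit direction d. Returns the ray parameter t
// at the hit, or -1 if nothing is hit before t_max or max_steps.
float march(Vec3 o, Vec3 d, float factor, float epsilon, float t_max, int max_steps) {
    float t = 0.0f;
    for (int i = 0; i < max_steps && t < t_max; ++i) {
        Vec3 p = {o.x + t * d.x, o.y + t * d.y, o.z + t * d.z};
        float dist = sphere_distance(p);
        if (dist < epsilon) return t;   // close enough to count as a hit
        t += factor * dist;             // adaptive step: far away means bigger steps
    }
    return -1.0f;
}
```

Because the distance is a lower bound, any factor below 1 cannot step past the surface; smaller factors trade speed for safety when the estimate is noisy.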
 Calculating ray:
 Inverse matrix needed to calculate where we are looking
 invViewMatrix given to you, calculated for you
 Pass it into constant memory c_invViewMatrix on GPU
 ray.o = invViewMatrix * (0, 0, 0, 1)
 ray.d = invViewMatrix * normalize(u, v, -2.0), treated as a direction (w = 0)
 u, v are screen coordinates -- calculate these based on 2D thread index
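One way to read the ray setup is sketched below. The row-major 3x4 matrix layout, the mapping of pixels to [-1, 1], and the names `Mat3x4`, `transform`, and `make_ray` are all assumptions of this sketch rather than the lab's definitions.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
struct Ray { Vec3 o, d; };

// Assumed layout: rotation in the first three columns, translation in the fourth.
struct Mat3x4 { float m[3][4]; };

// Multiplies M by (v, w): w = 1 transforms a point, w = 0 a direction.
Vec3 transform(const Mat3x4& M, Vec3 v, float w) {
    return {M.m[0][0] * v.x + M.m[0][1] * v.y + M.m[0][2] * v.z + M.m[0][3] * w,
            M.m[1][0] * v.x + M.m[1][1] * v.y + M.m[1][2] * v.z + M.m[1][3] * w,
            M.m[2][0] * v.x + M.m[2][1] * v.y + M.m[2][2] * v.z + M.m[2][3] * w};
}

// px, py play the role of the 2D thread index; u, v end up in [-1, 1).
Ray make_ray(const Mat3x4& inv_view, int px, int py, int width, int height) {
    float u = (px / static_cast<float>(width)) * 2.0f - 1.0f;
    float v = (py / static_cast<float>(height)) * 2.0f - 1.0f;
    float len = std::sqrt(u * u + v * v + 4.0f);
    Vec3 dir = {u / len, v / len, -2.0f / len};          // normalized (u, v, -2)
    Ray r;
    r.o = transform(inv_view, {0.0f, 0.0f, 0.0f}, 1.0f); // camera position
    r.d = transform(inv_view, dir, 0.0f);                // rotated view direction
    return r;
}
```

With an inverse view matrix that only translates the camera to z = 4, the center pixel's ray starts at (0, 0, 4) and points straight down the negative z axis.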
 Once we hit fractal, render it!
 What color?
 Depends on lighting, shading model, material properties…
 You get to color it however you like
 Something with some complexity would be good
 We suggest Phong shading
Phong Shading
 3 Components: Ambient, diffuse, specular
Phong Shading
 Ambient: Just a flat color
 amb = amb_color;
Phong Shading
 Diffuse: Adds soft shadows and highlights based on normal
 diff = diff_color * cos(a)
 a is the angle between the light direction and the surface normal
 Remember to use normalized vectors!
Phong Shading
 Specular: adds in bright highlights
spec = spec_color * dot(R, eye)^S
R is L reflected across N
eye = vector to eye
S = shininess exponent (higher S = smaller, tighter highlight)
Phong Shading
 Final output color is just sum of components:
 out = amb + diff + spec
 Main info we need to know:
 Light direction (up to you, just hardcode it)
 Normal (must compute)
 Eye vector (this is just -ray.d)
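Putting the three components together, a shading function might look like this sketch. The helper names and the clamping of negative dot products to zero are choices made here for illustration, not requirements of the lab.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 scale(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
static Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }

// N, L, and E must be unit vectors: surface normal, direction to the light,
// and direction to the eye (that is, -ray.d).
Vec3 phong(Vec3 amb, Vec3 diff, Vec3 spec, float shininess, Vec3 N, Vec3 L, Vec3 E) {
    float n_dot_l = std::max(dot(N, L), 0.0f);                   // cos(a), clamped
    Vec3 R = add(scale(N, 2.0f * dot(N, L)), scale(L, -1.0f));   // L reflected across N
    float r_dot_e = std::max(dot(R, E), 0.0f);
    // No highlight on surfaces facing away from the light.
    float s = (n_dot_l > 0.0f) ? std::pow(r_dot_e, shininess) : 0.0f;
    return add(amb, add(scale(diff, n_dot_l), scale(spec, s)));
}
```

When the light, normal, and eye all line up, every factor is 1 and the output is simply the sum of the three colors.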
Phong Shading
 Calculating Normal via gradient: sample volume texture
 For each component (x, y, z):
Sample texture at component + some offset (x + 0.01)
Sample texture at component - some offset (x - 0.01)
Calculate difference per component
Resulting differences are normal components!
 We can also directly sample d_juliaDist
 This can be pretty slow, but normals will be smoother
 Up to you, if you’d like
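The gradient-based normal can be sketched with central differences as follows. A unit sphere's distance field stands in for the volume texture, and `sample_distance` and `estimate_normal` are made-up names, not the lab's d_juliaNormal.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Stand-in for sampling the distance volume: distance to a unit sphere.
float sample_distance(Vec3 p) {
    return std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z) - 1.0f;
}

// Central differences along each axis, then normalization.
Vec3 estimate_normal(Vec3 p, float h) {
    float dx = sample_distance({p.x + h, p.y, p.z}) - sample_distance({p.x - h, p.y, p.z});
    float dy = sample_distance({p.x, p.y + h, p.z}) - sample_distance({p.x, p.y - h, p.z});
    float dz = sample_distance({p.x, p.y, p.z + h}) - sample_distance({p.x, p.y, p.z - h});
    float len = std::sqrt(dx * dx + dy * dy + dz * dz);
    if (len == 0.0f) return {0.0f, 0.0f, 0.0f};   // flat region: no usable gradient
    return {dx / len, dy / len, dz / len};
}
```

The differences only give the direction of the gradient, so the result is normalized before it is used for shading.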
Coalesced Memory
 Recap: coalesced memory accesses get better throughput
 Accesses must be in order, aligned, and contiguous
 Comes into play with thread indexing
 index = threadIdx.x + blockDim.x * (blockIdx.x + gridDim.x * blockIdx.y)
 index = threadIdx.x + blockDim.y * (blockIdx.y + gridDim.y …
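Assuming the first expression is the usual linearization for a 1D block in a 2D grid, `threadIdx.x + blockDim.x * (blockIdx.x + gridDim.x * blockIdx.y)`, the sketch below shows why it coalesces: neighboring threads in a block get neighboring indices. `Dim3` is a plain stand-in for CUDA's built-in index types so the arithmetic can be checked on the CPU.

```cpp
// Plain stand-in for CUDA's dim3 / uint3 built-ins.
struct Dim3 { unsigned x, y, z; };

// Linear index for a 1D block inside a 2D grid. threadIdx.x varies fastest,
// so consecutive threads touch consecutive elements.
unsigned linear_index(Dim3 threadIdx, Dim3 blockIdx, Dim3 blockDim, Dim3 gridDim) {
    return threadIdx.x + blockDim.x * (blockIdx.x + gridDim.x * blockIdx.y);
}
```

With 256 threads per block and a 4 x 4 grid, the indices run from 0 to 4095 with no gaps or repeats.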
Your Task
 Some prelab questions will be available
 All TODO code is in frac_kernel.cu
 Host code:
Copy necessary memory into GPU
Map/Unmap buffers
Run kernels (2 this time, one to compute fractal, one to render)
Use timer events to record time of recalculation
 Device code:
d_setfractal: loads volume data with data from d_juliaDist
d_juliaDist: returns estimated distance to Julia set at given point
d_juliaNormal: samples texture to calculate normal
d_render: raytraces to render isosurface, uses shading model to color
Your Task
 GPU architecture:
 Indexing is made easiest with 1D block, 2D grid
 Defined for you, see globally defined consts and dim3s
 Space is bounded by region $[-2, 2]^3$
 You’ll need to convert back and forth between this space and
texture array indices
 Feel free to play with any architecture/setup
 In general, feel free to play with anything!
 Coloring can be really cool…
 Try other functions ($z^3 + c$, for example)
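The conversion between the bounded region and texture array indices can be sketched per axis as below. The function names and the convention that voxel i is centered at i + 0.5 are assumptions of this sketch; the lab code may place samples differently.

```cpp
#include <algorithm>
#include <cmath>

// Assumes the volume spans [-2, 2] on each axis with n texels per axis.
float index_to_world(int i, int n) {
    return -2.0f + 4.0f * (i + 0.5f) / n;     // center of voxel i
}

// Normalized texture coordinate in [0, 1] for a world coordinate in [-2, 2].
float world_to_normalized(float x) {
    return (x + 2.0f) / 4.0f;
}

// Array index of the voxel containing x, clamped to the valid range.
int world_to_index(float x, int n) {
    int i = static_cast<int>(std::floor(world_to_normalized(x) * n));
    return std::min(std::max(i, 0), n - 1);
}
```

Going from an index to world space and back returns the same index, which is a handy sanity check when wiring up the two kernels.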
Your Task
 Extra Credit:
 Raytracing: use raytracing to render shadows (10pts)
 Once we hit surface, trace back toward light source
 If we hit surface again, the original surface pixel was in shadow, make it slightly darker
 Adaptive Detailing: higher detail when we’re zoomed in (5pts)
 Allows us to see the infiniteness of the fractal
 Essentially, just adjust epsilon based on distance to camera
 epsilon: how close we must be to fractal to be considered a “hit”
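One possible reading of adaptive detailing is to let the hit threshold grow linearly with the distance travelled along the ray. The function name, the linear rule, and the lower bound are all assumptions of this sketch, not the required approach.

```cpp
#include <algorithm>

// Hypothetical rule: epsilon scales with the distance t from the camera,
// with a floor so it never reaches zero right at the camera.
float adaptive_epsilon(float t, float scale, float min_epsilon) {
    return std::max(min_epsilon, scale * t);
}
```

Points near the camera then need to be much closer to the set to count as a hit, so finer structure appears as we zoom in.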