Rendering Countless Blades of Waving Grass

Jeff Schmidt
CS 680
Nature scenes are a prevalent topic in
computer graphics (for example, computer
games)
In addition to realistic trees and flowing water
effects, we want to render a high-quality
grass effect in real time that looks realistic
from all angles
In general, grass needs to cover a vast amount
of the scene. This makes modeling each
individual blade of grass with polygons
impractical
A simple, flat grass texture will only look
realistic from certain angles
We create a “star” organization of quads and
cover them with grass textures
The star formation gives us the same effect no
matter which side of the grass object we are
viewing
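
As a rough sketch of the star layout, the quads can be generated by rotating one vertical quad about the grass object's vertical axis. The names `Vec3`, `Quad`, and `makeGrassStar`, the default of three quads, and the vertex order are assumptions made for illustration, not taken from the project.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

// One textured quad: bottom-left, bottom-right, top-right, top-left.
using Quad = std::array<Vec3, 4>;

// Build `numQuads` vertical quads that all pass through the vertical axis at
// `center`, evenly rotated about that axis (3 quads gives 60-degree spacing).
// The quad count of 3 is an illustrative assumption.
std::vector<Quad> makeGrassStar(Vec3 center, float width, float height, int numQuads = 3) {
    assert(numQuads > 0);
    const float pi = 3.14159265358979f;
    std::vector<Quad> quads;
    quads.reserve(static_cast<std::size_t>(numQuads));
    for (int i = 0; i < numQuads; ++i) {
        // Rotating through 180 degrees is enough: each quad spans both sides of the axis.
        const float angle = pi * static_cast<float>(i) / static_cast<float>(numQuads);
        const float dx = 0.5f * width * std::cos(angle);
        const float dz = 0.5f * width * std::sin(angle);
        quads.push_back(Quad{{
            {center.x - dx, center.y,          center.z - dz},
            {center.x + dx, center.y,          center.z + dz},
            {center.x + dx, center.y + height, center.z + dz},
            {center.x - dx, center.y + height, center.z - dz},
        }});
    }
    return quads;
}
```

Each quad would then be drawn with the grass texture; because the quads face different directions, at least one is always seen close to face-on.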
Finally, we want to simulate a wind effect to
make the grass feel more realistic
To simulate the blowing of grass, we shift the
upper vertices of our grass object
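
A minimal sketch of that shift, assuming each quad stores its two bottom vertices first and its two top vertices last (the `Quad` layout and the name `bendQuad` are hypothetical): the bottom edge stays planted while the top edge is displaced horizontally.

```cpp
#include <array>

struct Vec3 { float x, y, z; };

// Assumed vertex order: indices 0 and 1 are the bottom edge, 2 and 3 the top edge.
using Quad = std::array<Vec3, 4>;

// Return a copy of the resting quad with only its top edge displaced by
// (offsetX, offsetZ); the bottom vertices are left where they are.
Quad bendQuad(const Quad& rest, float offsetX, float offsetZ) {
    Quad bent = rest;
    for (int i = 2; i < 4; ++i) {
        bent[i].x += offsetX;
        bent[i].z += offsetZ;
    }
    return bent;
}
```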
I have implemented both a CPU-only and a GPU
version of the program
I use OpenGL/GLUT for rendering
I used the GPU to speed up various tasks
I have been running my project on double/float
CPU: Intel(R) Xeon(R) CPU
GPU: GeForce GTX 580
Timings were taken using CUDA events (cudaEvent_t)
No compiler flags
For the grass texture, I use .ppm files for their
simple r, g, b, r, g, b,… file format
However, ppm files do not support an alpha
channel
Solution: Pick a color that does not appear in
any textures and fill the background with it.
Then parse the texture file, and fill in alpha
values where you used the “transparent”
color
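
A CPU-side sketch of this color-key step, assuming the pixel data has already been read from the .ppm file into a packed r, g, b byte array. The function name and the choice of key color are illustrative only.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand packed RGB pixels to RGBA. Every pixel that matches the key color
// becomes fully transparent (alpha 0); every other pixel is opaque (alpha 255).
std::vector<std::uint8_t> addAlphaFromColorKey(const std::vector<std::uint8_t>& rgb,
                                               std::uint8_t keyR,
                                               std::uint8_t keyG,
                                               std::uint8_t keyB) {
    const std::size_t pixelCount = rgb.size() / 3;
    std::vector<std::uint8_t> rgba(pixelCount * 4);
    for (std::size_t p = 0; p < pixelCount; ++p) {
        const std::uint8_t r = rgb[3 * p + 0];
        const std::uint8_t g = rgb[3 * p + 1];
        const std::uint8_t b = rgb[3 * p + 2];
        const bool isKey = (r == keyR && g == keyG && b == keyB);
        rgba[4 * p + 0] = r;
        rgba[4 * p + 1] = g;
        rgba[4 * p + 2] = b;
        rgba[4 * p + 3] = isKey ? 0 : 255;
    }
    return rgba;
}
```

For example, with magenta (255, 0, 255) as a hypothetical key color, the call would be `addAlphaFromColorKey(pixels, 255, 0, 255)`.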
Each (r, g, b) triple is independent, so
splitting the work among threads is simple
We only read each r, g, b value once, and we
only write each alpha value once
Therefore, I used zero-copy host memory to
eliminate copying the texture from the CPU
to the GPU and back
The experiment was run with varying texture sizes.
The GPU version uses 64 blocks, 64 threads
[Figure: Timings. CPU and GPU time (ms, axis 0 to 900) versus texture dimensions (100 to 500)]
[Figure: GPU - Speedup. Speedup (axis 0 to 1.6) versus texture dimensions (100 to 500)]
Again, the grass objects are independent of one
another, so each thread can create its
own set of grass objects in parallel
Each grass object has an initial position that is
perturbed by some random value, which
gives slightly non-uniform distribution
PROBLEM: Creating a random number on the
GPU
To generate random numbers, I create an array
on the host filled with random numbers
Then I place the array in texture memory
Each block/thread indexes into the texture to
retrieve the desired random numbers
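
The table-lookup idea can be sketched on the CPU as follows. This is only an approximation of the approach: a plain array stands in for texture memory, and the grid layout, the indexing scheme (two entries per object, wrapping around the table), and all names are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

struct Position { float x, z; };

// Host side: fill a table with uniform random values in [-1, 1].
std::vector<float> makeRandomTable(std::size_t count, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> dist(-1.0f, 1.0f);
    std::vector<float> table(count);
    for (float& v : table) v = dist(rng);
    return table;
}

// Place grass object `index` on a regular grid, then perturb it with two
// values looked up from the table. The lookup wraps, so a table smaller than
// the number of objects is reused (assumed scheme, not the project's).
Position placeGrass(std::size_t index, std::size_t gridWidth, float spacing,
                    float jitter, const std::vector<float>& table) {
    assert(!table.empty() && gridWidth > 0);
    const float baseX = static_cast<float>(index % gridWidth) * spacing;
    const float baseZ = static_cast<float>(index / gridWidth) * spacing;
    const float rx = table[(2 * index) % table.size()];
    const float rz = table[(2 * index + 1) % table.size()];
    return Position{baseX + jitter * rx, baseZ + jitter * rz};
}
```

Because each object reads only its own table entries, the same function body could run once per thread without any coordination between threads.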
The experiment was run with varying numbers of grass
objects. The GPU version uses 64 blocks, 64 threads
[Figure: Timings. CPU and GPU time (ms, axis 0 to 100) versus number of grass objects (10000 to 50000)]
[Figure: GPU - Speedup. Speedup (axis 1.35 to 1.85) versus number of grass objects (10000 to 50000)]
Create random wind vectors crossing the
viewing area
Calculate vector to shift the grass objects by
measuring their distance from wind vector
After the wind blows, “spring” back towards
resting position
Include some slight randomness, so that the
grass does not all move at exactly the same
speed and in the same direction
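
One way this plan might look in code is sketched below. It is not the project's implementation: the linear falloff with distance, the damped-spring update, and every name and parameter here are assumptions chosen to illustrate the steps listed above.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x, z; };

// A wind gust modeled as a line across the ground plane.
// `dir` is assumed to be unit length; `radius` is how far the gust reaches.
struct Wind { Vec2 origin; Vec2 dir; float strength; float radius; };

// Perpendicular distance from point p to the line through wind.origin along wind.dir.
float distanceToWind(const Wind& w, Vec2 p) {
    const float rx = p.x - w.origin.x;
    const float rz = p.z - w.origin.z;
    return std::fabs(rx * w.dir.z - rz * w.dir.x);  // |2D cross product| with a unit vector
}

// Push applied to a grass object: full strength on the line, fading
// linearly to zero at `radius` (assumed falloff).
Vec2 windPush(const Wind& w, Vec2 p) {
    assert(w.radius > 0.0f);
    const float falloff = std::fmax(0.0f, 1.0f - distanceToWind(w, p) / w.radius);
    return Vec2{w.dir.x * w.strength * falloff, w.dir.z * w.strength * falloff};
}

// Displacement of the top vertices from their resting position, plus its velocity.
struct Sway { Vec2 offset; Vec2 velocity; };

// One time step of a damped spring that pulls the offset back toward zero.
void stepSway(Sway& s, Vec2 push, float stiffness, float damping, float dt) {
    const float ax = push.x - stiffness * s.offset.x - damping * s.velocity.x;
    const float az = push.z - stiffness * s.offset.z - damping * s.velocity.z;
    s.velocity.x += ax * dt;
    s.velocity.z += az * dt;
    s.offset.x += s.velocity.x * dt;
    s.offset.z += s.velocity.z * dt;
}
```

The slight randomness could be added by scaling `push` or `stiffness` by a per-object random factor, so neighboring grass objects do not move in lockstep.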
Unfortunately, due to time constraints, I was
unable to simulate wind
Instead, my grass objects wave randomly



My “wind” simulation actually runs slower on
the GPU version, due to creating the texture
of random numbers and copying them to the
GPU
For this reason, I did not include any timings
For a more in-depth wind simulation, the
GPU version would likely outperform the
CPU

Strategies to get random numbers on the
GPU:
- Store them in a texture generated by the host
- Implement a simple pseudo-random number
  generator that runs on the GPU (example: linear
  congruential generators)
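
For reference, a linear congruential generator takes only a few lines. The sketch below uses the multiplier and increment published in Numerical Recipes; the `Lcg` struct and its method names are illustrative, and on the GPU each thread would need its own state, seeded differently (for example from its thread index).

```cpp
#include <cstdint>

// Minimal 32-bit linear congruential generator:
//   state = a * state + c  (mod 2^32)
// with a = 1664525 and c = 1013904223 (the Numerical Recipes constants).
struct Lcg {
    std::uint32_t state;

    explicit Lcg(std::uint32_t seed) : state(seed) {}

    // Advance the state; unsigned overflow provides the mod 2^32 wraparound.
    std::uint32_t nextU32() {
        state = 1664525u * state + 1013904223u;
        return state;
    }

    // Uniform float in [0, 1), built from the top 24 bits of the state.
    float nextFloat() {
        return static_cast<float>(nextU32() >> 8) * (1.0f / 16777216.0f);
    }
};
```

An LCG is fast and needs no memory traffic, but its statistical quality is modest, which is usually acceptable for visual jitter like this.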

I got to implement some new (for me) CUDA
features
- Zero-copy host memory: used for updating my
  textures with transparency
- Texture memory: used for storing random
  numbers and passing them to the GPU

Use __host__ __device__ for functions you
want on both the CPU and GPU
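
A common way to keep such shared functions compilable by an ordinary C++ compiler as well is to hide the qualifiers behind a macro. In this sketch the macro name `HOST_DEVICE` and the helper `blend` are hypothetical; `__CUDACC__` is the macro that nvcc defines when compiling CUDA source.

```cpp
// When compiled by nvcc, __CUDACC__ is defined and the qualifiers apply.
// With a plain C++ compiler the macro expands to nothing.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Example shared helper: linear interpolation between a and b.
// The same definition serves the CPU-only build and the GPU kernels.
HOST_DEVICE inline float blend(float a, float b, float t) {
    return a + t * (b - a);
}
```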


I would have liked to use CUDA’s
interoperability features with OpenGL to
have the GPU render the scene without going
through the CPU
I would have liked to explore CUDA streams
for overlapping of GPU calculations and
memory copying

Due to time constraints, I was unable to
implement wind simulation. I would have
liked to use a real simulation formula and
better grass textures to make the scene look
more realistic