Space Partitioning

Computer Graphics
What is Space Partitioning?
• A division of 3D space into distinct regions
• Specifically, a tree data structure is used to
divide up the world in a hierarchical manner
– The tree leaf nodes usually contain pieces
(triangles) of the world
– Each leaf holds onto the polygons that are in its
sub-piece of the world
Why Perform Space Partitioning?
• Speed
– For example, an accurate model of a Boeing-777
contains over 500M triangles
• You do not want to send 500M triangles down the rendering pipeline
– We have seen various mapping techniques to make the
model look good with fewer triangles
– But we can also cut down on the number of triangles by
only sending triangles down the pipeline that are in the
field of view of the camera
• By dividing the world into pieces we can cull/clip
on large scale rather than per triangle as the
rendering pipeline does
Types of Space Partitioning
• The two main types we will explore are:
– Octrees
– Binary Space Partition (BSP) trees
• However, in the process we will also cover:
– Frustum culling
– Bounding Volumes (BVs)
– Potentially Visible Sets (PVS)
– Scene Graphs
Frustum Culling
• In the classic rendering pipeline front-facing
triangles are clipped against the Frustum
– If all 3 vertices lie outside the same Frustum plane, the
triangle is culled (not sent further down the pipeline)
• The problem with this technique is that it is
triangle-based and doesn’t take into account
spatial coherence
– All the triangles for a given object lie in about
the same position in space
Frustum Culling
• Why is spatial coherence important?
– If an object is out of the Frustum then so are all
its triangles
– If we could perform the Frustum check per
object then we could cull all the object’s
triangles at the same time with a single test
• Note that this is a software test, but only 1 is made
rather than 10s/100s/1000s of hardware tests
Frustum Culling
• How do we perform the test in software?
• First we need to find the planes of the
Frustum
– Recall that there are 6 planes (near, far, left,
right, top, bottom)
Frustum Culling
• If we can find the plane equation for each then we
can test to see which side of the plane a point lies
– Recall the plane eqn is: Ax + By + Cz + D = 0
• Where (A, B, C) is the normal of the plane
– If we have a point (x, y, z) and we plug it into this
equation, we get:
• Positive on the normal side of the plane (in front)
• Zero on the plane
• Negative on the non-normal side of the plane (behind)
– A point must be in front of all planes for it to be in the
Frustum
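The point test above can be sketched in a few lines of C++. This is only an illustration: the `Plane` struct, the function names, and the `unitBoxFrustum` helper (a box standing in for a real frustum) are assumptions, not part of the slides.

```cpp
#include <array>

// Plane Ax + By + Cz + D = 0; the normal (a, b, c) is assumed to point into the frustum.
struct Plane { float a, b, c, d; };

// Plug a point into the plane equation: > 0 in front, 0 on the plane, < 0 behind.
inline float planeValue(const Plane& p, float x, float y, float z) {
    return p.a * x + p.b * y + p.c * z + p.d;
}

// A point is in the frustum only if it is in front of all six planes.
inline bool pointInFrustum(const std::array<Plane, 6>& frustum, float x, float y, float z) {
    for (const Plane& p : frustum) {
        if (planeValue(p, x, y, z) <= 0.0f) return false;
    }
    return true;
}

// Hypothetical stand-in for a real frustum: the box [-1, 1]^3 as six inward-facing planes.
inline std::array<Plane, 6> unitBoxFrustum() {
    return {{
        { 1.0f, 0.0f, 0.0f, 1.0f}, {-1.0f, 0.0f, 0.0f, 1.0f},
        { 0.0f, 1.0f, 0.0f, 1.0f}, { 0.0f, -1.0f, 0.0f, 1.0f},
        { 0.0f, 0.0f, 1.0f, 1.0f}, { 0.0f, 0.0f, -1.0f, 1.0f}
    }};
}
```

With this stand-in, the origin passes all six tests while a point at x = 2 fails the x = 1 plane and is rejected immediately.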
Frustum Culling
• So, how do you find the Frustum planes?
– After setting up the Camera, get the modelview matrix
and the projection matrix
• glGetFloatv( GL_MODELVIEW_MATRIX, mod );
• glGetFloatv( GL_PROJECTION_MATRIX, proj );
• Note that you need both because the modelview sets up
the camera position and orientation whereas the
projection sets up the shape of the Frustum
– Then multiply the two matrices together
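• clip = (mod) x (proj), reading the two 16-float arrays as row-major matrices
• In OpenGL's column-vector notation this is the product projection * modelview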
Frustum Culling
• The A, B, C, and D values can then be extracted
from the “clip” matrix
– As an example the right plane is:
• A = clip[ 3] - clip[ 0]
• B = clip[ 7] - clip[ 4]
• C = clip[11] - clip[ 8]
• D = clip[15] - clip[12]
• These plane values are then normalized so
+/- distances from the plane are correct
– Each component (A-D) is divided by sqrt(A^2 + B^2 + C^2)
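A minimal sketch of the right-plane extraction and normalization, using the same array indices as the slide. The `Plane` struct and the function name `extractRightPlane` are assumptions made for illustration; the `clip` array is taken to be the 16-float combined matrix described above.

```cpp
#include <cmath>

struct Plane { float a, b, c, d; };

// Right frustum plane from the 16-float clip matrix, then normalized so that
// plugging a point into the plane equation yields a true signed distance.
inline Plane extractRightPlane(const float clip[16]) {
    Plane p{ clip[3]  - clip[0],
             clip[7]  - clip[4],
             clip[11] - clip[8],
             clip[15] - clip[12] };
    const float len = std::sqrt(p.a * p.a + p.b * p.b + p.c * p.c);
    p.a /= len;  p.b /= len;  p.c /= len;  p.d /= len;
    return p;
}
```

As a sanity check, a clip matrix that only scales x by 0.5 should give the plane x = 2, that is (A, B, C, D) = (-1, 0, 0, 2) after normalization, with the normal pointing back toward the inside of the volume.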
Frustum Culling
• This sounds like a lot of work
– Not coding, but execution time “work”
• But keep a couple of things in mind:
– The Frustum plane equations only need to be
recalculated whenever the camera moves
– We are going to test an entire object rather than
a single point
• In particular, we will test sphere and cube objects
Frustum Culling
• Why just spheres and cubes?
– They make nice Bounding Volumes (BVs)
• A Bounding Volume is a simplified shape that
encloses an entire complex object
– If the Frustum test against the BV fails then the entire
complex object in the BV is not in the Frustum
• The advantage is that tests against simple shapes
are computationally much cheaper to perform than
tests against complex shapes
Frustum Culling
• Sphere testing
– A sphere is defined by a center point and a radius
– Part of the sphere is still in front of the plane (and thus,
it can’t be culled) even if the center point is up to radius
units on the back side of the plane
– Recall that, for a normalized plane, Ax + By + Cz + D is
the signed distance from the point (x, y, z) to the plane
– Thus, we keep the sphere if the sphere center plugged
into this equation produces values > -radius for all 6
plane equations
• The point test is the same, but with > 0
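The sphere test can be sketched as follows, assuming six normalized planes whose normals point into the frustum. The `Plane` struct and the name `sphereInFrustum` are illustrative, not taken from the slides.

```cpp
#include <array>

// Normalized plane; the normal (a, b, c) is assumed to point into the frustum.
struct Plane { float a, b, c, d; };

// Keep the sphere if its center is less than one radius behind every plane.
inline bool sphereInFrustum(const std::array<Plane, 6>& frustum,
                            float cx, float cy, float cz, float radius) {
    for (const Plane& p : frustum) {
        const float dist = p.a * cx + p.b * cy + p.c * cz + p.d;  // signed distance
        if (dist <= -radius) return false;  // entirely behind this plane: cull
    }
    return true;
}
```

Passing a radius of 0 reduces this to the point test.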
Frustum Culling
• Cube test
– A cube consists of 8 vertices
• Usually passed in as a center location and a size
– We could test to see if any of the 8 cube
vertices are within the Frustum (i.e. test each
vertex against all 6 planes)
• However, this leads to problems when the cube is
bigger than the Frustum – all 8 corners are outside
the Frustum, but the cube contents should be drawn
– False negatives – cube culled when it shouldn’t be
Frustum Culling
– Instead we reverse the order and test
all 8 points against a single plane
• If all are to the back side of the plane
then the cube is not visible and we can
cull it
• If any are in front, then we move onto
the next plane
• If it passes all 6 planes then we keep it
– However, this can lead to false
positives (see figure)
• We let it through and simply let the graphics card
cull it on a triangle-by-triangle basis
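A sketch of the plane-at-a-time cube test described above. It assumes the cube is given by its center and half its edge length (the slides only say "a center location and a size"), and that plane normals point into the frustum; all names are illustrative.

```cpp
#include <array>

struct Plane { float a, b, c, d; };  // normal (a, b, c) assumed to point into the frustum

// Test all 8 corners against one plane at a time. The cube is culled only if
// every corner is behind the same plane; otherwise it is kept, which may let
// through the occasional false positive.
inline bool cubeInFrustum(const std::array<Plane, 6>& frustum,
                          float cx, float cy, float cz, float half) {
    for (const Plane& p : frustum) {
        bool anyInFront = false;
        for (int i = 0; i < 8 && !anyInFront; ++i) {
            // Bits 0..2 of i select the -/+ offset along x, y, z.
            const float x = cx + ((i & 1) ? half : -half);
            const float y = cy + ((i & 2) ? half : -half);
            const float z = cz + ((i & 4) ? half : -half);
            anyInFront = (p.a * x + p.b * y + p.c * z + p.d) > 0.0f;
        }
        if (!anyInFront) return false;  // all 8 corners behind this plane
    }
    return true;
}
```

A cube much larger than the frustum is kept by this version even though none of its corners is inside, which is the false-negative case the corner-in-frustum approach gets wrong.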
Frustum Culling
• So what sort of speed up do you get?
– It depends on several factors:
• How big the Frustum is (especially the far plane)
• How many objects are in the scene and how they are
distributed
• How complex each particular object is
– Check out gametutorials Frustum culling
tutorial
Octrees
• Octrees are a space partitioning data structure
– We saw Octrees before in the context of modeling
• In modeling the leaf nodes were either filled or empty
• In space partitioning the leaf nodes contain a list of triangles that are
in that sub-section of space
• The process:
– A single cube is placed around the entire world
– If the cube contains too many triangles, then the cube is subdivided into 8 smaller axis-aligned cubes
– Recurse until we hit the stopping condition
Octrees
• When do you stop sub-dividing?
– When the cube contains fewer than a specified
number of triangles
– When the sub-division depth has reached a
maximum
– When the total number of cubes in the octree
has reached a maximum number
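The subdivision process with the first two stopping conditions might look like the sketch below. To stay short it makes a simplifying assumption: each triangle is represented by a single point (for example its centroid), so the boundary-crossing question does not arise. The names `OctreeNode`, `buildOctree`, `maxItems`, and `maxDepth` are hypothetical, and the third criterion (a cap on the total number of cubes) is left out.

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Vec3 { float x, y, z; };

struct OctreeNode {
    Vec3 center{};
    float half = 0.0f;                                    // half the cube's edge length
    std::vector<Vec3> items;                              // filled only for leaves
    std::array<std::unique_ptr<OctreeNode>, 8> children;  // all null for a leaf
    bool isLeaf() const { return !children[0]; }
};

// Subdivide while a cube holds more than maxItems and depth is below maxDepth.
inline std::unique_ptr<OctreeNode> buildOctree(Vec3 center, float half, std::vector<Vec3> items,
                                               std::size_t maxItems, int maxDepth, int depth = 0) {
    auto node = std::make_unique<OctreeNode>();
    node->center = center;
    node->half = half;
    if (items.size() <= maxItems || depth >= maxDepth) {  // stopping conditions
        node->items = std::move(items);
        return node;
    }
    // Sort each item into one of the 8 axis-aligned octants around the center.
    std::array<std::vector<Vec3>, 8> buckets;
    for (const Vec3& p : items) {
        const int i = (p.x >= center.x ? 1 : 0) | (p.y >= center.y ? 2 : 0) | (p.z >= center.z ? 4 : 0);
        buckets[i].push_back(p);
    }
    const float h = half * 0.5f;
    for (int i = 0; i < 8; ++i) {
        const Vec3 c{ center.x + ((i & 1) ? h : -h),
                      center.y + ((i & 2) ? h : -h),
                      center.z + ((i & 4) ? h : -h) };
        node->children[i] = buildOctree(c, h, std::move(buckets[i]), maxItems, maxDepth, depth + 1);
    }
    return node;
}

// Total number of items stored in the leaves below (and including) a node.
inline std::size_t countItems(const OctreeNode& n) {
    if (n.isLeaf()) return n.items.size();
    std::size_t total = 0;
    for (const auto& c : n.children) total += countItems(*c);
    return total;
}
```

The depth limit matters even with a triangle-count limit: many items clustered at nearly the same position would otherwise force subdivision to continue indefinitely.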
Octrees
• What happens to triangles that cross a cube
boundary?
– Place the triangle in the smallest box that fully contains it
• Not efficient in that a small triangle can be placed in a large
bounding box if it lies directly on a boundary
• Implies that not all triangles are stored in the leaf nodes
– Split the triangle
• Increases the number of primitives in the scene
– Place the triangle in both sub-divisions
• As the triangles from each sub-division are sent to the renderer
we need to mark them so we don’t draw them twice
• Makes Octree editing, such as deletion, more difficult
Octrees
• So how do Octrees help us?
– We can combine Octrees with Frustum culling
• The Octree cubes are Bounding Volumes for the triangles in
their region of space
• We only need to send the triangles contained in the cubes that
pass the Frustum test to the rendering pipeline
Octrees
• This leads to a process called Hierarchical
Frustum Culling
– Starting with the root node of the Octree, test
the Octree cube against the Frustum
• If not in frustum, then prune children
• If in frustum, then
– Send any contained triangles to renderer
– Recurse on each of the children
– Demo Octree2 and Octree3 from gametutorials
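The traversal above can be sketched as a short recursive function. The node layout here (a bounding cube, triangle ids, and a vector of children) is an assumption chosen for brevity, as are all the names; "sending to the renderer" is modeled by appending triangle ids to an output list.

```cpp
#include <array>
#include <vector>

struct Plane { float a, b, c, d; };  // normal (a, b, c) assumed to point into the frustum

// Hypothetical node: bounding cube, ids of triangles stored here, child nodes.
struct OctreeNode {
    float cx, cy, cz, half;
    std::vector<int> triangles;
    std::vector<OctreeNode> children;  // empty for a leaf
};

// Conservative cube test: reject only if all 8 corners are behind one plane.
inline bool cubeInFrustum(const std::array<Plane, 6>& frustum, const OctreeNode& n) {
    for (const Plane& p : frustum) {
        int behind = 0;
        for (int i = 0; i < 8; ++i) {
            const float x = n.cx + ((i & 1) ? n.half : -n.half);
            const float y = n.cy + ((i & 2) ? n.half : -n.half);
            const float z = n.cz + ((i & 4) ? n.half : -n.half);
            if (p.a * x + p.b * y + p.c * z + p.d <= 0.0f) ++behind;
        }
        if (behind == 8) return false;
    }
    return true;
}

// Hierarchical frustum culling: a failed test prunes the node and all its children.
inline void collectVisible(const OctreeNode& node, const std::array<Plane, 6>& frustum,
                           std::vector<int>& out) {
    if (!cubeInFrustum(frustum, node)) return;
    out.insert(out.end(), node.triangles.begin(), node.triangles.end());
    for (const OctreeNode& child : node.children) collectVisible(child, frustum, out);
}
```

The saving comes from the early return: one failed cube test near the root skips every triangle stored anywhere beneath it.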
BSP Trees
• The idea is much the same as with Octrees
• However, the space is repeatedly split into 2
parts (hence the name “binary space
partition”) by a given splitting plane
– Octrees always use cubes
• There are two variants
– Axis-Aligned
– Polygon Aligned
Axis-Aligned BSP Trees
• Starting with a bounding box around the entire world, split
it into 2 parts with an xy, xz, or yz splitting plane
• Divide all the triangles between the two half-spaces
• Triangles that intersect the splitting plane can be dealt with
the same way we did with Octrees
• Recursively continue the
subdivision process until a
stopping criterion is reached
– Note that each successive
subdivision only splits its own
half-space
Axis-Aligned BSP Trees
• The interesting part is how to decide what
axis and where along that axis to split
• Suggestions on how?
– Hint: how does all this relate to what you
learned in cs255 about merge sort vs. quick sort
Polygon-Aligned BSP Trees
• In this version, a polygon is chosen as the divider
polygon
• This polygon is part of a plane and it is this plane
that is used as the splitting plane
Polygon-Aligned BSP Trees
• Just as with Axis-Aligned BSP trees the most important
choice is the selection of the polygon divider
• There are two main issues to consider:
– How balanced is the tree
– How many extra triangles did we have to make because of triangle
splitting
• It is a debatable question as to which is better:
– A balanced tree with lots of extra triangles
– An unbalanced tree with fewer extra triangles
• However most people pick the unbalanced tree with fewer
splits of triangles
Polygon-Aligned BSP Trees
• Generating an optimal BSP tree requires trying
every possible splitting combination
– The problem is that this is an O(n!) algorithm
• Recall that 20! = 2,432,902,008,176,640,000
• The next best option is to pick the best splitter at
each level
– Divide-and-Conquer with a Greedy selection of the
splitter at each level
– This is an O(n^2) algorithm
– This is the option that most games use
• However, it is still pre-computed and saved in a file
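One way to sketch the greedy choice at a single level: try every triangle as the splitter, classify all the others against its plane, and keep the candidate with the best score. The scoring formula below (imbalance plus a weighted count of split triangles) and the names `pickSplitter` and `splitWeight` are assumptions for illustration; the slides do not prescribe a particular score.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

struct Vec3 { float x, y, z; };
struct Tri { Vec3 v[3]; };
struct Plane { float a, b, c, d; };

// Plane through a triangle; the normal is the cross product of two edges.
inline Plane planeOf(const Tri& t) {
    const Vec3 u{t.v[1].x - t.v[0].x, t.v[1].y - t.v[0].y, t.v[1].z - t.v[0].z};
    const Vec3 w{t.v[2].x - t.v[0].x, t.v[2].y - t.v[0].y, t.v[2].z - t.v[0].z};
    Plane p{u.y * w.z - u.z * w.y, u.z * w.x - u.x * w.z, u.x * w.y - u.y * w.x, 0.0f};
    p.d = -(p.a * t.v[0].x + p.b * t.v[0].y + p.c * t.v[0].z);
    return p;
}

enum class Side { Front, Back, On, Spanning };

inline Side classify(const Plane& p, const Tri& t, float eps = 1e-5f) {
    int front = 0, back = 0;
    for (const Vec3& v : t.v) {
        const float d = p.a * v.x + p.b * v.y + p.c * v.z + p.d;
        if (d > eps) ++front; else if (d < -eps) ++back;
    }
    if (front && back) return Side::Spanning;  // would have to be split
    if (front) return Side::Front;
    if (back) return Side::Back;
    return Side::On;
}

// Greedy selection: n candidates, each checked against n triangles.
// Lower score is better: |front - back| measures imbalance, and each spanning
// triangle is penalized by splitWeight (an assumed tuning knob).
inline std::size_t pickSplitter(const std::vector<Tri>& tris, long splitWeight = 4) {
    std::size_t best = 0;
    long bestScore = -1;
    for (std::size_t i = 0; i < tris.size(); ++i) {
        const Plane p = planeOf(tris[i]);
        long front = 0, back = 0, spanning = 0;
        for (std::size_t j = 0; j < tris.size(); ++j) {
            if (j == i) continue;
            switch (classify(p, tris[j])) {
                case Side::Front:    ++front;    break;
                case Side::Back:     ++back;     break;
                case Side::Spanning: ++spanning; break;
                case Side::On:       break;
            }
        }
        const long score = std::labs(front - back) + splitWeight * spanning;
        if (bestScore < 0 || score < bestScore) { bestScore = score; best = i; }
    }
    return best;
}
```

Raising `splitWeight` favors fewer split triangles over balance, which matches the preference most people have according to the slide.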
BSP Trees
• Hierarchical Frustum Culling can be used fairly
easily with Axis-Aligned BSP trees
– Very similar to how it was used for Octrees since each
region is rectangular
• However, for Polygon-Aligned BSP trees testing
is more complex
– The region (leaf or interior) in question is convex, so
one could test all the corners of the region the same
way the 8 corners are tested for rectangular regions
– Or a bounding box of the region can be tested
• Quicker test, but more false positives (passed frustum test
when not actually in the frustum)
BSP Trees
• However, Polygon-Aligned BSP trees have
their advantage in visibility
– Allow us to easily perform a visibility ordering
– Allow us to align splitting planes along walls
which may not be axis aligned
BSP Tree Visibility
• In complex indoor scenes, not only do we have lots of
geometry that is not in the camera’s frustum but we also
have lots of geometry that is hidden by walls
– That is, the frustum might extend through many walls
• Frustum culling would simply select all nodes it intersected and send
them to the renderer where a Z-buffer technique is used to handle the
occlusion in the normal way
• With Polygon-Aligned BSP Trees we can determine
visibility ordering w.r.t. a given camera position
– Works in linear time
– Works for any given camera position
BSP Tree Visibility
• In order to determine the visibility ordering
– First, build the BSP tree for the scene
• Usually done off-line
• For a static scene
– Dynamic scene objects are usually handled separately
– Second, insert a viewpoint into the scene
– Third, perform an in-order traversal of the tree to
determine the polygon order
• Recall in-order traversals mean process child, process node,
process other child
BSP Tree Visibility
• What order should we process the children in?
• To obtain a back-to-front ordering
– Process the far-side child first
• Relative to the location of the camera
• To obtain a front-to-back ordering
– Process the near-side child first
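A minimal sketch of the back-to-front traversal, assuming each node stores its splitting plane, one polygon id, and front/back subtrees (the `BspNode` layout and function name are illustrative). The camera position is plugged into each node's plane equation to decide which child is the far side.

```cpp
#include <memory>
#include <vector>

struct Plane { float a, b, c, d; };

struct BspNode {
    Plane plane;                     // plane of the divider polygon
    int polygon;                     // id of the polygon stored at this node
    std::unique_ptr<BspNode> front;  // subtree on the normal side of the plane
    std::unique_ptr<BspNode> back;   // subtree on the other side
};

// In-order traversal: far-side child, then this node, then near-side child.
inline void backToFront(const BspNode* n, float ex, float ey, float ez, std::vector<int>& out) {
    if (!n) return;
    const bool eyeInFront =
        n->plane.a * ex + n->plane.b * ey + n->plane.c * ez + n->plane.d >= 0.0f;
    const BspNode* nearSide = eyeInFront ? n->front.get() : n->back.get();
    const BspNode* farSide  = eyeInFront ? n->back.get()  : n->front.get();
    backToFront(farSide, ex, ey, ez, out);
    out.push_back(n->polygon);
    backToFront(nearSide, ex, ey, ez, out);
}
```

Swapping the two recursive calls gives the front-to-back ordering. Each node is visited exactly once regardless of where the camera is.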
BSP Tree Visibility
• Back-to-front: C1, B, D, A, C2, E
• Front-to-back: E, C2, A, D, B, C1
BSP Tree Visibility
• Back-to-front ordering leads to a version of the
“painter’s algorithm”
• This doesn’t suffer from the intersecting problem
seen previously in the painter’s algorithm
– Any polygons that would have intersected have now
been split into multiple pieces
• However, the painter’s algorithm is still really slow
– Pixels will be drawn only to be overdrawn by closer triangles
• Often 10x overdraw (Michael Abrash, of Quake fame)
– Overdraw amount varies so framerate is not stable
– Overdraw problems increase with scene complexity
BSP Tree Visibility
• Front-to-back has no overdraw problems
– Each pixel is drawn exactly once
• But we need to find a way to not overwrite a pixel
once it has been drawn
– Could use the normal Z-buffer technique
• However, this is overkill since we will never need to replace a
value in the Z-buffer with a newer one
– Could use a Stencil buffer technique
• Stencil filters pixels that have already been drawn
• As new pixels are drawn, the stencil buffer is updated
• This technique is usually faster than back-to-front
Stencil Buffer
• A Stencil Buffer is another buffer like the
color (frame) buffer and the Z-buffer
– On most hardware this is an 8 bit buffer
• The Stencil Test is a test that occurs
immediately before the Z-buffer test
– If the Stencil Test fails, the fragment is
discarded rather than passed on to the Z-buffer
• Fragment won’t show up in the color buffer either
Stencil Buffer
• The Stencil Test is a test of the value currently
stored in the Stencil Buffer
• The Stencil Function controls the type of Stencil
Test to perform:
GL_NEVER
GL_ALWAYS
GL_LESS
GL_EQUAL
GL_LEQUAL
GL_NOTEQUAL
GL_GREATER
GL_GEQUAL
Stencil Buffer
• There are 3 possible outcomes of the
combined Stencil and Z-buffer tests:
– Stencil Test fails
– Stencil Test passes, Depth Test fails
– Stencil Test passes, Depth Test passes
• The Stencil Operation lets you specify what
happens to the Stencil Buffer values in each
of these cases
Stencil Buffer
• The possible Stencil Operations are:
– GL_KEEP  old value kept
– GL_ZERO  value replaced by zero
– GL_REPLACE  value replaced by given reference
value (allows setting to a specific value)
– GL_INCR  value incremented by 1
– GL_DECR  value decremented by 1
– GL_INVERT  value bitwise inverted
Stencil Buffer
• The Stencil test is only performed if
– The Stencil buffer is enabled
– The underlying hardware/drivers support it
• The stencil buffer is usually used as a mask
– Example: creating a non-rectangular windshield
for a first-person driving game
• Store 1s in the stencil buffer pixels where you wish
the windshield to be and then set the stencil test to
render if not equal to 0
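The stencil functions and operations can be made concrete with a small software model of one 8-bit stencil value. This is not OpenGL code: the enums and functions below are hypothetical stand-ins that mirror the GL names, and the bit mask parameter of the real stencil function is left out. As in OpenGL, the comparison is "reference FUNC stored value", and the increment and decrement operations clamp at 255 and 0.

```cpp
#include <cstdint>

enum class StencilFunc { Never, Always, Less, Equal, Lequal, NotEqual, Greater, Gequal };
enum class StencilOp { Keep, Zero, Replace, Incr, Decr, Invert };

// True when the fragment passes: the reference value is compared to the stored value.
inline bool stencilTest(StencilFunc f, std::uint8_t ref, std::uint8_t stored) {
    switch (f) {
        case StencilFunc::Never:    return false;
        case StencilFunc::Always:   return true;
        case StencilFunc::Less:     return ref <  stored;
        case StencilFunc::Equal:    return ref == stored;
        case StencilFunc::Lequal:   return ref <= stored;
        case StencilFunc::NotEqual: return ref != stored;
        case StencilFunc::Greater:  return ref >  stored;
        case StencilFunc::Gequal:   return ref >= stored;
    }
    return false;
}

// New stencil value after applying one operation to the stored value.
inline std::uint8_t applyOp(StencilOp op, std::uint8_t ref, std::uint8_t stored) {
    switch (op) {
        case StencilOp::Keep:    return stored;
        case StencilOp::Zero:    return 0;
        case StencilOp::Replace: return ref;
        case StencilOp::Incr:    return stored == 255 ? stored : static_cast<std::uint8_t>(stored + 1);
        case StencilOp::Decr:    return stored == 0 ? stored : static_cast<std::uint8_t>(stored - 1);
        case StencilOp::Invert:  return static_cast<std::uint8_t>(~stored);
    }
    return stored;
}
```

In the windshield example, pixels inside the windshield hold 1, so a NotEqual test with reference 0 passes there and fails everywhere else. For front-to-back rendering, an Equal test with reference 0 plus a Replace-with-1 on pass would let each pixel be drawn only once.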
BSP Tree Visibility
• Note that the visibility ordering is not directly
related to either:
– Viewing direction
• Both A and B appear in the visibility ordering even though
camera is facing the other direction
– Distance of the polygons from the viewpoint
• In back-to-front ordering, B is before A in both figures even
though A is sometimes farther than B from the camera
BSP Tree Visibility
• The previous slides have assumed that the BSP
tree will continue dividing until every polygon is
used as a splitting plane
• However, in indoor scenes we often want to use
BSP trees aligned with the walls to provide
visibility on a room level and then let the normal
Z-buffer algorithm handle objects within the room
– That is, a hybrid visibility scheme
Portals
• World is divided up into Cells and Portals
• Cells are convex regions
– Could be BSP tree leaves
• Portals are doorways/windows between the
Cells
Portals
• Basic algorithm:
– Start in the cell that contains the camera
• Equating Cells to BSP tree leaves helps here
– Perform a frustum check against all objects in
the current cell to determine what is visible
– If any portals are visible (in frustum), add the
connecting cell to the list of cells to process
– Continue until all visible cells have been
rendered
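The basic algorithm is a graph traversal over cells connected by portals. The sketch below abstracts the geometry away: each portal carries an `inFrustum` flag that stands in for the result of the frustum check, and "rendering" a cell is modeled by recording its index. The `Cell`/`Portal` layout and the function name are assumptions.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// inFrustum stands in for the outcome of testing the portal against the frustum.
struct Portal { std::size_t targetCell; bool inFrustum; };
struct Cell { std::vector<Portal> portals; };

// Returns the cells to render, starting from the cell that contains the camera.
inline std::vector<std::size_t> visibleCells(const std::vector<Cell>& cells, std::size_t cameraCell) {
    std::vector<bool> queued(cells.size(), false);  // prevents revisiting a cell
    std::vector<std::size_t> order;
    std::queue<std::size_t> pending;
    pending.push(cameraCell);
    queued[cameraCell] = true;
    while (!pending.empty()) {
        const std::size_t c = pending.front();
        pending.pop();
        order.push_back(c);  // a real renderer would frustum-check and draw the cell's objects here
        for (const Portal& p : cells[c].portals) {
            if (p.inFrustum && !queued[p.targetCell]) {
                queued[p.targetCell] = true;
                pending.push(p.targetCell);
            }
        }
    }
    return order;
}
```

The `queued` flags matter because portals are two-way: without them, two cells that see each other through the same doorway would be processed forever.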
Portals
• An optimization of the previous algorithm
is frustum reduction
• When a portal is visible, it means the cell it leads to
is visible, but only partially, because the camera can
only see this cell through the portal
• Thus, the frustum can be reduced in the new cell
– Cuts down the number of objects in the cell that are sent to
the renderer
– And cuts down on the number of “visible” portals seen
which reduces the amount of recursion
Portals
– The reduced frustum is created by planes from the
camera through the portal edges
• For portals with vertical walls, it turns into a 2D issue of
casting rays through points
PVS
• Techniques that were tried and rejected for large-scale culling in Quake
– Pure Z-buffering
– Painter’s algorithm
– Beam tree
– Subdividing raycasting
– Edge or Span sorting
– Portals
– Direct visibility extraction from BSP
• Instead Quake went with PVS
PVS
• Potentially Visible Sets (PVS) involve precomputing visibility information from the BSP tree
– That is, first divide up the world using a BSP tree, storing
all the polygon faces in the leaf nodes
– Next, perform off-line visibility checks to determine
which leaf nodes are visible from which other leaf nodes,
creating a 2D visibility matrix
• Visible from anywhere in the source leaf node
– The 2D visibility matrix information, called a Potentially
Visible Set (PVS) is then used to cull large parts of the
BSP tree away when rendering
PVS
• The PVS is used by first descending the BSP tree
with the camera coordinates to determine in which
leaf node the camera resides
• Then the PVS is consulted to determine other leaf
nodes that may also need to be rendered
• This can then be combined with frustum culling to
test the bounding boxes of these leaf nodes with
the frustum
• And finally, the polygons from the surviving leaf
nodes are sent to the renderer
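The lookup step can be sketched with the visibility matrix stored as a flat bit array, one row per source leaf. The `Pvs` class and `leavesToRender` are hypothetical names; finding the camera's leaf by descending the BSP tree and the follow-up frustum test on each leaf's bounding box are assumed to happen outside this snippet.

```cpp
#include <cstddef>
#include <vector>

// 2D visibility matrix: bit (from, to) is set when leaf `to` may be visible
// from somewhere inside leaf `from`. Filled in off-line.
class Pvs {
public:
    explicit Pvs(std::size_t leafCount) : n_(leafCount), bits_(leafCount * leafCount, false) {}
    void setVisible(std::size_t from, std::size_t to) { bits_[from * n_ + to] = true; }
    bool isVisible(std::size_t from, std::size_t to) const { return bits_[from * n_ + to]; }
    std::size_t leafCount() const { return n_; }
private:
    std::size_t n_;
    std::vector<bool> bits_;
};

// Leaves worth considering for a camera located in cameraLeaf.
inline std::vector<std::size_t> leavesToRender(const Pvs& pvs, std::size_t cameraLeaf) {
    std::vector<std::size_t> out;
    for (std::size_t leaf = 0; leaf < pvs.leafCount(); ++leaf) {
        if (leaf == cameraLeaf || pvs.isVisible(cameraLeaf, leaf)) out.push_back(leaf);
    }
    return out;
}
```

At run time the cost is one row scan per frame, which is why the expensive part of a PVS is entirely in the pre-computation.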
PVS
• Creating the PVS is a difficult task
– Very time-consuming
– Various ways to create it, but often portals are
used to create view-independent visibility maps
• Note that the previously discussed portal
information was view-dependent (known camera
position/orientation)
• If we are going to pre-compute visibility, we need
view-independent information
PVS
• The entire cell of the potential camera (blue) is used
as the starting cell
• Lines (of sight) are constructed between portal edges
to determine which cells are visible
– There are more examples
on p.298-301 of Games
book
PVS
• The PVS can be built from these portal-constructed
visibility maps by using every
leaf node as a potential starting position and
recording all the other leaf nodes that are
visible from the starting leaf
Collision Detection
• When objects are moving we need to
determine if, when, and where they collide
• Each question gets harder to solve (both
computationally and theoretically)
– To answer when one also needs to know if
– To answer where one also needs to know when
• There are methods that solve all at once, but you
can’t solve each successive one without the prior
Collision Detection
• The brute force solution is to test every
triangle of each object against every triangle
of every other object
• The problem: O(n^2 m^2)
• n is the number of objects
• m is the number of triangles per object
– So it becomes infeasible as scene complexity or
object complexity increases
Collision Detection
• The solution is two-fold:
– Space Partitioning
• Reduces the n^2 term
– Bounding Volumes
• Reduces the m^2 term
Collision Detection
• We have already seen several Space
Partitioning schemes
– Octrees
– BSP Trees
• An object can only collide with another
object that is in the same part of space
– This is sometimes called a “collision group”
Collision Detection
• The problem with using space partitioning schemes is that
one needs to keep the partitions updated as objects move
through space
– It certainly can be done, it just takes time
• Loose Octrees hold an advantage here because of their fast
insert/delete times
• Often, spatial coherence can be used to quickly determine
the object’s new node
• For collision detection of a moving object with only static
objects (architectural) the problem is easier since one can
avoid updating the tree
– Keep only the static objects in the tree and dynamically determine
the moving object’s node
Collision Detection
• Bounding volumes simplify objects
– An object containing 10K triangles can be represented
with a single sphere
– Makes collision tests much simpler
• However, false positives can occur
– The bounding volumes collide but the objects they
contain do not
• The standard solution is to use a two-phase
collision detection scheme
Collision Detection
• The two-phase scheme:
– Broad phase:
• Bounding volumes are used to cull away pairs of moving
objects that cannot possibly collide
• Leaves some false positives
– Narrow phase:
• An exact collision detection method on the pairs that survive
the culling process
• The particular choice of algorithms for each phase
can be made independently
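A broad phase using bounding spheres might be sketched as below. The `Sphere` struct and `broadPhase` name are illustrative, and the loop visits every pair for simplicity; in practice a space partition would limit which pairs are even considered. The returned pairs are candidates only and still need an exact narrow-phase test.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

struct Sphere { float x, y, z, r; };

// Sphere/sphere overlap, comparing squared distances to avoid a square root.
inline bool spheresOverlap(const Sphere& a, const Sphere& b) {
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    const float rsum = a.r + b.r;
    return dx * dx + dy * dy + dz * dz <= rsum * rsum;
}

// Broad phase: keep only the pairs whose bounding spheres overlap.
inline std::vector<std::pair<std::size_t, std::size_t>>
broadPhase(const std::vector<Sphere>& bounds) {
    std::vector<std::pair<std::size_t, std::size_t>> candidates;
    for (std::size_t i = 0; i < bounds.size(); ++i) {
        for (std::size_t j = i + 1; j < bounds.size(); ++j) {
            if (spheresOverlap(bounds[i], bounds[j])) candidates.emplace_back(i, j);
        }
    }
    return candidates;
}
```

With three objects where only the first two are close, the narrow phase runs on one pair instead of three.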
Collision Detection
• Common Bounding Volumes:
– Sphere
– Axis-Aligned Bounding Box (AABB)
– Oriented Bounding Box (OBB)
– Discrete Orientation Polytope (k-DOP)
• Pairs of parallel planes
Collision Detection
• In deciding which is best, there is a tradeoff
between bounding efficiency and intersection test
speed
• Bounding efficiency is how closely the volume
bounds the actual object
– Higher bounding efficiency reduces false positives
• Intersection test speed is controlled by the
complexity of the shape
– Sphere/Sphere intersection is much faster than
k-DOP/k-DOP intersection
Collision Detection
• One also needs to worry about how to
update these volumes for moving objects
– Obviously there is the issue of translation
which is easily handled by translating the BV
– However, the problems can occur when the
objects can rotate
• This is a major problem for AABBs which can’t be
rotated and need to be recalculated
• But not an issue at all for Spheres
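To illustrate why rotation forces an AABB to be recalculated, the sketch below rotates an object's vertices about the z axis and recomputes the box from scratch. The types and function names are assumptions, and recomputing from all vertices is only the simplest of several possible update strategies.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };
struct Aabb { Vec3 lo, hi; };  // minimum and maximum corners

// Smallest axis-aligned box around the points (pts must not be empty).
inline Aabb computeAabb(const std::vector<Vec3>& pts) {
    Aabb box{pts.front(), pts.front()};
    for (const Vec3& p : pts) {
        box.lo.x = std::min(box.lo.x, p.x);  box.hi.x = std::max(box.hi.x, p.x);
        box.lo.y = std::min(box.lo.y, p.y);  box.hi.y = std::max(box.hi.y, p.y);
        box.lo.z = std::min(box.lo.z, p.z);  box.hi.z = std::max(box.hi.z, p.z);
    }
    return box;
}

// Rotate every point about the z axis by `angle` radians.
inline std::vector<Vec3> rotateZ(const std::vector<Vec3>& pts, float angle) {
    const float c = std::cos(angle), s = std::sin(angle);
    std::vector<Vec3> out;
    out.reserve(pts.size());
    for (const Vec3& p : pts) out.push_back(Vec3{c * p.x - s * p.y, s * p.x + c * p.y, p.z});
    return out;
}
```

A thin stick lying along x has a box that is long in x and flat in y; after a 90 degree turn the extents swap, so the old box no longer bounds the object. A bounding sphere centered on the rotation point would be unchanged.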
Collision Detection
• Hierarchies of bounding volumes can be
used instead of a single BV
• Collision detection starts at the top of the
hierarchy and proceeds downward to the
leaf nodes
• The goal is to reduce the number of
triangles involved in the narrow phase
Collision Detection
• There are also many other BVs in use
– Capsules, Lozenges, Cylinders, Ellipsoids, etc.
• It is somewhat surprising that some simple
shapes such as cylinders should be avoided
– “intersection testing against a moving cylinder
is extremely complicated and somewhat
expensive process”
Collision Detection
• When implementing collision detection, one needs
to know the potential shapes involved
• There are sets of (sometimes complex) math
equations for intersecting almost any shape against
another
– Eberly’s 3D Game Engine Design is full of them
• You probably want to keep the shapes in your
system to a minimum so you don’t end up with an
explosion of testing code
Collision Detection
• Some rules of thumb when implementing
intersection routines (from Real-Time Rendering):
– Perform computations that might trivially reject or
accept various types of intersections early
– Exploit results from previous tests
– Postpone expensive calculations (esp. trig functions)
until they are truly needed
– If a single object is being tested against many other
objects look for precalculations that can be done once
– Watch out for floating-point precision problems
Collision Detection
• There are also difficulties associated with
temporal sampling:
– If the velocity of an object is high with respect
to the sampling rate for collision detection then
the objects can pass through each other without
being detected
– In general, a collision will not be detected until
after the objects have penetrated each other
Collision Detection
• To help address the problem of objects passing
through each other undetected one can use
adaptive sampling rates
– As objects get closer to each other sample more
frequently
– How do we determine if objects are getting closer?
• Most intersection tests can be made to return more than a
Boolean, such as a distance
• Use a larger bounding volume to trigger the need to sample
more often
Collision Detection
• What do we do when detection occurs?
– This is often called the “collision response”
– Recall that most detection occurs after penetration
• Some systems simply need to back the objects away from
each other in the direction opposite their direction of travel
– Used to prevent objects from going through walls, etc.
– This happens before the scene is rendered so the collision is never
seen by the user:
• Scene rendered
• Objects moved
• Collision detection occurs
• Collision response activated
• Scene rendered
• …
Collision Detection
• Other systems that perform more realistic dynamic
simulations need to bounce the objects off of each other in
a realistic way
• They need to know the exact time of collision
• To achieve this we need to back time up to the point of
impact
– Most algorithms produce an estimate of the impact time, set the
objects to that time, recalculate the collision test, and recurse if
estimate was not accurate enough
• The main problem with this is time cost, specifically nonuniform time cost
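One simple way to back time up to the point of impact is bisection between the last collision-free time and the first time a collision was detected. This is a sketch of that one strategy, not necessarily what any particular engine does; `collidesAt` is a hypothetical callback that moves the objects to time t and runs the collision test.

```cpp
#include <functional>

// Narrow the interval [tFree, tHit] until it is shorter than `tolerance`.
// Precondition: no collision at tFree, collision at tHit.
inline double timeOfImpact(const std::function<bool(double)>& collidesAt,
                           double tFree, double tHit, double tolerance) {
    while (tHit - tFree > tolerance) {
        const double mid = 0.5 * (tFree + tHit);  // current estimate of the impact time
        if (collidesAt(mid)) {
            tHit = mid;    // already colliding: impact is earlier
        } else {
            tFree = mid;   // still apart: impact is later
        }
    }
    return tHit;
}
```

The number of loop iterations depends on the interval length and tolerance, and each iteration reruns the collision test, which is one source of the nonuniform time cost noted above.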