Space Partitioning
Computer Graphics

What is Space Partitioning?
• A division of 3D space into distinct regions
• Specifically, a tree data structure is used to divide up the world in a hierarchical manner
  – The tree leaf nodes usually contain pieces (triangles) of the world
  – Each leaf holds onto the polygons that are in its sub-piece of the world

Why Perform Space Partitioning?
• Speed
  – For example, an accurate model of a Boeing-777 contains over 500M triangles
    • You do not want to send 500M triangles down the rendering pipeline
  – We have seen various mapping techniques to make the model look good with fewer triangles
  – But we can also cut down on the number of triangles by sending down the pipeline only those that are in the field of view of the camera
    • By dividing the world into pieces we can cull/clip on a large scale rather than per triangle as the rendering pipeline does

Types of Space Partitioning
• The two main types we will explore are:
  – Octrees
  – Binary Space Partition (BSP) trees
• However, in the process we will also cover:
  – Frustum culling
  – Bounding Volumes (BVs)
  – Potentially Visible Sets (PVS)
  – Scene Graphs

Frustum Culling
• In the classic rendering pipeline front-facing triangles are clipped against the Frustum
  – If all 3 vertices lie outside the same Frustum plane, the triangle is culled (not sent further down the pipeline)
• The problem with this technique is that it is triangle-based and doesn’t take into account spatial coherence
  – All the triangles for a given object lie in about the same position in space

Frustum Culling
• Why is spatial coherence important?
  – If an object is out of the Frustum then so are all its triangles
  – If we could perform the Frustum check per object then we could cull all the object’s triangles at the same time with a single test
    • Note that this is a software test, but only 1 is made rather than 10s/100s/1000s of hardware tests

Frustum Culling
• How do we perform the test in software?
• First we need to find the planes of the Frustum
  – Recall that there are 6 planes (near, far, left, right, top, bottom)

Frustum Culling
• If we can find the plane equation for each then we can test to see which side of the plane a point lies on
  – Recall the plane equation is: Ax + By + Cz + D = 0
    • Where (A, B, C) is the normal of the plane
  – If we have a point (x, y, z) and we plug it into the left-hand side of this equation, we get:
    • Positive on the normal side of the plane (in front)
    • Zero on the plane
    • Negative on the non-normal side of the plane (behind)
  – A point must be in front of all 6 planes for it to be in the Frustum

Frustum Culling
• So, how do you find the Frustum planes?
  – After setting up the Camera, get the modelview matrix and the projection matrix
    • glGetFloatv( GL_MODELVIEW_MATRIX, mod );
    • glGetFloatv( GL_PROJECTION_MATRIX, proj );
    • Note that you need both because the modelview sets up the camera position and orientation whereas the projection sets up the shape of the Frustum
  – Then multiply the two matrices together
    • clip = (mod) x (proj)

Frustum Culling
• The A, B, C, and D values can then be extracted from the “clip” matrix
  – As an example the right plane is:
    • A = clip[ 3] - clip[ 0]
    • B = clip[ 7] - clip[ 4]
    • C = clip[11] - clip[ 8]
    • D = clip[15] - clip[12]
• These plane values are then normalized so that +/- distances from the plane are correct
  – Each component (A-D) is divided by sqrt(A^2 + B^2 + C^2)

Frustum Culling
• This sounds like a lot of work
  – Not coding, but execution time “work”
• But keep a couple of things in mind:
  – The Frustum plane equations only need to be recalculated when the camera moves
  – We are going to test an entire object rather than a single point
    • In particular, we will test sphere and cube objects

Frustum Culling
• Why just spheres and cubes?
  – They make nice Bounding Volumes (BVs)
    • A Bounding Volume is a simplified shape that encloses an entire complex object
  – If the Frustum test against the BV fails then the entire complex object in the BV is not in the Frustum
    • The advantage is that tests against simple shapes are computationally much easier to perform than tests against complex shapes

Frustum Culling
• Sphere testing
  – A sphere is defined by a center point and a radius
  – Part of the sphere is still in front of the plane (and thus, it can’t be culled) even if the center point is up to radius units on the back side of the plane
  – Recall that, for a normalized plane, Ax + By + Cz + D is the signed distance from the point (x, y, z) to the plane
  – Thus, we keep the sphere if the sphere center plugged into this equation produces values > -radius for all 6 plane equations
    • The point test is the same, but with > 0

Frustum Culling
• Cube test
  – A cube consists of 8 vertices
    • Usually passed in as a center location and a size
  – We could test to see if any of the 8 cube vertices are within the Frustum (i.e. test each vertex against all 6 planes)
    • However, this leads to problems when the cube is bigger than the Frustum: all 8 corners are outside the Frustum, but the cube contents should be drawn
    • False negatives: the cube is culled when it shouldn’t be

Frustum Culling
  – Instead we reverse the order and test all 8 points against a single plane
    • If all are on the back side of the plane then the cube is not visible and we can cull it
    • If any are in front, then we move on to the next plane
    • If it passes all 6 planes then we keep it
  – However, this can lead to false positives (see figure)
    • We let it through and simply let the card cull it on a triangle-by-triangle basis

Frustum Culling
• So what sort of speed up do you get?
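Before looking at the speed-up, the sphere and cube tests described above can be sketched as follows. This is a minimal sketch, assuming the six planes are already normalized with their normals pointing into the Frustum; the Plane struct, the function names, and the choice to pass the cube as a center plus half-size are illustrative assumptions, not taken from the slides.

```c
typedef struct { float a, b, c, d; } Plane;   /* normalized: (a, b, c) has unit length */

/* Signed distance from (x, y, z) to a normalized plane. */
static float plane_distance(const Plane *p, float x, float y, float z)
{
    return p->a * x + p->b * y + p->c * z + p->d;
}

/* Keep the sphere unless its center is at least `radius` behind some plane. */
int sphere_in_frustum(const Plane planes[6],
                      float cx, float cy, float cz, float radius)
{
    for (int i = 0; i < 6; i++)
        if (plane_distance(&planes[i], cx, cy, cz) <= -radius)
            return 0;
    return 1;
}

/* Test all 8 corners against one plane at a time: the cube is culled only
 * if some plane has every corner behind it. May report false positives. */
int cube_in_frustum(const Plane planes[6],
                    float cx, float cy, float cz, float half)
{
    for (int i = 0; i < 6; i++) {
        int in_front = 0;
        for (int corner = 0; corner < 8 && !in_front; corner++) {
            float x = cx + ((corner & 1) ? half : -half);
            float y = cy + ((corner & 2) ? half : -half);
            float z = cz + ((corner & 4) ? half : -half);
            if (plane_distance(&planes[i], x, y, z) > 0.0f)
                in_front = 1;
        }
        if (!in_front)
            return 0;   /* all 8 corners behind this plane */
    }
    return 1;
}
```

Note that a cube larger than the whole Frustum is kept by cube_in_frustum, which is the case the per-vertex containment test gets wrong.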
  – It depends on several factors:
    • How big the Frustum is (especially the far plane)
    • How many objects are in the scene and how they are distributed
    • How complex each particular object is
  – Check out the gametutorials Frustum culling tutorial

Octrees
• Octrees are a space partitioning data structure
  – We saw Octrees before in the context of modeling
    • In modeling the leaf nodes were either filled or empty
    • In space partitioning the leaf nodes contain a list of triangles that are in that sub-section of space
• The process:
  – A single cube is placed around the entire world
  – If the cube contains too many triangles, then the cube is subdivided into 8 smaller axis-aligned cubes
  – Recurse until we hit the stopping condition

Octrees
• When do you stop sub-dividing?
  – When the cube contains fewer than a specified number of triangles
  – When the sub-division depth has reached a maximum
  – When the total number of cubes in the octree has reached a maximum number

Octrees
• What happens to triangles that cross a cube boundary?
  – Place the triangle in the smallest box that fully contains it
    • Not efficient in that a small object can be placed in a large bounding box if it lies directly on the boundary
    • Implies that not all triangles are stored in the leaf nodes
  – Split the triangle
    • Increases the number of primitives in the scene
  – Place the triangle in both sub-divisions
    • As the triangles from each sub-division are sent to the renderer we need to mark them so we don’t draw them twice
    • Makes Octree editing, such as deletion, more difficult

Octrees
• So how do Octrees help us?
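Before answering, here is a minimal sketch of the subdivision process described above. To stay short it makes several assumptions that are not in the slides: each triangle is represented only by its centroid (which sidesteps the boundary-crossing question rather than solving it), only two of the stopping conditions are used (item count and depth), allocation failures are not handled, and all names and the MAX_ITEMS / MAX_DEPTH values are illustrative.

```c
#include <stdlib.h>

#define MAX_ITEMS 4   /* subdivide a cube holding more than this many centroids */
#define MAX_DEPTH 8   /* ...unless this depth has been reached */

typedef struct { float x, y, z; } Vec3;

typedef struct OctreeNode {
    Vec3 center;                   /* center of this node's cube */
    float half;                    /* half the cube's edge length */
    int *items;                    /* triangle indices (leaves only) */
    int count;
    struct OctreeNode *child[8];   /* all NULL for a leaf */
} OctreeNode;

/* Which of the 8 octants of a cube centered at c contains p? */
static int octant_of(Vec3 c, Vec3 p)
{
    return (p.x >= c.x ? 1 : 0) | (p.y >= c.y ? 2 : 0) | (p.z >= c.z ? 4 : 0);
}

/* Build the subtree for the n triangles listed in idx (indices into pts). */
OctreeNode *octree_build(const Vec3 *pts, const int *idx, int n,
                         Vec3 center, float half, int depth)
{
    OctreeNode *node = calloc(1, sizeof *node);
    node->center = center;
    node->half = half;

    if (n <= MAX_ITEMS || depth >= MAX_DEPTH) {   /* stopping condition: leaf */
        node->items = malloc((n > 0 ? n : 1) * sizeof *node->items);
        for (int i = 0; i < n; i++)
            node->items[i] = idx[i];
        node->count = n;
        return node;
    }

    int *bucket = malloc(n * sizeof *bucket);
    float q = half * 0.5f;
    for (int o = 0; o < 8; o++) {
        int m = 0;                                /* gather this octant's items */
        for (int i = 0; i < n; i++)
            if (octant_of(center, pts[idx[i]]) == o)
                bucket[m++] = idx[i];
        Vec3 c = { center.x + ((o & 1) ? q : -q),
                   center.y + ((o & 2) ? q : -q),
                   center.z + ((o & 4) ? q : -q) };
        node->child[o] = octree_build(pts, bucket, m, c, q, depth + 1);
    }
    free(bucket);
    return node;
}

/* Total number of items stored in the leaves below node. */
int octree_count(const OctreeNode *node)
{
    if (!node) return 0;
    int total = node->count;
    for (int o = 0; o < 8; o++)
        total += octree_count(node->child[o]);
    return total;
}

void octree_free(OctreeNode *node)
{
    if (!node) return;
    for (int o = 0; o < 8; o++)
        octree_free(node->child[o]);
    free(node->items);
    free(node);
}
```

Each node's cube (center and half) is exactly the Bounding Volume that a per-node Frustum test would use.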
  – We can combine Octrees with Frustum culling
    • The Octree cubes are Bounding Volumes for the triangles in their region of space
    • We only need to send the triangles contained in the cubes that pass the Frustum test to the rendering pipeline

Octrees
• This leads to a process called Hierarchical Frustum Culling
  – Starting with the root node of the Octree, test the Octree cube against the Frustum
    • If not in frustum, then prune children
    • If in frustum, then
      – Send any contained triangles to the renderer
      – Recurse on each of the children
  – Demo Octree2 and Octree3 from gametutorials

BSP Trees
• The idea is much the same as with Octrees
• However, the space is repeatedly split into 2 parts (hence the name “binary space partition”) by a given splitting plane
  – Octrees always use cubes
• There are two variants
  – Axis-Aligned
  – Polygon-Aligned

Axis-Aligned BSP Trees
• Starting with a bounding box around the entire world, split it into 2 parts with an xy, xz, or yz splitting plane
• Divide all the triangles between the two half-spaces
• Triangles that intersect the splitting plane can be dealt with the same way as with Octrees
• Recursively continue the subdivision process until a stopping criterion is reached
  – Note that each successive subdivision only splits its own half-space

Axis-Aligned BSP Trees
• The interesting part is how to decide which axis to split on and where along that axis to split
• Suggestions on how?
  – Hint: how does all this relate to what you learned in cs255 about merge sort vs.
    quick sort?

Polygon-Aligned BSP Trees
• In this version, a polygon is chosen as the divider polygon
• This polygon is part of a plane and it is this plane that is used as the splitting plane

Polygon-Aligned BSP Trees
• Just as with Axis-Aligned BSP trees the most important choice is the selection of the polygon divider
• There are two main issues to consider:
  – How balanced is the tree
  – How many extra triangles did we have to make because of triangle splitting
• It is a debatable question as to which is better:
  – A balanced tree with lots of extra triangles
  – An unbalanced tree with fewer extra triangles
• However most people pick the unbalanced tree with fewer splits of triangles

Polygon-Aligned BSP Trees
• Generating an optimal BSP tree requires trying every possible splitting combination
  – The problem is that this is an O(n!) algorithm
    • Recall that 20! = 2,432,902,008,176,640,000
• The next best option is to pick the best splitter at each level
  – Divide-and-Conquer with a Greedy selection of the splitter at each level
  – This is an O(n^2) algorithm
  – This is the option that most games use
    • However, it is still pre-computed and saved in a file

BSP Trees
• Hierarchical Frustum Culling can be used fairly easily with Axis-Aligned BSP trees
  – Very similar to how it was used for Octrees since each region is rectangular
• However, for Polygon-Aligned BSP trees testing is more complex
  – The region (leaf or interior) in question is convex, so one could test all the corners of the region the same way the 8 corners are tested for rectangular regions
  – Or a bounding box of the region can be tested
    • Quicker test, but more false positives (passed frustum test when not actually in the frustum)

BSP Trees
• However, Polygon-Aligned BSP trees have their advantage in visibility
  – Allow us to easily perform a visibility ordering
  – Allow us to align splitting planes along walls which may not be axis aligned

BSP Tree Visibility
• In complex indoor scenes, not only do we have lots of
  geometry that is not in the camera’s frustum but we also have lots of geometry that is hidden by walls
  – That is, the frustum might extend through many walls
    • Frustum culling would simply select all nodes it intersected and send them to the renderer where a Z-buffer technique is used to handle the occlusion in the normal way
• With Polygon-Aligned BSP Trees we can determine visibility ordering w.r.t. a given camera position
  – Works in linear time
  – Works for any given camera position

BSP Tree Visibility
• In order to determine the visibility ordering
  – First, build the BSP tree for the scene
    • Usually done off-line
    • For a static scene
      – Dynamic scene objects are usually handled separately
  – Second, insert a viewpoint into the scene
  – Third, perform an in-order traversal of the tree to determine the polygon order
    • Recall in-order traversals mean process child, process node, process other child

BSP Tree Visibility
• What order should we process the children in?
• To obtain a back-to-front ordering
  – Process the far-side child first
    • Relative to the location of the camera
• To obtain a front-to-back ordering
  – Process the near-side child first

BSP Tree Visibility
• Back-to-front: C1, B, D, A, C2, E
• Front-to-back: E, C2, A, D, B, C1

BSP Tree Visibility
• Back-to-front ordering leads to a version of the “painter’s algorithm”
• This doesn’t suffer from the intersecting problem seen previously in the painter’s algorithm
  – Any polygons that would have intersected have now been split into multiple pieces
• However, the painter’s algorithm is still really slow
  – Pixels will be drawn only to be overdrawn by closer triangles
    • Often 10x overdraw (Michael Abrash of Quake fame)
  – Overdraw amount varies so framerate is not stable
  – Overdraw problems increase with scene complexity

BSP Tree Visibility
• Front-to-back has no overdraw problems
  – Each pixel is drawn exactly once
• But we need to find a way to not overwrite a pixel once it has been drawn
  – Could use the normal Z-buffer
    technique
    • However, this is overkill since we will never need to replace a value in the Z-buffer with a newer one
  – Could use a Stencil buffer technique
    • Stencil filters pixels that have already been drawn
    • As new pixels are drawn, the stencil buffer is updated
• This technique is usually faster than back-to-front

Stencil Buffer
• A Stencil Buffer is another buffer like the color (frame) buffer and the Z-buffer
  – On most hardware this is an 8-bit buffer
• The Stencil Test is a test that occurs immediately before the Z-buffer test
  – If the Stencil Test fails, the fragment is discarded rather than passed on to the Z-buffer
    • Fragment won’t show up in the color buffer either

Stencil Buffer
• The Stencil Test compares a reference value against the value currently stored in the Stencil Buffer
• The Stencil Function controls the type of Stencil Test to perform:
  – GL_NEVER
  – GL_ALWAYS
  – GL_LESS
  – GL_EQUAL
  – GL_LEQUAL
  – GL_NOTEQUAL
  – GL_GREATER
  – GL_GEQUAL

Stencil Buffer
• There are 3 possible outcomes of the combined Stencil and Z-buffer tests:
  – Stencil Test fails
  – Stencil Test passes, Depth Test fails
  – Stencil Test passes, Depth Test passes
• The Stencil Operation lets you specify what happens to the Stencil Buffer values in each of these cases

Stencil Buffer
• The possible Stencil Operations are:
  – GL_KEEP: old value kept
  – GL_ZERO: value replaced by zero
  – GL_REPLACE: value replaced by given reference value (allows setting to a specific value)
  – GL_INCR: value incremented by 1
  – GL_DECR: value decremented by 1
  – GL_INVERT: value bitwise inverted

Stencil Buffer
• The Stencil test is only performed if
  – The Stencil buffer is enabled
  – The underlying hardware/drivers support it
• The stencil buffer is usually used as a mask
  – Example: creating a non-rectangular windshield for a first-person driving game
    • Store 1s in the stencil buffer pixels where you wish the windshield to be and then set the stencil test to render if not equal to 0

BSP Tree Visibility
• Note that the visibility ordering is not
  directly related to either:
  – Viewing direction
    • Both A and B appear in the visibility ordering even though the camera is facing the other direction
  – Distance of the polygons from the viewpoint
    • In back-to-front ordering, B is before A in both figures even though A is sometimes farther than B from the camera

BSP Tree Visibility
• The previous slides have assumed that the BSP tree will continue dividing until every polygon is used as a splitting plane
• However, in indoor scenes we often want to use BSP trees aligned with the walls to provide visibility on a room level and then let the normal Z-buffer algorithm handle objects within the room
  – That is, a hybrid visibility scheme

Portals
• World is divided up into Cells and Portals
• Cells are convex regions
  – Could be BSP tree leaves
• Portals are doorways/windows between the Cells

Portals
• Basic algorithm:
  – Start in the cell that contains the camera
    • Equating Cells to BSP tree leaves helps here
  – Perform a frustum check against all objects in the current cell to determine what is visible
  – If any portals are visible (in frustum), add the connecting cell to the list of cells to process
  – Continue until all visible cells have been rendered

Portals
• An optimization of the previous algorithm is frustum reduction
• When a portal is visible, it means the cell it leads to is visible, but only partially, because the camera can only see this cell through the portal
• Thus, the frustum can be reduced in the new cell
  – Cuts down the number of objects in the cell that are sent to the renderer
  – And cuts down on the number of “visible” portals seen which reduces the amount of recursion

Portals
  – The reduced frustum is created by planes from the camera through the portal edges
    • For portals with vertical walls, it turns into a 2D issue of casting rays through points

PVS
• Techniques that were tried and rejected for large-scale culling in Quake
  • Pure Z-buffering
  • Painter’s algorithm
  • Beam tree
  • Subdividing raycasting
  • Edge or Span sorting
  • Portals
  • Direct visibility extraction from BSP
• Instead Quake went with PVS

PVS
• Potentially Visible Sets (PVS) involve precomputing visibility information from the BSP tree
  – That is, first divide up the world using a BSP tree, storing all the polygon faces in the leaf nodes
  – Next, perform off-line visibility checks to determine which leaf nodes are visible from which other leaf nodes, creating a 2D visibility matrix
    • Visible from anywhere in the source leaf node
  – The 2D visibility matrix information, called a Potentially Visible Set (PVS), is then used to cull large parts of the BSP tree away when rendering

PVS
• The PVS is used by first descending the BSP tree with the camera coordinates to determine in which leaf node the camera resides
• Then the PVS is consulted to determine other leaf nodes that may also need to be rendered
• This can then be combined with frustum culling to test the bounding boxes of these leaf nodes against the frustum
• And finally, the polygons from the surviving leaf nodes are sent to the renderer

PVS
• Creating the PVS is a difficult task
  – Very time-consuming
  – Various ways to create it, but often portals are used to create view-independent visibility maps
    • Note that the previously discussed portal information was view-dependent (known camera position/orientation)
    • If we are going to pre-compute visibility, we need view-independent information

PVS
• The entire cell of the potential camera (blue) is used as the starting cell
• Lines (of sight) are constructed between portal edges to determine which cells are visible
  – There are more examples on p.298-301 of Games book

PVS
• The PVS can be built from these portal-constructed visibility maps by using every leaf node as a potential starting position and recording all the other leaf nodes that are visible from the starting leaf

Collision Detection
• When objects are moving we need to determine if, when, and where they collide
• Each question gets harder to solve (both
  computationally and theoretically)
  – To answer when one also needs to know if
  – To answer where one also needs to know when
• There are methods that solve all at once, but you can’t solve each successive one without the prior

Collision Detection
• The brute force solution is to test every triangle of each object against every triangle of every other object
• The problem: O(n^2 m^2)
    • n is the number of objects
    • m is the number of triangles per object
  – So it becomes infeasible as scene complexity or object complexity increases

Collision Detection
• The solution is two-fold:
  – Space Partitioning
    • Reduces the n^2 term
  – Bounding Volumes
    • Reduces the m^2 term

Collision Detection
• We have already seen several Space Partitioning schemes
  – Octrees
  – BSP Trees
• An object can only collide with another object that is in the same part of space
  – This is sometimes called a “collision group”

Collision Detection
• The problem with using space partitioning schemes is that one needs to keep the partitions updated as objects move through space
  – It certainly can be done, it just takes time
    • Loose Octrees hold an advantage here because of their fast insert/delete times
    • Often, spatial coherence can be used to quickly determine the object’s new node
• For collision detection of a moving object with only static objects (architectural) the problem is easier since one can avoid updating the tree
  – Keep only the static objects in the tree and dynamically determine the moving object’s node

Collision Detection
• Bounding volumes simplify objects
  – An object containing 10K triangles can be represented with a single sphere
  – Makes collision tests much simpler
• However, false positives can occur
  – The bounding volumes collide but the objects they contain do not
• The standard solution is to use a two-phase collision detection scheme

Collision Detection
• The two-phase scheme:
  – Broad phase:
    • Bounding volumes are used to cull away pairs of moving objects that cannot possibly collide
    • Leaves some false positives
  – Narrow phase:
    • An exact collision detection method on the pairs that survive the culling process
• The particular choice of algorithms for each phase can be made independently

Collision Detection
• Common Bounding Volumes:
  – Sphere
  – Axis-Aligned Bounding Box (AABB)
  – Oriented Bounding Box (OBB)
  – Discrete Orientation Polytope (k-DOP)
    • Pairs of parallel planes

Collision Detection
• In deciding which is best, there is a tradeoff between bounding efficiency and intersection test speed
• Bounding efficiency is how closely the volume bounds the actual object
  – Higher bounding efficiency reduces false positives
• Intersection test speed is controlled by the complexity of the shape
  – Sphere/Sphere intersection is much faster than k-DOP/k-DOP intersection

Collision Detection
• One also needs to worry about how to update these volumes for moving objects
  – Obviously there is the issue of translation which is easily handled by translating the BV
  – However, problems can occur when the objects can rotate
    • This is a major problem for AABBs which can’t be rotated and need to be recalculated
    • But not an issue at all for Spheres

Collision Detection
• Hierarchies of bounding volumes can be used instead of a single BV
• Collision detection starts at the top of the hierarchy and proceeds downward to the leaf nodes
• The goal is to reduce the number of triangles involved in the narrow phase

Collision Detection
• There are also many other BVs in use
  – Capsules, Lozenges, Cylinders, Ellipsoids, etc.
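As a concrete illustration of why the simplest volumes are attractive for the broad phase, here is a sketch of Sphere/Sphere and AABB/AABB overlap tests together with a pairwise broad-phase pass. It is a minimal sketch, not a prescribed implementation: the struct and function names are illustrative, and the pass still visits every pair of objects, so it only addresses the per-object (m^2) cost; a space partition would be needed to avoid visiting all pairs.

```c
typedef struct { float x, y, z; } Vec3;
typedef struct { Vec3 center; float radius; } Sphere;
typedef struct { Vec3 min, max; } AABB;

/* Spheres overlap when the distance between centers is at most the sum of
 * the radii. Comparing squared values avoids a square root. */
int spheres_overlap(const Sphere *a, const Sphere *b)
{
    float dx = a->center.x - b->center.x;
    float dy = a->center.y - b->center.y;
    float dz = a->center.z - b->center.z;
    float r = a->radius + b->radius;
    return dx * dx + dy * dy + dz * dz <= r * r;
}

/* AABBs overlap when their extents overlap on all three axes. */
int aabbs_overlap(const AABB *a, const AABB *b)
{
    return a->min.x <= b->max.x && b->min.x <= a->max.x &&
           a->min.y <= b->max.y && b->min.y <= a->max.y &&
           a->min.z <= b->max.z && b->min.z <= a->max.z;
}

/* Broad phase: record the index pairs whose bounding spheres overlap.
 * Only these candidates (which may include false positives) go on to the
 * narrow phase. Returns the number of pairs written, at most max_pairs. */
int broad_phase(const Sphere *bv, int n, int pairs[][2], int max_pairs)
{
    int found = 0;
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            if (found < max_pairs && spheres_overlap(&bv[i], &bv[j])) {
                pairs[found][0] = i;
                pairs[found][1] = j;
                found++;
            }
    return found;
}
```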
• It is somewhat surprising that some simple shapes such as cylinders should be avoided
  – “intersection testing against a moving cylinder is extremely complicated and somewhat expensive process”

Collision Detection
• When implementing collision detection, one needs to know the potential shapes involved
• There are sets of (sometimes complex) math equations for intersecting almost any shape against another
  – Eberly’s 3D Game Engine Design is full of them
• You probably want to keep the shapes in your system to a minimum so you don’t end up with an explosion of testing code

Collision Detection
• Some rules of thumb when implementing intersection routines (from Real-Time Rendering):
  – Perform computations that might trivially reject or accept various types of intersections early
  – Exploit results from previous tests
  – Postpone expensive calculations (esp. trig functions) until they are truly needed
  – If a single object is being tested against many other objects look for precalculations that can be done once
  – Watch out for floating-point precision problems

Collision Detection
• There are also difficulties associated with temporal sampling:
  – If the velocity of an object is high with respect to the sampling rate for collision detection then the objects can pass through each other without being detected
  – In general, collision will not be detected until after the objects have penetrated each other

Collision Detection
• To help address the problem of objects passing through each other undetected one can use adaptive sampling rates
  – As objects get closer to each other sample more frequently
  – How do we determine if objects are getting closer?
    • Most intersection tests can be made to return more than a Boolean, such as a distance
    • Use a larger bounding volume to trigger the need to sample more often

Collision Detection
• What do we do when detection occurs?
  – This is often called the “collision response”
  – Recall that most detection occurs after penetration
• Some systems simply need to back the objects away from each other in the direction opposite their direction of travel
  – Used to prevent objects from going through walls, etc.
  – This happens before the scene is rendered so the collision is never seen by the user:
    • Scene rendered
    • Objects moved
    • Collision detection occurs
    • Collision response activated
    • Scene rendered
    • …

Collision Detection
• Other systems that perform more realistic dynamic simulations need to bounce the objects off of each other in a realistic way
• They need to know the exact time of collision
• To achieve this we need to back time up to the point of impact
  – Most algorithms produce an estimate of the impact time, set the objects to that time, recalculate the collision test, and recurse if the estimate was not accurate enough
• The main problem with this is time cost, specifically nonuniform time cost
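The estimate, re-test, and refine loop for backing time up to the point of impact can be sketched with a simple bisection on time. This is one possible scheme chosen for illustration, not necessarily what any particular engine does; it assumes two spheres moving at constant velocity during the step, no overlap at t0, and overlap at t1. The names MovingSphere, overlap_at, and time_of_impact are hypothetical, and the slide's recursion is written as a loop.

```c
typedef struct { float x, y, z; } Vec3;
typedef struct { Vec3 pos, vel; float radius; } MovingSphere;

/* Set both spheres to time t and repeat the collision test. */
static int overlap_at(const MovingSphere *a, const MovingSphere *b, float t)
{
    float dx = (a->pos.x + a->vel.x * t) - (b->pos.x + b->vel.x * t);
    float dy = (a->pos.y + a->vel.y * t) - (b->pos.y + b->vel.y * t);
    float dz = (a->pos.z + a->vel.z * t) - (b->pos.z + b->vel.z * t);
    float r = a->radius + b->radius;
    return dx * dx + dy * dy + dz * dz < r * r;
}

/* Narrow [t0, t1] around the impact time until it is shorter than
 * `tolerance`. Assumes no overlap at t0 and overlap at t1. The iteration
 * cap guards against tolerances below floating-point resolution. */
float time_of_impact(const MovingSphere *a, const MovingSphere *b,
                     float t0, float t1, float tolerance)
{
    for (int i = 0; i < 64 && t1 - t0 > tolerance; i++) {
        float mid = 0.5f * (t0 + t1);   /* estimate of the impact time */
        if (overlap_at(a, b, mid))
            t1 = mid;                   /* already penetrating: impact was earlier */
        else
            t0 = mid;                   /* still apart: impact is later */
    }
    return t0;   /* latest time known to be free of penetration */
}
```

The number of iterations depends on how wide the starting interval is relative to the tolerance, which is one source of the nonuniform time cost mentioned above.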