A Hybrid Computationally Efficient Parallel Algorithm for Best Visual
Quality 3D Real-time Graphics
Research Area: Rewriting Algorithms to help Parallel Programming
Authors: Amrit Asrani
Atheendra P Tarun
Athresh R Shigaval
Faculty Mentor: Mr. Sudhir Shenai
Name of the Institution: Global Academy Of Technology, Bangalore.
Abstract:
In the domain of computer graphics, real-time visualization is achieved by two
prominent techniques: the Z-buffer algorithm and the ray tracing algorithm. Each has its own
inherent advantages and limitations. Ray tracing is capable of producing a
very high degree of photorealism, usually higher than that of typical scanline rendering
methods, but at a greater computational cost. The Z-buffer, on the other hand, is computationally
fast but does not give the best visual quality. This paper proposes a new hybrid parallel algorithm
that exploits the speed of the Z-buffer and the visual quality of ray tracing. The
algorithm takes the merged kD-trees proposed in the RAZOR architecture of the Copernicus
system and combines them with the parallel Z-buffer algorithm, which uses a hypercube topology.
Multithreading is introduced in the construction stage of the merged kD-trees for
dynamic scenes; the result serves as input to the Z-buffer, which is itself parallel and
inherently fast. Parallelism is thus introduced at several stages of the
algorithm, which is the linchpin for high-quality real-time 3D graphics.
The new hybrid algorithm is presumed to be efficient because the experimental analyses of
the Z-buffer [1] and ray tracing [3] have independently shown each to be effective on its own
grounds, viz., computation speed and visual quality respectively.
Background:
The Z-buffer algorithm ensures that perspective works the same way in the virtual
world as in the real world. It is a type of Visible Surface Determination (VSD)
algorithm [9]. Z-buffering works by comparing the depth (z coordinate) of each incoming
pixel with the value stored in a buffer (called a z-buffer) that holds the depth of the pixel
last written at that position. The pixel closer to the viewer is the one displayed.
This can be seen when two squares overlap: the square on top is visible, but the covered part
of the one below it is not. The Z-buffer algorithm is used to render such situations correctly.
The generic sequential approach to the Z-buffer algorithm is [1]:
For all the objects in the scene
    Project the object in the image coordinate system
    For every scanline of that object
        For all the pixels in a scanline
            If Z coordinate of pixel < Zbuffer[pixel]
                Write [pixel]
                Zbuffer[pixel] = Z coordinate of pixel
            End If
        End For
    End For
End For
Since the image contains several objects, the first step in the sequential algorithm is to
project each object onto the image coordinate system. Objects are then scanned row by row.
For each pixel in the scanline, its z coordinate is compared with the value in the
z-buffer, which is initialized to infinity beforehand. If the z coordinate is less than
the value in the buffer, the new pixel overwrites the old pixel and its z coordinate is
copied into the z-buffer. Consequently, the pixels closest to the observer are displayed.
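The sequential loop described above can be sketched in Python as follows. This is a minimal illustration, not the implementation from [1]: it assumes the objects are already projected, axis-aligned rectangles with a single depth each, and the names `zbuffer_render` and `rects` are hypothetical.

```python
import math

def zbuffer_render(width, height, rects, background="."):
    """Render axis-aligned rectangles with a sequential z-buffer.

    Each rect is (x0, y0, x1, y1, z, color); a smaller z is closer to the viewer.
    Assumption: objects are already projected and have constant depth.
    """
    depth = [[math.inf] * width for _ in range(height)]    # z-buffer, initialized to infinity
    frame = [[background] * width for _ in range(height)]  # color buffer
    for x0, y0, x1, y1, z, color in rects:                 # for all objects in the scene
        for y in range(max(y0, 0), min(y1, height)):       # for every scanline of the object
            for x in range(max(x0, 0), min(x1, width)):    # for every pixel in the scanline
                if z < depth[y][x]:                        # closer than the stored pixel?
                    frame[y][x] = color
                    depth[y][x] = z
    return frame, depth

# Two overlapping squares: "Y" (z = 1) lies in front of "B" (z = 2).
squares = [(0, 0, 4, 4, 2.0, "B"), (2, 2, 6, 6, 1.0, "Y")]
frame, depth = zbuffer_render(6, 6, squares)
print("\n".join("".join(row) for row in frame))
```

Because the depth test decides visibility, the result does not depend on the order in which the two squares are drawn.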
The Parallel Approach:
The sequential algorithm is improved through a parallel approach.
To illustrate this, we take the example of two overlapping squares. Consider two
squares, one partially overlapping the other, as shown. Applying the z-buffer algorithm to
color the pixels, the blue square is partially obscured by the yellow square, which is in the
foreground.
fig. 1
The parallel algorithm for this problem is [1]:
ParallelZbuffer()
Begin
    Scatter(Vertices)
    Scatter(Squares)
    >> For all pictures to compute Do
        >> Project vertices from object to screen coordinate system
        MultiBroadcast(Projected Vertices)
        >> LocalLoad = Estimation(Local Squares)
        GlobalLoad = MultiReduce(LocalLoad)
        MultiScatter(Squares)
        >> Sequential Zbuffer
        Output the picture
    EndDo
End
The parallel parts of this algorithm have been marked with >>.
To reduce memory and computation requirements, the scene is represented by
a two-level data structure: a set of vertices and a set of squares. A
vertex is a set of 6 real numbers that define a point in a coordinate system and a normal
at this point. A square is a set of 4 vertex indices (4 integers).
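A minimal Python sketch of this two-level structure is given below. The class and field names (`Vertex`, `Scene`, `corners`) are illustrative assumptions and do not come from [1]; the point is only that squares store indices, so shared vertices are stored once.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class Vertex:
    # Six reals: a position (x, y, z) and a normal (nx, ny, nz).
    x: float
    y: float
    z: float
    nx: float
    ny: float
    nz: float

# A square is four integer indices into the shared vertex list.
Square = Tuple[int, int, int, int]

@dataclass
class Scene:
    vertices: List[Vertex]
    squares: List[Square]

    def corners(self, square_id: int) -> List[Vertex]:
        """Resolve the four vertex indices of a square to Vertex objects."""
        return [self.vertices[i] for i in self.squares[square_id]]

# Two unit squares sharing an edge need 6 vertices instead of 8.
positions = [(0, 0), (1, 0), (1, 1), (0, 1), (2, 0), (2, 1)]
scene = Scene(
    vertices=[Vertex(float(px), float(py), 0.0, 0.0, 0.0, 1.0) for px, py in positions],
    squares=[(0, 1, 2, 3), (1, 4, 5, 2)],
)
shared = set(scene.squares[0]) & set(scene.squares[1])
print(len(scene.vertices), sorted(shared))
```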
We describe the different parts of this algorithm:
Scatter(Vertices): All the vertices of the scene are distributed equally across the
processors of the parallel computer. The vertices come from a disk or from a previous
computation on the parallel computer.
Scatter(Squares): The squares are likewise distributed equally across the processors.
Note that the squares held by a given processor can reference vertices that are not
present in the local memory of that processor.
Project vertices from object to screen coordinate system: The projections are done in
parallel. Each vertex requires one matrix-vector multiplication. In addition, the
normal is used to shade each vertex and assign it an RGB color using the Gouraud model.
MultiBroadcast(Projected Vertices): After this step, each processor knows all the
projected vertices of the scene, even those it does not use.
LocalLoad = Estimation(Local Squares): Each processor computes in parallel an
estimate of the load due to its own squares. The load associated with
each row of the picture is approximated by the number of squares intersecting that row.
GlobalLoad = MultiReduce(LocalLoad): The global load makes it possible to compute, for each
processor, which part of the picture it should treat so that the workload is balanced.
MultiScatter(Squares): Given the image partition, we can compute the squares required by
each processor.
Sequential Zbuffer: Each processor runs a sequential z-buffer, in parallel with the others,
on the part of the image it owns.
Output the picture: When all the sequential z-buffers have completed, the image is transferred
to an output device.
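The load estimation, reduction and image partition steps can be sketched as follows. This is a simplified single-process illustration under stated assumptions: each square is reduced to the range of rows it covers, `reduce_loads` stands in for MultiReduce as a plain element-wise sum, and the greedy prefix-sum split in `partition_rows` is one possible way to balance rows, not necessarily the method used in [1]. All function names are hypothetical.

```python
from itertools import accumulate

def estimate_load(row_spans, height):
    """Per-row load: the number of squares whose row span [y0, y1) covers the row."""
    load = [0] * height
    for y0, y1 in row_spans:
        for y in range(max(y0, 0), min(y1, height)):
            load[y] += 1
    return load

def reduce_loads(local_loads):
    """Stand-in for MultiReduce: element-wise sum of every processor's estimate."""
    return [sum(rows) for rows in zip(*local_loads)]

def partition_rows(global_load, parts):
    """Split rows into `parts` contiguous bands of roughly equal total load.

    Assumes at least one row. A single very heavy row can leave a band empty.
    """
    total = sum(global_load)
    prefix = list(accumulate(global_load))
    bounds, row = [0], 0
    for k in range(1, parts):
        target = total * k / parts
        # Advance to the first row at which the cumulative load reaches the target.
        while row < len(prefix) - 1 and prefix[row] < target:
            row += 1
        bounds.append(row + 1)
    bounds.append(len(global_load))
    return list(zip(bounds[:-1], bounds[1:]))

# Two processors, each holding the row spans of its own squares (8-row picture).
local_a = estimate_load([(0, 4), (0, 4), (1, 3)], height=8)
local_b = estimate_load([(2, 8)], height=8)
global_load = reduce_loads([local_a, local_b])
bands = partition_rows(global_load, parts=2)
print(global_load, bands)
```

In this example the upper band is only three rows tall because most squares overlap the top of the picture, which is the kind of imbalance that an equal-height split would ignore.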
Ray Tracing:
Ray tracing is a technique for generating an image by tracing the path of light through
pixels in an image plane. Ray tracing gives the best visual quality but is not fast enough to
support real-time computation of graphics. There are several ways to make
ray tracing or ray casting faster. One class of approaches employs data structures that
speed up the search for the closest intersection along a ray. Data structures that support
efficient geometric search allow us to look at only a small percentage of the scene to
determine the closest intersection. Octrees, kD-trees, and nested bounding volumes are
examples of explicitly hierarchical search structures of this type. A kD-tree (k-dimensional
tree) is a space-partitioning data structure for organizing points in a k-dimensional
space [10].
fig.2 - kD Tree Structure
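To make the definition concrete, the sketch below builds a generic kD-tree over points by median splitting and runs a nearest-neighbor search on it. It illustrates the general data structure only; it is not the merged kD-tree of RAZOR, and the names `Node`, `build_kdtree` and `nearest` are assumptions made for this example.

```python
from collections import namedtuple

Node = namedtuple("Node", "point axis left right")

def build_kdtree(points, depth=0):
    """Build a kD-tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return Node(points[mid], axis,
                build_kdtree(points[:mid], depth + 1),
                build_kdtree(points[mid + 1:], depth + 1))

def nearest(node, target, best=None):
    """Return a stored point with minimal squared Euclidean distance to `target`."""
    if node is None:
        return best

    def dist2(p):
        return sum((a - b) ** 2 for a, b in zip(p, target))

    if best is None or dist2(node.point) < dist2(best):
        best = node.point
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, target, best)
    # The far side can only hold a closer point if the splitting plane
    # is nearer to the target than the best point found so far.
    if diff ** 2 < dist2(best):
        best = nearest(far, target, best)
    return best

points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
print(tree.point, nearest(tree, (9, 2)))  # root (7, 2); nearest point (8, 1)
```

The pruning test is what lets a search inspect only a small part of the scene, which is the property the text relies on for faster ray intersection queries.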
buildkd()[4]:
1) Create a root node for the kD-tree with the scene bounding box and the scene graph root
node.
2) Set the current node to be the root.
3) Set the current discrete LOD level to be the coarsest supported level.
4) Subdivide the geometry at the current node until it satisfies the current discrete
LOD criteria.
5) Build out the kD-tree from this node until the tree termination criteria are satisfied.
6) Retain the current geometry (these nodes are effectively leaves for the current discrete
LOD level).
7) Set the current discrete LOD level to the next finer level.
8) Go to step 4.
At the beginning of every frame, kD-tree construction is initialized with a single root kD-tree
node containing the bounding box of the entire scene and a single pointer to the root of the
scene graph. All further kD-tree building is triggered by traversal operations during ray
tracing.
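The idea that building is triggered by traversal can be illustrated with a lazily built tree. The sketch below is a strong simplification and not RAZOR's procedure: it stores points rather than scene geometry, ignores levels of detail, and uses hypothetical names (`LazyKDNode`, `locate`, `count_splits`). A node is split only the first time a query descends through it.

```python
class LazyKDNode:
    """kD-tree node whose children are created only when a traversal needs them."""

    def __init__(self, points, depth=0, leaf_size=1):
        self.points = points
        self.depth = depth
        self.leaf_size = leaf_size
        self.split = None        # (axis, value) once the node has been split
        self._children = None    # (left, right) once the node has been split

    def children(self):
        """Split this node on first access; return None for a leaf."""
        if self._children is None and len(self.points) > self.leaf_size:
            axis = self.depth % len(self.points[0])
            pts = sorted(self.points, key=lambda p: p[axis])
            mid = len(pts) // 2
            self.split = (axis, pts[mid][axis])
            self._children = (
                LazyKDNode(pts[:mid], self.depth + 1, self.leaf_size),
                LazyKDNode(pts[mid:], self.depth + 1, self.leaf_size),
            )
        return self._children

    def locate(self, query):
        """Descend to the leaf for `query`, building nodes along the way."""
        node = self
        while node.children() is not None:
            axis, value = node.split
            node = node._children[0] if query[axis] < value else node._children[1]
        return node

def count_splits(node):
    """Count the nodes that have actually been split so far."""
    if node._children is None:
        return 0
    return 1 + sum(count_splits(child) for child in node._children)

points = [(1, 1), (2, 5), (3, 3), (4, 7), (5, 2), (6, 6), (7, 4), (8, 8)]
root = LazyKDNode(points)          # start of frame: a single unsplit root node
leaf = root.locate((1, 1))         # the traversal triggers the build along one path
print(leaf.points, count_splits(root))
```

After one query only the nodes on that query's path have been split, whereas a full build over these eight points would split seven nodes.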
The Problem Statement:
The faster Z-buffer algorithm is not well suited to higher-level visibility/occlusion culling. It
is highly resolution dependent and prone to accuracy problems. The ray
tracing algorithm, on the other hand, supports dynamic scenes, high image quality and execution on
programmable multicore architectures [3], but it is considerably slower. This calls for
a new algorithm that combines the advantages of both methods.
Our Hybrid Methodology:
Our hybrid approach combines the advantages of the ray tracing algorithm and the
conventional Z-buffer algorithm: the kD-tree method provides the input to the
Z-buffer computation.
HybridZbuffer()
Begin
    >> If (frame received) Do
        BuildkD()
    >> For all pictures to compute Do
        >> Project vertices from object to screen coordinate system
        MultiBroadcast(Projected Vertices)
        >> LocalLoad = Estimation(Local Squares)
        GlobalLoad = MultiReduce(LocalLoad)
        MultiScatter(Squares)
        >> Sequential Zbuffer
        Output the picture
    EndDo
End
Its flow chart is:
In this algorithm we first check whether a frame has been received; if so, the BuildkD function
is called and the kD-tree is created. Since creating kD-trees is a time-consuming process, it
benefits from being parallelized. The tree serves as the input to the z-buffer algorithm. In each
thread, the calculations of the normal z-buffer algorithm are carried out as discussed earlier. We
thus obtain an improved hybrid algorithm that has the advantages of both the z-buffer
algorithm and the ray tracing algorithm.
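The per-thread z-buffer stage can be sketched with a thread pool, as below. This covers only that stage: the kD-tree construction is left out because the text does not specify how the tree is handed to the z-buffer. The sketch assumes pre-projected, constant-depth rectangles, equal-height strips instead of load-balanced ones, and at least one row and one worker; `zbuffer_strip` and `render_frame` are hypothetical names. In CPython, threads running pure-Python loops mainly show the structure and should not be expected to give a real speedup.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def zbuffer_strip(rects, width, row_range):
    """Sequential z-buffer restricted to the rows [y_start, y_end)."""
    y_start, y_end = row_range
    depth = [[math.inf] * width for _ in range(y_end - y_start)]
    frame = [["."] * width for _ in range(y_end - y_start)]
    for x0, y0, x1, y1, z, color in rects:
        for y in range(max(y0, y_start), min(y1, y_end)):
            for x in range(max(x0, 0), min(x1, width)):
                if z < depth[y - y_start][x]:
                    depth[y - y_start][x] = z
                    frame[y - y_start][x] = color
    return frame

def render_frame(rects, width, height, workers=2):
    """Split the image into horizontal strips and z-buffer each strip in its own thread."""
    step = -(-height // workers)  # ceiling division
    strips = [(y, min(y + step, height)) for y in range(0, height, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields the strips in submission order, so rows stay in order.
        parts = pool.map(lambda strip: zbuffer_strip(rects, width, strip), strips)
        return [row for part in parts for row in part]

# Same scene as the two overlapping squares: "Y" (z = 1) in front of "B" (z = 2).
squares = [(0, 0, 4, 4, 2.0, "B"), (2, 2, 6, 6, 1.0, "Y")]
image = render_frame(squares, 6, 6, workers=3)
print("\n".join("".join(row) for row in image))
```

Each strip owns its own slice of the depth and color buffers, so the threads never write to the same pixel and no locking is needed.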
Key Results:
Consider the generation of the image of a teapot:
fig.3
For the hypercube topology [1] proposed by S. Miguet and J. Li, based on a ring of
processors, the variation of execution time with an increasing number of processors is
shown in the graph below[]. The times are given for two picture sizes: 256 by 256 pixels
and 512 by 512 pixels. It can be clearly observed that there is a sharp decrease in
execution time when we switch from a single core to multiple cores, but further increases in
the number of cores do not yield much additional improvement.
Discussion:
The conventional Z-buffer used for 3D graphics does not provide complex illumination effects
such as soft shadows, reflections and diffuse lighting interactions. The Copernicus
system, which uses the ray tracing technique, has been considered as a substitute
because of features such as dynamic scenes, high image quality and execution on
programmable multicore architectures, but it is considerably slower than the Z-buffer. Our
algorithm is designed to retain the advantages of both of the above algorithms
and is presumed to be more competitive for computing a set of images when the
polygons are already present in local memory and need only a global declaration to be
correctly distributed among the processors.
Conclusion and Future Work:
Parallel implementations of various computer graphics algorithms such as Z-buffer, shadow
mapping and ray tracing achieve good speedup compared to their sequential
counterparts. The proposed algorithm, though untested, is expected to deliver satisfactory results and
to overcome the inadequacies of the z-buffer and ray tracing techniques. Buffering the
output of the kD-tree makes it possible to incorporate reflections, refractions and transparency
while reducing the complexity of the algorithm. This gives scope for achieving previously
unattainable image processing capabilities, once extensive testing and analysis of
our algorithm have been carried out.
References:
[1] Henri-Pierre Charles, Laurent Lefèvre and Serge Miguet. An optimized and load-balanced
portable parallel ZBuffer, 2007.
[2] Paul S. Heckbert and Michael Herf. Simulating Soft Shadows with Graphics
Hardware. Carnegie Mellon University, Pittsburgh. January 15, 1997.
[3] Venkatraman Govindaraju, Peter Djeu, Karthikeyan Sankaralingam, Mary
Vernon, William R. Mark. Toward a Multicore Architecture for Real-time Ray-tracing.
The University of Texas at Austin, 2008.
[4] Gordon Stoll, William R. Mark, Peter Djeu, Rui Wang, Ikrima Elhassan. Razor: An
Architecture for Dynamic Multiresolution Ray Tracing. The University of Texas at
Austin. April 26, 2006.
[5] Kenneth I. Joy. The Depth-Buffer Visible Surface Algorithm. University of
California, 1996.
[6] Karthik Ramani, Christiaan P. Gribble, Al Davis. StreamRay: A Stream Filtering
Architecture for Coherent Ray Tracing. University of Utah, 2009.
[7] Nelson Max, Keiichi Ohsaki. Rendering Trees from Precomputed Z-Buffer Views.
University of California, Davis.
[8] Michael Wand, Matthias Fischer, Ingmar Peter, Friedhelm Meyer auf der Heide,
Wolfgang Straßer. The Randomized z-Buffer Algorithm: Interactive Rendering of
Highly Complex Scenes. Universities of Tübingen and Paderborn, 2001.
[9] www.whatis.com
[10] www.wikipedia.org
Acknowledgements:
We are grateful to our college, the HOD and our faculty mentor for all the support and
encouragement we have received from them. We would also like to thank Intel for giving us
an opportunity to present this paper.