Sort-Last Parallel Rendering for Viewing Extremely Large Data Sets on Tile Displays
Paper by Kenneth Moreland, Brian Wylie, and Constantine Pavlakos
Presented by Adam Howard
CS594 Spring 2002, Dr. Jian Huang

Research Focus/Goal
Focus:
• Develop highly scalable rendering techniques.
Goal:
• Drive multiple tile displays with frame rates that are comparable to a single tile display using a sort-last based parallel algorithm that scales appropriately with large data sets.

Background
Sort-Last Parallel Rendering:
• Combines images after rasterization occurs.

Overview of Approach
The Situation:
Requirement (Output):
• Target resolution of 12 million pixels or more.
Problem:
• Beyond capabilities of a single commodity computer.
Solution:
• Tiled display where input comes from graphics engines on different computers.
The Approach:
Rather than render a single high-resolution image, each processor generates images for the tiles that make up the display. More precisely, use N processors to render and compose a large data set with T different projections, one for each display tile.

System Organization
Diagram labels: VIEWS, Communication, Tile Display, Projector, Data Set, RiCky, System Area Network
Node Equipment: Compaq 750, nVidia GeForce 256 graphics card

System Organization
Design Strategy:
• Allow any number N of processors to contribute to rendering the T tile images, as long as N >= T.
Software:
• Intended to draw polygons that are evenly distributed amongst all the processors.

Main Point of Paper
Compositing Strategies:
• Serial
• Virtual Trees
• Tile Split and Delegate
• Reduce to Single Tile

Serial
Compose T images for a tile display by serially running a composition algorithm for a single display T times.
Weakness:
• No advantage from spatial coherence.
• Poor load balancing.

Virtual Trees
• Based on Binary Tree Algorithm.
• Each tile image has a tree that has processors assigned to it.
• Processors assigned to more than one tree start another job when they finish one.
• Processor scheduling is very important.
• Weakness: Poor load balancing.

Tile Split and Delegate
Attempt to achieve better load balancing throughout compositing.
Extension of the direct send algorithm.
Load balancing is ensured.
Weakness: Large amount of message passing. Number of messages is O(N^2).

Reduce to Single Tile
Attempt to reduce the problem to that of composing a single image in the same manner as traditional sort-last parallel rendering systems.
Before compositing begins, each processor holds between zero and T images for separate tiles. The goal is for each processor to have one image for a particular tile.
Advantages:
• Good load balancing.
• Fewer messages: on the order of O(N*T + N log N).

Optimizations
• Bucketing: Reduce number of polygons sent to graphics hardware.
• Active Pixel Encoding: Reduce amount of information passed over the network.
• Floating Viewport: Reduce number of times a polygon is rendered and the number of times the frame buffer is read back.

Experimental Results
The serial strategy has good results when the data is not spatially coherent.
The Tile Split and Delegate and Reduce to Single Tile strategies were the best for spatially coherent data.
There is a tradeoff between display resolution and rendering time.
Other parallel cluster systems can render larger data sets faster, but not at this level of resolution.

Conclusion
The results support the initial goal of increasing resolution by rendering to a tiled display using a cluster of commodity computers. They also support the desire for scalability, such as larger data sets or higher resolution displays.

Pretty Pictures

Bucketing
Reduce number of polygons sent to the graphics hardware by estimating which polygons can be ignored.
• Before rendering, each processor's polygons are grouped into several 3D regions called buckets. This occurs when the data is loaded during initialization.
• Before each tile image is rendered, the buckets are tested to determine which lie in the tile.
• Only the polygons in these buckets are rendered.
Weakness: A large number of buckets reduces rendering time, but increases overhead in determining screen projections. The authors ended up using a moderate number of buckets to reduce rendering time.

Active Pixel Encoding
This method reduces the amount of information that is sent across the network by making a distinction between active pixels, which contain geometric information, and inactive pixels, which do not. Active pixel information is longer than inactive pixel information, so representing inactive pixels compactly reduces the overall overhead of message passing between processors.

Floating Viewport
Here a virtual tile is created to encompass an entire polygon. After processing, it is split and each piece is displayed directly on each real tile it actually covers. Hence the system does not need to render any polygon more than once, and the frame buffer is read back one time instead of four. This is most effective as the ratio of tiles to processors decreases and when the data has good spatial coherence.
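
To make the Active Pixel Encoding idea more concrete, here is a minimal Python sketch. The slides only say that active and inactive pixels are distinguished, so the details are assumptions made for illustration: the run-length scheme, the (r, g, b, depth) pixel layout, the use of None for an inactive pixel, and the function names encode_active_pixels and decode_active_pixels are all hypothetical and not taken from the paper.

```python
from typing import List, Optional, Tuple, Union

# Hypothetical pixel layout: (r, g, b, depth). None marks an inactive pixel,
# i.e. one that no geometry was rendered into.
Pixel = Tuple[int, int, int, float]
Run = Tuple[str, Union[int, List[Pixel]]]


def encode_active_pixels(pixels: List[Optional[Pixel]]) -> List[Run]:
    """Group a scanline into alternating runs.

    Inactive runs are stored as a count only; active runs keep their full
    pixel data. This is an assumed run-length scheme, not the paper's format.
    """
    runs: List[Run] = []
    i, n = 0, len(pixels)
    while i < n:
        j = i
        if pixels[i] is None:
            # Extend over consecutive inactive pixels and store just the length.
            while j < n and pixels[j] is None:
                j += 1
            runs.append(("inactive", j - i))
        else:
            # Extend over consecutive active pixels and store their data.
            while j < n and pixels[j] is not None:
                j += 1
            runs.append(("active", pixels[i:j]))
        i = j
    return runs


def decode_active_pixels(runs: List[Run]) -> List[Optional[Pixel]]:
    """Rebuild the original scanline from the runs."""
    pixels: List[Optional[Pixel]] = []
    for kind, payload in runs:
        if kind == "inactive":
            pixels.extend([None] * payload)
        else:
            pixels.extend(payload)
    return pixels
```

With this sketch, a scanline that is mostly background collapses to a handful of runs, and decode_active_pixels(encode_active_pixels(row)) returns the original row. The saving grows with the fraction of inactive pixels, which is the situation the slides describe for spatially coherent data spread over many tiles.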