High-Performance Rendering with Advanced Shading on Commodity Graphics Clusters

Motivation

The core problem of computer graphics is rendering: visually depicting a simulation or model through computational means. Powerful visualization is a vital enabling technology for applications like scientific computation and simulation, which stand to revolutionize science in the coming decades. Similarly, realistic rendering is what allows architectural visualization and film special effects to fulfill their purpose of depicting reality. Rendering is essential to making computational models and simulations tangible and accessible to the human mind, be that the mind of a scientist, an architect, or a movie-goer.

Real-world rendering systems include not only an encoding of geometry – the parameters of surfaces and volumes and their locations in space – but also a specialized encoding of the interaction of light with these surfaces, which yields their actual rendered appearance. The computation of this interaction is known as shading. The complexity of this interaction, and the high frequency with which it must be computed to render a single useful image, make realistic rendering extremely computationally intensive and leave advanced shading prohibitively expensive for applications, like scientific visualization, that require rapid rendered feedback.

Problem Statement

Complex shading accounts for over 90% of the computation of realistic or visually significant images. This computation has historically made realistic rendering extremely computationally expensive, and therefore slow. By contrast, the rendering of scientific simulations and models is most useful for aiding a researcher's comprehension and analysis when it is interactive, and therefore such systems have focused on fast, computationally affordable rendering techniques. By mapping complex shading efficiently to large parallel GPU systems, this proposal seeks to make complex rendering feasible at rates far faster than can be achieved today, greatly accelerating applications that require realistic rendering while making far greater realism and shading complexity accessible to applications that require rapid feedback. This research will also investigate the nature of the specialized hardware architectures it employs, and the complex shading techniques necessary for useful or realistic images.

Background and Proposed Work

The computer entertainment industry has revolutionized the economics of specialized graphics processors (GPUs). Having made simple interactive rendering ubiquitous in today's PCs, GPUs have become increasingly flexible and programmable in the last three years, and they are now sufficiently programmable to perform much of the computation necessary for highly complex shading and realistic rendering. Simultaneously, the mass-market economics of GPUs, combined with their massively parallel architecture, has made their performance scale far faster than CPU performance – doubling every 6-9 months, where CPUs double every 18-24 months – to the point where $30 GPUs are now 10 times as computationally powerful as the fastest CPUs on the market, which in turn cost 20 to 100 times as much [“Why are GPUs so fast?” table]. At the same time, the interconnection technologies that link GPUs to host systems and that link multiple systems together are rapidly improving. Over the next nine months these critical interconnects in commodity PCs will jump to between two and four times their previous performance [PCI Express vs. AGP, InfiniBand vs. Ethernet], and PCI Express will make it far easier to place multiple GPUs in a single system than AGP allows today. When next-generation interconnects enable extremely high-performance GPUs to work efficiently in parallel on complex shading computations, highly advanced and realistic rendering which today requires many minutes or hours on conventional CPUs will run at greatly accelerated rates – often at or near real time – on inexpensive commodity systems.
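To make these growth rates concrete, the following back-of-the-envelope sketch (Python; the three-year horizon is an illustrative assumption, and only the doubling periods come from the figures above) projects how quickly the GPU/CPU performance gap widens:

    # Rough projection of GPU vs. CPU performance growth using the doubling
    # periods cited above (6-9 months for GPUs, 18-24 months for CPUs).
    # The three-year horizon is an illustrative assumption, not a measurement.

    def projected_speedup(doubling_months: float, years: float) -> float:
        """Performance multiplier after `years`, given one doubling per `doubling_months` months."""
        return 2.0 ** (years * 12.0 / doubling_months)

    years = 3.0
    gpu_low, gpu_high = projected_speedup(9.0, years), projected_speedup(6.0, years)    # ~16x to ~64x
    cpu_low, cpu_high = projected_speedup(24.0, years), projected_speedup(18.0, years)  # ~2.8x to ~4x

    print(f"GPU growth over {years:.0f} years: {gpu_low:.1f}x - {gpu_high:.1f}x")
    print(f"CPU growth over {years:.0f} years: {cpu_low:.1f}x - {cpu_high:.1f}x")
    # Even under the conservative assumptions the GPU/CPU gap widens roughly
    # fourfold in three years; under the optimistic ones, by more than 20x.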
Many applications rely either on complex, realistic shading or on very rapid interactive rendering. Film special effects and architectural visualization employ highly complex models of the interaction of materials and light – shading – to convincingly reproduce reality, and in these applications shading accounts for 90-99% of the total rendering computation. They have historically been able to tolerate this inordinate cost because their images are computed once and do not change between successive presentations, and because in architecture nearly any computational cost is lower than that of building an accurate physical model or an actual building. Scientific simulation and visualization, by contrast, focus foremost on interaction with and comprehension of complex models and simulations, and therefore require a fast, dynamic rendering process.

By providing a quantitative leap in rendering performance, highly scalable hardware-accelerated shading systems will revolutionize the process and economics of film production, architectural visualization, and other applications of realistic rendering, enabling far more interactive design. Such systems also stand to revolutionize scientific visualization, and any other application requiring visualization at rapid, dynamic rates, by enabling far more advanced and realistic rendering, producing a radical qualitative leap in the detail, complexity, and realism of interactive imagery.

My undergraduate honors thesis research has focused on the efficient computation of advanced shading on GPUs for special effects applications. I set out to research a technological solution to the unsolved problem of interactive lighting design in computer graphics production. Through a combination of novel compiler analysis, precomputation, and cross-compilation, I have mapped critical portions of advanced film shading computations to GPUs for the special-purpose application of lighting design preview. Film-level shading and rendering is often far more complex than what can be rendered directly by conventional means on today's GPUs. My solution applies the concept of shader specialization to the observation that lighting designers need only recompute the portion of rendering and shading that depends on the parameters of the lights they are changing [MSR specializing shaders].
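As a simplified illustration of this idea (a sketch only: the shading model, parameter names, and caching scheme are hypothetical stand-ins, not the production shaders or compiler pipeline this work targets), a shader can be split into a light-independent part that is precomputed and cached, and a light-dependent part that is re-evaluated as the designer adjusts a light:

    # Illustrative sketch of shader specialization for lighting design preview.
    import numpy as np

    def precompute_light_independent(normals, albedo):
        """Terms that depend only on geometry and materials: computed once and
        cached while the designer is adjusting lights."""
        n = normals / np.linalg.norm(normals, axis=-1, keepdims=True)
        return {"n": n, "albedo": albedo}

    def shade_light_dependent(cache, light_dir, light_color, intensity):
        """The only work that must be redone when a light parameter changes."""
        n_dot_l = np.clip(cache["n"] @ light_dir, 0.0, None)        # diffuse falloff
        return intensity * light_color * cache["albedo"] * n_dot_l[:, None]

    # Usage: cache once, then re-shade interactively as a light is dragged around.
    normals = np.random.randn(1000, 3)                               # per-sample shading inputs
    albedo = np.random.rand(1000, 3)
    cache = precompute_light_independent(normals, albedo)
    for angle in (0.0, 0.3, 0.6):                                    # successive light positions
        light_dir = np.array([np.sin(angle), 0.0, np.cos(angle)])
        image = shade_light_dependent(cache, light_dir, np.array([1.0, 0.9, 0.8]), 2.0)

The point of the specialization is that only the second function runs as light parameters change, so its cost, rather than the full shader's, bounds the preview rate.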
My solution works on the principle that complex shading computations which might otherwise be impossible or inefficient on a GPU can be performed efficiently by segmenting the computation into independent sub-computations and separately analyzing the flow of data between them. Chan et al. have similarly mapped certain classes of complex shading computations to GPUs by subdividing the computation into multiple independent passes using a compiler technique they call RDS [RDS]. But where RDS focuses on partitioning lengthy or resource-intensive computations, my work enables broad classes of computation which were previously not addressed – such as data-dependent loops and conditionals – to be mapped to the GPU in many situations. Taken together, these techniques suggest a combined compiler approach that could subdivide complex shading computations in a more general fashion and map the full shading process – not just light-dependent computation – to clusters of parallel GPUs. Where RDS emulates a single long shading computation through a series of smaller passes on one GPU, the same idea can be generalized to emulate a single massive shading computation through a series of computational steps performed by collaborating nodes in a cluster, passing results from one step to the next like a bucket brigade. In this way, all the GPUs in a cluster could act in concert as a single massive shading processor.

The core challenge my research will address is performing advanced shading computation efficiently across many parallel GPUs. Additionally, because efficient GPU algorithms differ from known CPU algorithms, mapping advanced rendering to GPUs will reveal much about the nature of advanced shading and rendering. Likewise, applying GPUs and parallel systems to different, larger, and more complex computations than they have previously performed will suggest architectural enhancements that improve GPU performance across many novel applications.
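A minimal sketch of this bucket-brigade organization follows (the pass definitions, node count, and use of local processes in place of cluster nodes and GPU passes are all illustrative assumptions; real scheduling, load balancing, and interconnect transfer are the subject of the proposed research):

    # Illustrative bucket brigade: each "node" owns one pass of a partitioned
    # shading computation and forwards its intermediate results to the next node.
    # Local processes and trivial passes stand in for cluster nodes and GPU work.
    from multiprocessing import Process, Queue

    def run_pass(scale, batch):
        """Stand-in for one GPU pass produced by partitioning a large shader."""
        return [x * scale for x in batch]

    def node(scale, inbox, outbox):
        """Receive a tile of intermediate results, run this node's pass, and
        hand the results to the next node in the brigade."""
        while True:
            batch = inbox.get()
            if batch is None:                  # shutdown signal, passed downstream
                outbox.put(None)
                break
            outbox.put(run_pass(scale, batch))

    if __name__ == "__main__":
        scales = (2.0, 0.5, 10.0)              # three hypothetical shader passes
        queues = [Queue() for _ in range(len(scales) + 1)]
        workers = [Process(target=node, args=(s, queues[i], queues[i + 1]))
                   for i, s in enumerate(scales)]
        for w in workers:
            w.start()
        for tile in ([1.0, 2.0], [3.0, 4.0]):  # tiles of shading work enter the pipeline
            queues[0].put(tile)
        queues[0].put(None)
        while (result := queues[-1].get()) is not None:
            print(result)                      # fully shaded tiles emerge in order
        for w in workers:
            w.join()

Because every node works on a different tile at once, the pipeline's throughput is bounded by its slowest pass rather than by the sum of all passes.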
Conclusion

My graduate research will focus on the application of commodity GPU clusters to the acceleration and interactive rendering of advanced shading. Previous work has either performed highly simplified rendering at interactive rates on GPUs or parallel systems, or performed complex, realistic rendering on CPUs at rates orders of magnitude slower. By efficiently mapping realistic rendering and shading to GPUs, I will radically accelerate applications requiring highly realistic rendering – potentially to near-interactive rates – while simultaneously enabling advanced techniques from realistic rendering in applications requiring interactive rates. In so doing I will investigate both the nature of parallel rendering architectures and the nature and applications of advanced shading and rendering. Finally, by taking advantage of the far more rapid performance growth of GPUs compared to conventional CPUs, I will enable the realism and power of advanced rendering to scale far faster than it ever has before.

The Computer Graphics Group at Stanford University is an ideal team in which to perform this research. Under Pat Hanrahan, architect of the predecessor to nearly all of today's advanced shading systems, the group has a history at the pinnacle of advanced shading and rendering research, and it has more recently become a center of research in commodity parallel rendering systems and advanced applications of GPUs. Mike Houston continues to research scalable rendering on cluster systems and has just built the first next-generation GPU cluster of the kind I describe. Kekoa Proudfoot, Pradeep Sen, and Ren Ng developed the first general shading language for modern GPUs, including the RDS compiler technique, and continue this work. Tim Purcell created the first complete ray tracing and photon mapping renderers built on modern GPUs, and Ian Buck is exploring the programming of GPUs for general-purpose computation. The group is tightly connected with the leading application users in scientific computation, visualization, and entertainment, and so offers the strongest environment available in which to learn from these applications and maximize the impact of my research on real-world science, architecture, entertainment, and other applications.

Unused prose

Real-world objects look as they do not because of their "color," but because of the immense complexity of the microstructural interactions of material and light. Black plastic, black metal, and black hair look radically different because of differences in their surface microstructure, which arise from how they are molded, machined, or grown, and because of differences in their dielectric and other material properties. These interactions cannot practically be represented as geometry because they occur at scales orders of magnitude below the major geometric features of an object. Thus, the interaction of material and light requires a separate encoding, generally known as shading.

The entertainment industry is far ahead of the visualization community in practically simulating and visualizing reality, because its work must pass as reality before millions of critical eyes. To do so it has, above all, uniquely mastered the complexities of shading. But because of the complexity of realistically modeling the physical world, shading is massively computationally expensive, accounting for 90% or more of the total rendering computation – often taking hours to render a single filmic image. Entertainment has historically been able to tolerate this inordinately high cost because its sole focus is maximizing the realism of a single static presentation. Scientific and technical rendering, however, focus foremost on interaction with and comprehension of complex models and simulations, and therefore emphasize interactivity and feedback, necessitating a fast, dynamic rendering process.
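To make the first of these points concrete, a toy shading function (a rough Blinn-Phong-style sketch with hypothetical parameter values, far simpler than the material models actually used in film) shows how the same "color" yields very different appearances once microstructure-derived parameters enter the computation:

    # Illustrative only: a toy shading function showing why "color" alone cannot
    # distinguish materials. Parameter values are hypothetical placeholders.
    import numpy as np

    def shade(n, l, v, albedo, specular, shininess):
        """Very simple Blinn-Phong-style shading: the same albedo with different
        microstructure-derived parameters yields very different appearance."""
        n, l, v = (x / np.linalg.norm(x) for x in (n, l, v))
        h = (l + v) / np.linalg.norm(l + v)                   # half vector
        diffuse = albedo * max(np.dot(n, l), 0.0)
        highlight = specular * max(np.dot(n, h), 0.0) ** shininess
        return diffuse + highlight

    n, l, v = np.array([0, 0, 1.0]), np.array([0.3, 0, 1.0]), np.array([-0.2, 0, 1.0])
    black = np.array([0.02, 0.02, 0.02])                      # identical "color" for all
    print("black plastic:", shade(n, l, v, black, 0.5, 30))   # moderate glossy highlight
    print("black metal:  ", shade(n, l, v, black, 0.9, 200))  # tight mirror-like highlight
    print("black cloth:  ", shade(n, l, v, black, 0.05, 5))   # almost no highlight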