History of GPUs

Prepared 5/24/2011 by T. O’Neil for 3460:677, Fall 2011, The University of Akron.

Make great images:
- Intricate shapes
- Complex optical effects
- Seamless motion

Make them fast:
- Invent clever techniques
- Use every trick imaginable
- Build monster hardware

[Image credit: Eugene d’Eon, David Luebke, Eric Enderton, in Proc. EGSR 2007 and GPU Gems 3]

History of GPUs – Slide 2

[Figure: the classic graphics pipeline: Vertex Transform & Lighting → Triangle Setup & Rasterization → Texturing & Pixel Shading → Depth Test & Blending → Framebuffer]

History of GPUs – Slide 3

[Figure: the same pipeline diagram]

History of GPUs – Slide 4

[Figure: the pipeline diagram, Vertex Transform & Lighting stage highlighted]
- Transform from “world space” to “image space”
- Compute per-vertex lighting

History of GPUs – Slide 5

[Figure: the pipeline diagram, Triangle Setup & Rasterization stage highlighted]
- Convert geometric representation (vertex) to image representation (fragment)
- Interpolate per-vertex quantities across pixels

History of GPUs – Slide 6

[Figure: the same pipeline diagram]

History of GPUs – Slide 7

[Figure: simplified pipeline: Vertex → Rasterize → Pixel → Test & Blend → Framebuffer]
- Key abstraction of real-time graphics
- Hardware used to look like this
- One chip/board per stage
- Fixed data flow through the pipeline

History of GPUs – Slide 8

- Everything fixed function, with a certain number of modes
- Number of modes for each stage grew over time
- Hard to optimize hardware
- Developers always wanted more flexibility

History of GPUs – Slide 9

- Remains a key abstraction
- Hardware used to look like this
- Vertex and pixel processing became programmable; new stages added
- GPU architecture increasingly centers around shader execution

History of GPUs – Slide 10

- Exposing an (at first limited) instruction set for some stages
- Limited instructions and instruction types, and no control flow at first
- Expanded to a full ISA

History of GPUs – Slide 11

Workload and programming model provide lots of parallelism:
- Applications provide large groups of vertices at once
  - Vertices can be processed in parallel
  - Apply the same transform to all vertices
- Triangles contain many pixels
  - Pixels from a triangle can be processed in parallel
  - Apply the same shader to all pixels
- Very efficient hardware to hide serialization bottlenecks

History of GPUs – Slide 12

[Figure: vertices (Vrtx 0–2) and pixels (Pixel 0–3) flowing through parallel Vertex → Raster → Pixel → Blend pipelines]

History of GPUs – Slide 13

Note that we do the same thing for lots of pixels/vertices.

[Figure: one control unit driving many ALUs, i.e., SIMD execution]
- A warp = 32 threads launched together
- Usually execute together as well
  (see the CUDA sketches following slide 23 below)

History of GPUs – Slide 14

All this performance attracted developers. To use GPUs, they re-expressed their algorithms as general-purpose computations, using GPUs and the graphics API in applications other than 3-D graphics:
- Pretend to be graphics: disguise data as textures or geometry, disguise the algorithm as render passes
- Fool the graphics pipeline into doing computation, exploiting the massive parallelism of the GPU
- GPU accelerates the critical path of the application

History of GPUs – Slide 15

Data parallel algorithms leverage GPU attributes:
- Large data arrays, streaming throughput
- Fine-grain SIMD parallelism
- Low-latency floating point (FP) computation

Applications (see http://GPGPU.org):
- Game effects (FX), physics, image processing
- Physical modeling, computational engineering, matrix algebra, convolution, correlation, sorting

History of GPUs – Slide 16

Dealing with the graphics API:
- Working with the corner cases of the graphics API
Addressing modes:
- Limited texture size/dimension
Shader capabilities:
- Limited outputs
Instruction sets:
- Lack of integer & bit ops
Communication limited:
- Between pixels
- Scatter: a[i] = p

[Figure: the fragment-program register model: Input Registers, Texture, Constants, and Temp Registers feed a Fragment Program, which writes Output Registers and FB Memory; one instance per thread, per shader, per context]

History of GPUs – Slide 17

- To use GPUs, algorithms had to be re-expressed as graphics computations
- Very tedious, limited usability
- Still produced some very nice results
- This was the lead-up to CUDA

History of GPUs – Slide 18

CUDA:
- General-purpose programming model
  - User kicks off batches of threads on the GPU
  - GPU = dedicated super-threaded, massively data parallel co-processor
- Targeted software stack
  - Compute-oriented drivers, language, and tools

History of GPUs – Slide 19

Driver for loading computation programs into the GPU:
- Standalone driver, optimized for computation
- Interface designed for compute: graphics-free API
- Data sharing with OpenGL buffer objects
- Guaranteed maximum download & readback speeds
- Explicit GPU memory management

History of GPUs – Slide 20

[Figure: the CPU (host) connected to a GPU with local DRAM (the device)]

History of GPUs – Slide 21

- 8-series GPUs deliver 25 to 200+ GFLOPS on compiled parallel C applications
- Available in laptops, desktops, and clusters
- GPU parallelism is doubling every year
- Programming model scales transparently

[Images: GeForce 8800, Tesla D870]

History of GPUs – Slide 22

- Programmable in C with CUDA tools
- Multithreaded SPMD model uses application data parallelism and thread parallelism

[Image: Tesla S870]

History of GPUs – Slide 23
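Before summarizing, a few CUDA sketches make the preceding ideas concrete. First, slide 12’s “apply the same transform to all vertices” as a kernel with one thread per vertex. This is a minimal sketch; the names (transformVertices, m, in, out) are illustrative, not from the original slides.

    // Minimal sketch: the per-vertex work of slide 5 expressed as a CUDA
    // kernel, one thread per vertex, every thread applying the same 4x4
    // transform (slide 12). All names are illustrative.
    __global__ void transformVertices(const float *m,   // 4x4 row-major matrix
                                      const float4 *in, // world-space vertices
                                      float4 *out,      // image-space results
                                      int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            float4 v = in[i];
            out[i] = make_float4(
                m[0]*v.x  + m[1]*v.y  + m[2]*v.z  + m[3]*v.w,
                m[4]*v.x  + m[5]*v.y  + m[6]*v.z  + m[7]*v.w,
                m[8]*v.x  + m[9]*v.y  + m[10]*v.z + m[11]*v.w,
                m[12]*v.x + m[13]*v.y + m[14]*v.z + m[15]*v.w);
        }
    }

This is exactly the fixed-function work of slide 5 re-expressed as software.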
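Second, the warp grouping of slide 14 is visible from inside a kernel via the CUDA built-in warpSize (32 on the hardware discussed here). Again a minimal sketch; the kernel name and output arrays are hypothetical.

    // Each block's threads are divided into 32-thread warps (slide 14);
    // threads in the same warp normally execute each instruction together.
    __global__ void warpIds(int *warpOf, int *laneOf, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            warpOf[i] = threadIdx.x / warpSize;  // which warp within the block
            laneOf[i] = threadIdx.x % warpSize;  // lane within the warp (0..31)
        }
    }

Branches that diverge within a warp are serialized, which is why the slides stress doing the same thing for lots of pixels/vertices.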
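Third, pulling slides 17, 19, and 20 together: a complete minimal program that explicitly manages GPU memory, kicks off a batch of threads, and performs the scatter a[i] = p that the graphics API could not express. A sketch only: names are illustrative and error checking is omitted.

    #include <stdio.h>
    #include <cuda_runtime.h>

    // Scatter (slide 17): each thread writes to a computed location.
    __global__ void scatter(int *a, const int *idx, const int *p, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) a[idx[i]] = p[i];
    }

    int main(void)
    {
        const int n = 1024;
        const size_t bytes = n * sizeof(int);
        int h_a[n], h_idx[n], h_p[n];
        for (int i = 0; i < n; ++i) { h_idx[i] = n - 1 - i; h_p[i] = i; }

        // Explicit GPU memory management (slide 20): allocate device DRAM,
        // download inputs, read results back when done.
        int *d_a, *d_idx, *d_p;
        cudaMalloc(&d_a, bytes);
        cudaMalloc(&d_idx, bytes);
        cudaMalloc(&d_p, bytes);
        cudaMemcpy(d_idx, h_idx, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(d_p,   h_p,   bytes, cudaMemcpyHostToDevice);

        // Kick off a batch of threads on the GPU (slide 19): 4 blocks of 256.
        scatter<<<(n + 255) / 256, 256>>>(d_a, d_idx, d_p, n);

        cudaMemcpy(h_a, d_a, bytes, cudaMemcpyDeviceToHost);
        printf("a[0] = %d (expected %d)\n", h_a[0], n - 1);  // a is p reversed

        cudaFree(d_a); cudaFree(d_idx); cudaFree(d_p);
        return 0;
    }

Compared with slide 17, there is no disguising data as textures or algorithms as render passes; the scatter is a single assignment.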
Summary:
- GPUs evolve as hardware and software evolve
- The five-stage graphics pipeline
- An example of GPGPU
- Intro to CUDA

History of GPUs – Slide 24

Reading: Chapter 2, “Programming Massively Parallel Processors” by Kirk and Hwu.

Based on original material from:
- The University of Illinois at Urbana-Champaign: David Kirk, Wen-mei W. Hwu
- The University of Minnesota: Weijun Xiao
- Stanford University: Jared Hoberock, David Tarjan

Revision history: last updated 5/24/2011.

History of GPUs – Slide 25