Dr A Sahu
Dept of Comp Sc & Engg., IIT Guwahati

• Intel 945 motherboard architecture
  – GMCH
• NVIDIA GPU architecture
• ATI Radeon architecture
• DirectX, OpenGL, OpenCL
• Advanced GPUs from ATI and AMD
  – Introduction to NVIDIA CUDA programming

• From 20 Oct 2010 to 26 Oct 2010, class will be held in the allocated Room 1201
• There will be a multimedia workshop in this seminar room, 20-26 Nov 2010

• Graphics and Memory Controller Hub (GMCH)
• Graphics Interface (GI) and PCI Express for graphics card support
• Host Interface (HI)
  – Connects to the processor and supports HT, IntrDelivery, 12 in-order queue, etc.
• System Memory Interface (SMI)
  – Connected to two-channel DDR2
• Direct Media Interface (DMI)
  – Connects to ICH7

[Figure: Intel 945 block diagram. Labels: Intel Pentium D Processor; 82945 GMCH/MCH (North Bridge) with Intel GMA 950 Graphics; DDR2 (two channels); PCI Express x16 Graphics; Support for Media Ext Card]

• Peripherals: HD monitor
• Interfaces, intermediate hardware
  – NVIDIA GPU card
• Interfaces, intermediate software/program
  – NVIDIA GPU driver

• Character display (80x25 characters, 5x7 pixels each = 400x175)
• CRT monitor (400x600, 640x480, 600x800)
• LCD monitor (1024x768, 1280x1024, …)
• Graphics are visually more appealing
• Display line, circle, rectangle, curve, polygon
  – Characters are drawn using these primitives
  – TrueType font

[Figure: example primitives. Labels: RED, ARROW, Circle]

[Figure: raster scan. A clock (CLK, 1024x768x50 Hz) drives a column counter (0 to 1023) and a row counter (0 to 767) that read the frame buffer out to a 1024x768-pixel LCD. Each frame buffer entry holds 8x3 = 24 bits (R, G, B).]
• The screen is refreshed 50 times a second

[Figure: 32-bit longword pixel layout]
  bits 31-24: alpha
  bits 23-16: red
  bits 15-8: green
  bits 7-0: blue
• Alpha represents premultiplied values (example values in the figure: 0.5, 0, 1, 0 and 0, 0.5, 0)
• The intensity of each color component within a pixel is an 8-bit value

• “Truecolor” graphics modes use 4 bytes per picture element
[Figure: VRAM byte layout mapped to the video screen]
  VRAM byte offset: 0  1  2  3  4  5  6  7  8  9  10 …
  Component:        B  G  R  A  B  G  R  A  B  G  R  …

• GPU: a specialized processor that accelerates 2D and 3D graphics primitive operations
• Lots of floating-point operations
• Accelerates primitives
  – Line, circle, polygon, mesh, projection, sphere, …

[Figure: programmable graphics pipeline]
  Main path: 3D application → 3D API commands → 3D API (OpenGL, DirectX/3D) → CPU-GPU boundary → GPU command & data stream → GPU command → vertex index stream → primitive assembly → assembled polygons, lines & points → rasterization and interpolation → pixel location stream → raster operations → pixel updates → frame buffer
  Vertex path: pretransformed vertices → programmable vertex processor → transformed vertices
  Fragment path: rasterized pretransformed fragments → programmable fragment processor → transformed fragments

[Figure: vertices (x, y, z) from the memory system → vertex shader (vertex processing) → pixel shader (pixel processing, reading texture memory) → pixels (R, G, B) → frame buffer]

The computing capacity of graphics processing units (GPUs) has improved exponentially over the past decade. NVIDIA released the CUDA programming model for GPUs. The CUDA programming environment applies the parallel processing capabilities of GPUs to medical image processing research.
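The truecolor pixel format and frame buffer described earlier (alpha, red, green, blue packed into one longword; bytes stored in VRAM as B, G, R, A; a 1024x768 screen refreshed 50 times a second) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and it assumes a little-endian longword and a row-major frame buffer, neither of which the slides state explicitly.

```python
import struct

WIDTH, HEIGHT = 1024, 768      # display size used in the slides
BYTES_PER_PIXEL = 4            # "truecolor": one byte each for B, G, R, A


def pack_argb(a, r, g, b):
    """Pack four 8-bit components into one 32-bit longword (alpha on top)."""
    for c in (a, r, g, b):
        if not 0 <= c <= 255:
            raise ValueError("each component must fit in 8 bits")
    return (a << 24) | (r << 16) | (g << 8) | b


def unpack_argb(pixel):
    """Split a 32-bit longword back into (a, r, g, b)."""
    return ((pixel >> 24) & 0xFF, (pixel >> 16) & 0xFF,
            (pixel >> 8) & 0xFF, pixel & 0xFF)


def vram_bytes(pixel):
    """Bytes as they would sit in VRAM, assuming little-endian: B, G, R, A."""
    return struct.pack("<I", pixel)


def pixel_offset(row, col, width=WIDTH):
    """Byte offset of pixel (row, col), assuming a row-major frame buffer."""
    return (row * width + col) * BYTES_PER_PIXEL


def framebuffer_bytes(width=WIDTH, height=HEIGHT):
    """Total size of a truecolor frame buffer in bytes."""
    return width * height * BYTES_PER_PIXEL


def scanout_bytes_per_second(width=WIDTH, height=HEIGHT,
                             bytes_per_pixel=3, refresh_hz=50):
    """Bytes read per second to refresh a 24-bit RGB screen at 50 Hz."""
    return width * height * bytes_per_pixel * refresh_hz
```

Under these assumptions a 1024x768 truecolor frame buffer occupies 3,145,728 bytes (3 MiB), and refreshing 24-bit pixels 50 times a second means reading about 118 million bytes per second.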
• CUDA (Compute Unified Device Architecture) cores: 480
• Graphics clock: 700 MHz
• Processor clock: 1401 MHz
• Texture fill rate: 42 billion/sec

• Microsoft® DirectX® 11 Support
  – DirectX 11 GPU with Shader Model 5.0 support, designed for ultra high performance in the new API’s key graphics feature
• NVIDIA® 3D Vision™ Surround Ready
  – Expand your games across three displays in full stereoscopic 3D for the ultimate “inside the game” experience. NVIDIA® Surround™ also supports triple-screen gaming with non-stereo displays.
• Interactive Ray Tracing
  – By tracing the path of light through a 3D scene, ray tracing creates spectacular, photo-realistic visuals. Get a glimpse into the future of gaming with ray tracing.
• 3-way NVIDIA SLI® Technology
  – Industry-leading 3-way NVIDIA SLI technology offers amazing performance scaling by implementing 3-way AFR (Alternate Frame Rendering) for the world’s premier gaming solution
• NVIDIA PhysX® Technology
  – Enables a totally new class of physical gaming interaction for a more dynamic and realistic experience with GeForce
• NVIDIA CUDA™ Technology
  – CUDA technology accelerates the most demanding tasks, such as video transcoding, physics simulation, and ray tracing
• 32x Anti-aliasing Technology
  – Lightning-fast, high-quality anti-aliasing at up to 32x sample rates obliterates jagged edges
• NVIDIA® PureVideo® HD Technology
  – The combination of HD video decode acceleration and post-processing delivers unprecedented picture clarity, smooth video, accurate color, and precise image scaling for movies and video
• PCI Express 2.0 support, dual-link DVI support, HDMI 1.4

Elements of the graphics pipeline:
1. A scene description: vertices, triangles, colors, lighting
2. Transformations that map the scene to a camera viewpoint
3. “Effects”: texturing, shadow mapping, lighting calculations
4. Rasterizing: converting geometry into pixels
5. Pixel processing: depth tests, stencil tests, and other per-pixel operations
Parameters controlling design of the pipeline:
1. Where is the boundary between CPU and GPU?
2. What transfer method is used?
3. What resources are provided at each step?
4. What units can access which GPU memory elements?

http://accelenation.com/?ac.id.123.2

• One of the first true 3D game cards
• Worked by supplementing a standard 2D video card
• Did not do vertex transformations: these were done in the CPU
• Did do texture mapping and z-buffering
[Figure: pipeline stages Vertex Transforms, Primitive Assembly, Rasterization and Interpolation, Raster Operations, Frame Buffer. Labels: CPU, PCI, GPU]

• Main innovation: shifting the transformation and lighting calculations to the GPU
• Allowed multi-texturing, giving bump maps, light maps, and others
• Faster AGP bus instead of PCI
[Figure: pipeline stages Vertex Transforms, Primitive Assembly, Rasterization and Interpolation, Raster Operations, Frame Buffer. Labels: AGP, GPU]

• For the first time, allowed a limited amount of programmability in the vertex pipeline
• Also allowed volume texturing and multisampling (for antialiasing)
[Figure: pipeline stages Vertex Transforms, Primitive Assembly, Rasterization and Interpolation, Raster Operations, Frame Buffer. Labels: AGP, GPU, small vertex shaders]
http://www.cis.upenn.edu/~suvenkat/700/

• This generation is the first generation of fully programmable graphics cards
• Different versions have different resource limits on fragment/vertex programs
[Figure: pipeline stages Vertex Transforms, Primitive Assembly, Rasterization and Interpolation, Raster Operations, Frame Buffer, with a Programmable Vertex Shader and a Programmable Fragment Processor. Label: AGP]

Not exactly a quantum leap, but…
• Simultaneous rendering to multiple buffers
• True conditionals and loops
• Higher precision throughput in the pipeline (64 bits end-to-end, compared to 32 bits earlier)
• PCIe bus
• More memory/program length/texture accesses

[Figure: pipeline stages] Modeling Transformations → Illumination (Shading) → Viewing Transformation (Perspective / Orthographic) → Clipping → Projection (to Screen Space) → Scan Conversion (Rasterization) → Visibility / Display

• Primitives are processed in a series of stages
• Each stage forwards its result on to the next stage
• The pipeline can be drawn and implemented in different ways
• Some stages may be in hardware, others in software
• Optimizations & additional programmability are available at some stages

Vertex shaders:
• Operate STRICTLY one vertex at a time
• Take the original position of the vertex and other properties (normal, color, lights, ...)
• Output the position and color of the vertex after processing, plus any other changed properties

Geometry shaders:
• Take a “primitive with adjacency” (so, more than one vertex) plus other properties
• Allow you to create new geometry from the original adjacency information

Fragment (pixel) shaders:
• Interpolate the vertices in a 2D primitive to fill each pixel contained in the primitive
• Texturing and many other kinds of computation can be done
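To make the stage-by-stage idea concrete, here is a minimal Python sketch of two of the stages described above: a per-vertex stage that sees exactly one vertex at a time, and a per-pixel stage that interpolates vertex colors across a 2D triangle. It is a toy software model, not how any GPU or graphics API implements these stages. All function names are hypothetical, pixels are assumed to sit at integer coordinates, and the geometry stage is left out.

```python
import math


def vertex_stage(vertex, offset):
    """Process ONE vertex: move its position, pass its color through."""
    (x, y), color = vertex
    dx, dy = offset
    return (x + dx, y + dy), color


def edge(a, b, p):
    """Signed area term telling which side of edge a->b the point p is on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def fragment_stage(triangle):
    """Interpolate vertex colors over every pixel the triangle covers."""
    (p0, c0), (p1, c1), (p2, c2) = triangle
    area = edge(p0, p1, p2)
    if area == 0:
        return {}                      # degenerate triangle covers nothing
    xs = [p[0] for p in (p0, p1, p2)]
    ys = [p[1] for p in (p0, p1, p2)]
    fragments = {}
    for y in range(math.floor(min(ys)), math.ceil(max(ys)) + 1):
        for x in range(math.floor(min(xs)), math.ceil(max(xs)) + 1):
            p = (x, y)
            # Barycentric weights; dividing by area handles either winding.
            w0 = edge(p1, p2, p) / area
            w1 = edge(p2, p0, p) / area
            w2 = edge(p0, p1, p) / area
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                fragments[p] = tuple(
                    w0 * a + w1 * b + w2 * c
                    for a, b, c in zip(c0, c1, c2))
    return fragments


def run_pipeline(vertices, offset):
    """Each stage forwards its result to the next stage."""
    transformed = [vertex_stage(v, offset) for v in vertices]
    return fragment_stage(transformed)
```

Because `vertex_stage` never looks at any other vertex, the list comprehension in `run_pipeline` could process all vertices independently, which is the property that lets hardware run many vertex programs in parallel.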
Modeling Transformations:
• 3D models are defined in their own coordinate system (object space)
• Modeling transforms orient the models within a common coordinate frame (world space)
[Figure: object space → world space]

• Process one vertex at a time
  – No information on other vertices
• Programmable
• Transformation
  – Global to eye coordinate system
• Lighting
  – Diffuse
  – Specular

Illumination (Shading):
• Vertices are lit (shaded) according to material properties, surface properties (normal) and light sources
• Local lighting model (Diffuse, Ambient, Phong, etc.)

Viewing Transformation (Perspective / Orthographic):
• Maps world space to eye space
• Viewing position is transformed to the origin & direction is oriented along some axis (usually z)
[Figure: world space → eye space]

Clipping:
• Transform to Normalized Device Coordinates (NDC)
• Portions of the object outside the view volume (view frustum) are removed
[Figure: eye space → NDC]

• Backface culling
  – Remove triangles facing away from the view
  – Eliminates ½ of the triangles in theory
• Clipping against the view frustum
  – Triangles may become quadrilaterals
• From floating point range [-1, 1] x [-1, 1] to integer range [0, height-1] x [0, width-1]

Projection (to Screen Space):
• The objects are projected to the 2D image plane (screen space)
[Figure: NDC → screen space]

Scan Conversion (Rasterization):
• Rasterizes objects into pixels
• Interpolate values as we go (color, depth, etc.)

• Fragment: corresponds to a single pixel and includes color, depth, and sometimes texture-coordinate values
• Compute color and depth for each pixel
• Most interesting part of the GPU

• Optional
  – (though hard to avoid)
• Cache data
  – Hide latency from FB (frame buffer)
• Sampling/filtering
  – I told you this last time

Visibility / Display:
• Each pixel remembers the closest object (depth buffer)

• Almost every step in the graphics pipeline involves a change of coordinate system. Transformations are central to understanding 3D computer graphics.

The traditional pipeline: modeling → animation → rendering
The new pipeline?: 3D scanning → motion capture → image-based rendering
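Two of the later steps are easy to illustrate in code: the mapping from the floating point range [-1, 1] x [-1, 1] to integer pixel indices, and the depth buffer in which each pixel remembers the closest object. The Python sketch below is a minimal model under stated assumptions: the names `ndc_to_screen` and `DepthBuffer` are hypothetical, the y axis is not flipped, values are rounded to the nearest pixel, and a smaller depth value is taken to mean "closer".

```python
import math


def ndc_to_screen(x, y, width, height):
    """Map NDC (x, y) in [-1, 1] to integer (row, col) pixel indices."""
    if not (-1.0 <= x <= 1.0 and -1.0 <= y <= 1.0):
        raise ValueError("point lies outside the view volume; clip it first")
    # Scale [-1, 1] to [0, size-1], then round to the nearest integer.
    col = math.floor((x + 1.0) / 2.0 * (width - 1) + 0.5)
    row = math.floor((y + 1.0) / 2.0 * (height - 1) + 0.5)
    return row, col


class DepthBuffer:
    """Per-pixel record of the closest fragment seen so far."""

    def __init__(self, width, height, background=(0, 0, 0)):
        self.depth = [[math.inf] * width for _ in range(height)]
        self.color = [[background] * width for _ in range(height)]

    def write(self, row, col, depth, color):
        """Keep the fragment only if it is closer than what is stored."""
        if depth < self.depth[row][col]:
            self.depth[row][col] = depth
            self.color[row][col] = color
            return True
        return False
```

A short usage example: fragments may arrive in any order, and the buffer still ends up holding the nearest one.

```python
zb = DepthBuffer(width=5, height=3)
row, col = ndc_to_screen(0.0, 0.0, width=5, height=3)
zb.write(row, col, 0.8, "far object")
zb.write(row, col, 0.3, "near object")
zb.write(row, col, 0.5, "middle object")   # rejected: 0.3 is already closer
```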