Workshop on Parallel Visualization and Graphics

Chromium
Mike Houston, Stanford University
and The Chromium Community

How Chromium works
Replaces system’s OpenGL driver
• Industry standard API
• Support existing unmodified applications
Manipulates streams of API commands
• Alter/inject/discard commands and parameters
• Route commands over a network
• Render commands using graphics hardware
• State tracking
Allows parallel applications to issue OpenGL
• Constrain ordering between multiple streams

Graphics Stream Processing
Treat OpenGL calls as a stream of commands
Form a DAG of stream transformation nodes
• Nodes are computers in a cluster
• Edges are OpenGL API communication
Each node has a serialization stage and a transformation stage

Stream Serialization
Convert multiple streams into a single stream
Efficiently context-switch between streams
Constrain ordering using Parallel OpenGL extensions [Igehy98]
Two kinds of serializers:
• Network server: S
• Application: A
  • Unmodified serial application
  • Custom parallel application
[Diagram label: OpenGL]

Stream Transformation
Serialized stream is dispatched to “Stream Processing Units” (SPUs)
Each SPU is a shared library
• Exports a (partial) OpenGL interface
Each node loads a chain of SPUs at run time
SPUs are generic and interchangeable

SPU Chains
SPUs are loaded as parts of linear chains
Common usage: intercept a few OpenGL calls, pass all others to downstream SPU
Useful for simple state changes, such as “wireframe” drawing

Output Scalability (Sort-First)
[Diagram: App; Server, Display; Server, Display; Server, Display; ...]
Larger displays with unmodified applications
Other possibilities: broadcast, ring network

Example: Sort-First
[Diagram: App, Tilesort; Server, Render; Server, Render; ...; Server, Render]

Input Scalability (Sort-Last)
[Diagram: App; App; ...; App; Server, Display]
Parallel geometry extraction
Parallel data submission

Example: Sort-Last
[Diagram: Application, Readback, Send; Application, Readback, Send; ...; Application, Readback, Send; Server, Render]
Application runs directly on graphics hardware
Same application can use sort-last or sort-first

SPU Inheritance
The Readback and Render SPUs are related
• Readback renders everything except SwapBuffers
Readback inherits from the Render SPU
• Override parent’s implementation of SwapBuffers
• All OpenGL calls considered “virtual”

Readback’s SwapBuffers
    void RB_SwapBuffers(void)
    {
        /* Read back the tile this SPU has just rendered locally. */
        self.ReadPixels( 0, 0, w, h, ... );

        /* Pass the pixels to the downstream (child) SPU. The P/V
           semaphore calls bracket the draw into the shared stream. */
        child.Clear( GL_COLOR_BUFFER_BIT );
        child.SemaphorePCR( READBACK_SEMAPHORE );
        child.RasterPos2i( tileX, tileY );
        child.DrawPixels( w, h, ... );
        child.SemaphoreVCR( READBACK_SEMAPHORE );
        child.SwapBuffers( );
    }
Easily extended to include depth composite
All other functions inherited from Render SPU

More Complicated Example: Hybrid
[Diagram: App, Tilesort, Server, Readback, Send; App, Tilesort, Server, Readback, Send; App, Tilesort, Server, Readback, Send; ...]
[Diagram, continued: Server, Render]

Networks Supported
TCP/UDP
Myrinet
Quadrics
InfiniBand (coming soon)

New Things in Chromium
Extensions
DMX Support
Display list management (DLM)
VNC Support
Dale’s talk
CRUT

Extensions
GL_ARB_fragment_program
GL_ARB_vertex_program
GL_NV_fragment_program
GL_NV_vertex_program
GL_NV_texture_rectangle
GL_EXT_shadow_funcs
GL_EXT_texture_rectangle
GL_IBM_raster_pos_clip

DMX Support
DMX
• Distributed Multi-headed X
• Single X session across multiple displays
OpenGL through Chromium
• Chromium “DMX aware”
• Moving/resizing = retiling
• M to N rendering

DMX In Action

Display List Management
Display List Manager (DLM)
• State tracking is really tricky
• Replay state calls on client
• Call list on servers
• Bounding box tracking of display list
Future optimizations
• Avoid broadcasting data in display list
• Send calls once per server as needed

VNC
X forwarding
• Forwards GLX calls to client
DRI bypasses X
• Can’t get pixel data
OpenGL apps load Chromium
• Render on local host
• Read back pixel data
• Send to user’s display

What are people doing with Chromium?

Dynamic Screen Calibration

Quake3 Arena
Niederauer, et al.

Viewed in a new way
Niederauer, et al.

Architectural Analysis
Intercept geometry
Determine floor positions
Change to orthographic view
Insert clip planes at the ceilings
Split floors apart
Multi-pass rendering
“Non-Invasive Interactive Visualization of Dynamic Architectural Environments”
Christopher Niederauer, Mike Houston, Maneesh Agrawala, Greg Humphreys
ACM SIGGRAPH 2003 Symposium on Interactive 3D Graphics

Batch Scheduler Integration
Pittsburgh Supercomputing Center
Offline rendering to a webpage
Use massive compute resources
Rendering with Vis cluster
Integrate support with RMS

Terascale Computing System Summary
[Diagram: Control; Compute Nodes; Interactive; File Servers /home; WAN/LAN; Switched ethernet; Quadrics; Mass Store; Archive; Viz]
• 750 Compute Nodes
• 3000 EV68 processors
• 6 Tf (peak, est >4Tf on LSMS)
• 3. TB memory
• 27 TB local disk
• Multi-rail fat-tree network
• Redundant monitor/ctrl
• WAN/LAN accessible
• Parallel visualization
• File servers: 30TB, ~32 GB/s
• Mass store, ~1 TB/hr buffer

Example
    qsub -l rmsnodes=3:12,other=visnodes=5
Job waits until 3 nodes (12 CPUs) become available AND 5 vis nodes are available
When resources are available, job runs
Visit vis web page for rendering

What’s coming in the next year?

General Improvements
Continue to track OpenGL changes
• Add extensions
Optimizations
• Display list management
• Tilesort
Software Compositors

PICA Support
Parallel Image Compositing API (PICA)
• API for hardware and software compositing
• Will be supported by most hardware compositors
Chromium support
• Hooks almost complete
• Need software compositors
  • Readback (N to 1)
  • Binary-swap
  • SLIC
• Need info from hardware folks

“Vis as a service”
Better integration with schedulers
• Reservation systems
• Compute/Render/Display
Distributed event model (CRUT)
Compression
• Geometry data
• Pixel data
Encryption

Look at how much was done last year!
4 releases
Constant bug fixes
Constant improvements
Constant optimizations
Chromium is supported by a large community!
Chromium is used in the real world!

Go get it!
http://chromium.sourceforge.net

Acknowledgements
The Chromium community
Greg Humphreys
Brian Paul
Joel Welling
Alan Hourihane
DOE!!!
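
Backup sketch: SPU inheritance as a dispatch table
The SPU Inheritance slide says that Readback inherits from the Render SPU, overrides only SwapBuffers, and that all OpenGL calls are treated as “virtual”. One way to picture this in C is a table of function pointers that the child copies from its parent and then patches. This is a minimal sketch under that assumption, not Chromium’s actual source: SpuTable, spu_inherit, make_readback_spu and the call log are hypothetical names invented for illustration, and the table holds two entries where a real SPU would export many more.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical two-entry dispatch table (illustration only). */
typedef struct SpuTable {
    void (*Clear)(unsigned mask);
    void (*SwapBuffers)(void);
} SpuTable;

/* Call log so the dispatch can be observed without a GL context. */
static char call_log[256];

static void log_call(const char *name)
{
    size_t used = strlen(call_log);
    snprintf(call_log + used, sizeof call_log - used, "%s;", name);
}

/* Stand-ins for the parent ("Render") implementations. */
static void render_Clear(unsigned mask) { (void)mask; log_call("render.Clear"); }
static void render_SwapBuffers(void)    { log_call("render.SwapBuffers"); }

/* Stand-in for the child's ("Readback") override. */
static void readback_SwapBuffers(void)  { log_call("readback.SwapBuffers"); }

static const SpuTable render_spu = { render_Clear, render_SwapBuffers };

/* Inheritance: the child starts as a copy of every parent entry. */
SpuTable spu_inherit(const SpuTable *parent)
{
    assert(parent != NULL);
    return *parent;
}

/* Readback = Render with SwapBuffers replaced; all else is inherited. */
SpuTable make_readback_spu(void)
{
    SpuTable t = spu_inherit(&render_spu);
    t.SwapBuffers = readback_SwapBuffers;
    return t;
}

const char *spu_call_log(void) { return call_log; }
void spu_reset_log(void)       { call_log[0] = '\0'; }
```

Calling Clear through the table returned by make_readback_spu reaches the parent’s function, while SwapBuffers reaches the override, which mirrors the “all other functions inherited from Render SPU” note on the Readback’s SwapBuffers slide.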