Interactive Information Visualization of One Million Items
Jean-Daniel Fekete
University of Maryland

Scaling issues in Information Visualization
• Seeing more data items or more dimensions
• No aggregation, no sampling
• What are the limits?
  • Technical
    • screen resolution / dimension, 10ms redisplay speed
  • Perceptual
    • visual system accuracy, perception-action loop speed
  • Cognitive
    • how much can we understand and how long does it take?

Visualizing one million items
• Treemap of a Unix file system containing 1 million files
• Rectangle sizes related to file sizes
• Color coded by type: red=executable, blue=text, green=image, yellow=program, gray=unknown
• What can we see?
  • Two similar patterns = two versions of the mathlab system
  • Gray rectangle is a bug, temporary files taking 10% of the www space
  • Blue and green patterns are web pages (www site)
  • Image repository for PhotoMesa

Techniques
• Use accelerated graphics with OpenGL
  • 2GHz Pentium4
  • 1600x1200 pixels resolution
  • Now off-the-shelf!
• Push existing visualization techniques to their limits
  • Space filling (treemaps)
  • Overlapping (scatter plots)

Relying on Accelerated Graphics
• Balance the CPU/GPU work
• GPU can perform many operations “for free”
  • Geometric transformations
  • Color transformations
  • Color interpolation
  • Translucency
  • Counting overlaps
• CPU prepares data and sends it to GPU
• Bottleneck is communication
[Diagram: CPU → GPU → Screen]

Relying on Accelerated Graphics
• Breaks the 10^6 barrier
  • 1 million items at interactive speed
• Permits use of animation
  • E.g. for understanding view transitions
• But requires:
  • optimizing algorithms
  • using unusual programming techniques
  • adapting visualization techniques

Example of Adapted Visualization Techniques
• No rectangle outlines
  • Spares pixels
  • Avoids sending the geometry twice
• Color shading
  • Separate similar items
  • “Free” with accelerated graphics cards

Animated Transitions

Dynamic Labeling

Conclusion
• You can now break the 10^6 barrier!
  • Was limited to 10^4
  • E.g. can visualize the phylogenic tree of species
• Still technically limited by graphics hardware, but close to the perceptual limits
  • New IBM screen with 10 million pixels
• Need more work to understand how humans can make sense of this amount of data
• Send your 10^6 data sets!

Credits
• Thanks to HCIL for inviting me and providing the rich environment for this work
• Thanks to Catherine Plaisant, Ben Shneiderman and Ben Bederson for their help and advice
• www.cs.umd.edu/hcil/millionvis
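
Backup: Treemap Layout Sketch
The slides describe a treemap whose rectangle sizes follow file sizes, drawn on a 1600x1200 screen. The sketch below is only an illustration of that idea, not the talk's implementation: it assumes a slice-and-dice layout (one classic treemap layout, chosen here for brevity; the slides do not say which layout was used), and the names `slice_and_dice`, `total_size`, and `pixels_per_item` as well as the toy file tree are hypothetical. It also computes the pixel budget per item, which hints at why one million items sits close to the screen-resolution limit.

```python
from typing import Dict, List, Tuple, Union

# Assumed toy model: a leaf is a file size in bytes, a dict is a directory.
Tree = Union[int, Dict[str, "Tree"]]
Rect = Tuple[float, float, float, float]  # x, y, width, height


def total_size(node: Tree) -> int:
    """Sum of all file sizes under a node."""
    if isinstance(node, dict):
        return sum(total_size(child) for child in node.values())
    return node


def slice_and_dice(node: Tree, rect: Rect, depth: int = 0,
                   path: str = "") -> List[Tuple[str, Rect]]:
    """Return (path, rectangle) for every file.

    Each directory splits its rectangle among its children in proportion
    to their total size; the split axis alternates with depth.
    """
    if not isinstance(node, dict):
        return [(path, rect)]
    x, y, w, h = rect
    size = total_size(node)
    leaves: List[Tuple[str, Rect]] = []
    offset = 0.0  # fraction of the parent already handed out
    for name, child in node.items():
        frac = total_size(child) / size if size else 0.0
        if depth % 2 == 0:   # even depth: cut along the x axis
            child_rect = (x + offset * w, y, frac * w, h)
        else:                # odd depth: cut along the y axis
            child_rect = (x, y + offset * h, w, frac * h)
        leaves.extend(
            slice_and_dice(child, child_rect, depth + 1, f"{path}/{name}"))
        offset += frac
    return leaves


def pixels_per_item(width: int, height: int, items: int) -> float:
    """Average number of screen pixels available to each item."""
    return width * height / items


if __name__ == "__main__":
    toy = {"bin": {"ls": 100, "cat": 100}, "www": {"index.html": 200}}
    for path, (x, y, w, h) in slice_and_dice(toy, (0, 0, 1600, 1200)):
        print(f"{path}: x={x:.0f} y={y:.0f} w={w:.0f} h={h:.0f}")
    print(pixels_per_item(1600, 1200, 1_000_000))
```

On the 1600x1200 screen named in the slides, one million items leave just under two pixels per item on average, which is consistent with the slides' choice to drop rectangle outlines in order to spare pixels.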