Parallel Coordinates ++ CS 4460/7450 - Information Visualization Feb. 2, 2010 John Stasko Last Time • Viewed a number of techniques for portraying low-dimensional data (about 3<x<20) scatterplot matrix Table Lens sliding rods Attribute Explorer Dust & Magnet etc. Spring 2010 CS 4460/7450 2 1 Parallel Coordinates • What are they? Explain… Spring 2010 CS 4460/7450 3 Parallel Coordinates V1 V2 V3 V4 V5 D1 7 3 4 8 1 D2 2 7 6 3 4 D3 9 8 1 4 2 Spring 2010 CS 4460/7450 4 2 Parallel Coordinates 10 V1 V2 V3 V4 V5 D1 7 9 3 4 8 1 8 7 6 5 4 3 2 1 0 Spring 2010 CS 4460/7450 5 Parallel Coordinates 10 V1 V2 V3 V4 V5 D2 2 9 7 6 3 4 8 7 6 5 4 3 2 1 0 Spring 2010 CS 4460/7450 6 3 Parallel Coordinates 10 V1 V2 V3 V4 V5 D3 9 9 8 1 4 2 8 7 6 5 4 3 2 1 0 Spring 2010 CS 4460/7450 7 Parallel Coordinates Encode variables along a horizontal row Vertical line specifies different values that variable can take Data point represented as a polyline V1 V2 Spring 2010 V3 V4 V5 CS 4460/7450 8 4 Parallel Coords Example Basic Grayscale Color Spring 2010 CS 4460/7450 9 Issue • Different variables can have values taking on quite different ranges • Must normalize all down (e.g., 0->1) Spring 2010 CS 4460/7450 10 5 Application • System that uses parallel coordinates for information analysis and discovery • Interactive tool Can focus on certain data items Color Taken from: A. Inselberg, “Multidimensional Detective” InfoVis „97, 1997. Spring 2010 CS 4460/7450 11 Discuss • What was their domain? • What was their problem? • What were their data sets? Spring 2010 CS 4460/7450 12 6 The Problem • VLSI chip manufacture • Want high quality chips (high speed) and a high yield batch (% of useful chips) • Able to track defects • Hypothesis: No defects gives desired chip types • 473 batches of data Spring 2010 CS 4460/7450 13 The Data • 16 variables X1 - yield X2 - quality X3-X12 - # defects (inverted) X13-X16 - physical parameters Spring 2010 CS 4460/7450 14 7 Parallel Coordinate Display yield & quality defects parameters Yikes! But not that bad Distributions x1 - normal x2 - bipolar Spring 2010 CS 4460/7450 15 Top Yield & Quality split defects Have some defects Spring 2010 CS 4460/7450 16 8 Minimal Defects Not the highest yields and quality Spring 2010 CS 4460/7450 17 CS 4460/7450 18 Best Yields Appears that some defects are necessary to produce the best chips Non-intuitive! Spring 2010 9 XmdvTool Toolsuite created by Matthew Ward of WPI Includes parallel coordinate views Spring 2010 CS 4460/7450 19 ParVis System Demo http://www.mediavirus.org/parvis/ Spring 2010 CS 4460/7450 20 10 Challenges Too much data Out5d dataset (5 dimensions, 16384 data items) Spring 2010 CS 4460/7450 (courtesy of J. Yang) 21 Dimensional Reordering Which dimensions are most like each other? Same dimensions ordered according to similarity Yang et al InfoVis ‟03 Spring 2010 CS 4460/7450 22 11 Dimensional Reordering Can you reduce clutter and highlight other interesting features in data by changing order of dimensions? Peng et al InfoVis „04 Spring 2010 CS 4460/7450 Reducing Density 23 Jerding and Stasko, ‟95, ‟98 Wegman & Luo, ‟96 Artero et al, 04 Johansson et al, „05 Spring 2010 CS 4460/7450 24 12 Improved Interaction • How do we let the user select items of interest? • Obvious notion of clicking on one of the polylines, but how about something more than that Spring 2010 CS 4460/7450 25 Attribute Ratios • Angular Brushing Select subsets which exhibit a correlation along 2 axes by specifying angle of interest Hauser, Ledermann, & Doleisch InfoVis „02 (earlier demo) Spring 2010 CS 4460/7450 26 13 Range Focus • Smooth Brushing Specify a region of interest along one axis Spring 2010 CS 4460/7450 27 Combining • Composite Brushing Combine brushes and DOI functions using logical operators Spring 2010 CS 4460/7450 28 14 Video http://www.vrvis.at/via/research/ang-brush/parvis4.mov Spring 2010 CS 4460/7450 29 Application http://www.syracuse.com/news/index.ssf/2010/01/data_mining_helps_new_york_cat.html Spring 2010 CS 4460/7450 30 15 Different Kinds of Data • How about categorical data? Can parallel coordinates handle that well? Spring 2010 CS 4460/7450 31 Parallel Sets • Visualization method adopting parallel coordinates layout but uses frequencybased representation • Visual metaphor Layout similar to parallel coordinates Continuous axes replaced with boxes • Interaction User-driven: User can create new Kosara, Bendix, & Hauser classifications TVCG „05 Spring 2010 CS 4460/7450 32 16 Representation Color used for different categories Those values flow into the other variables Spring 2010 CS 4460/7450 33 CS 4460/7450 34 Example Titanic passengers data set Spring 2010 17 Titanic Data Set Spring 2010 CS 4460/7450 35 CS 4460/7450 36 Interactions Spring 2010 18 Video Spring 2010 CS 4460/7450 InfoVis „05 37 Star Plots Var 1 Var 5 Var 2 Value Var 4 Each “spoke” encodes a variable‟s value Var 3 Alternative Rep. Spring 2010 Space out the n variables at equal angles around a circle Data point is now a “shape” CS 4460/7450 38 19 Star Plot examples http://seamonkey.ed.asu.edu/~behrens/asu/reports/compre/comp1.html Spring 2010 CS 4460/7450 39 Star Coordinates • Same ideas as star plot • Rather than represent point as polyline, just accumulate values along a vector parallel to particular axis • Data case then becomes a point Spring 2010 CS 4460/7450 40 20 Star Coordinates E. Kandogan, “Star Coordinates: A Multi-dimensional Visualization Technique with Uniform Treatment of Dimensions”, InfoVis 2000 Late-Breaking Hot Topics, Oct. 2000 Spring 2010 Demo CS 4460/7450 41 Star Coordinates • Data cases with similar values will lead to clusters of points • (What‟s the problem though?) • Multi-dimensional scaling or projection down to 2D • Good segue to next time… Spring 2010 CS 4460/7450 42 21 Parallel Coordinates • Technique Strengths? Weaknesses? Spring 2010 CS 4460/7450 43 Design Project • For graduate students • Group of 2-4 students Can argue for singleton • First milestone: Teams and topics in 2 weeks • Choose from list of topics Can argue for your own Spring 2010 CS 4460/7450 44 22 Upcoming • High-dimensional data representations Reading: Yang et al, InfoVis ‟04 • InfoVis Systems & Toolkits Reading Viegas et al, TVCG „07 Spring 2010 CS 4460/7450 45 23