Freeing Association: Visual Query Patterns for Multidimensional Data Dissection

Chris Weaver
School of Computer Science and the Center for Spatial Analysis, University of Oklahoma
The North-East Visualization and Analytics Center, Penn State University
weaver@cs.ou.edu

Analysis in Wonderland

Cheshire Cat: Oh, by the way, if you'd really like to know, he went that way.
Alice: Who did?
Cheshire Cat: The White Rabbit.
Alice: He did?
Cheshire Cat: He did what?
Alice: Went that way.
Cheshire Cat: Who did?
Alice: The White Rabbit.
Cheshire Cat: What rabbit?
Alice: But didn't you just say - I mean - Oh, dear.
(from “Alice’s Adventures in Wonderland” by Charles Lutwidge Dodgson, 1865)

Overview
• The Goal
  – “Free association” in foraging
  – Exploration of relationships between ad hoc categories
• The Approach
  – Develop visual interactive techniques for expressing sequences of multidimensional set queries
  – Develop a “lab notebook” for recording, recalling, restoring, and relating queries and query sequences
• The Results (so far)
  – Three reusable high-level design patterns for visual analytics
    • Cross-Filtered Views
    • Cross-Highlighted Views
    • Attribute Relationship Graphs
  – A variety of concrete examples with demonstrated utility

VAST Challenge Tools
• boat (geospatial cross-filtering)
• cell (cascading time series, unfinished)
• evac (cross-highlighted motion traces)
• wiki (author+word co-occurrences)

Data Dissection

Methods for interactively expressing sequences of multidimensional set queries by visually and interactively associating unique data values across multiple views.
• Multiple views support selection over sets of unique attribute values in multiple raw or derived data columns, across one or more tables.
• Attributes are displayed in dimensionally appropriate view(s) that support a binary categorization of values by selection or navigation.
• Users can rapidly toggle dependencies between pairs of views to pose complex drill-down set queries: keep only those values in view B that co-occur in the data with the values selected in view A.
• Attributes are displayed in an entity-relationship view that shows co-occurrences (relationships) between values (entities).
• Users can rapidly toggle visibility of attributes and attribute relationships to dissect the data by slicing in and across data columns.
• Analysts can form hypotheses and follow chains of evidence by successive selection/deselection and filtering/unfiltering of values.

boat (cross-filtering)
a visualization of boat landings and interdictions
Data Source: VAST 2008 Challenge, Boats Mini-Challenge (synthetic)
Visualization Design: Chris Weaver, Michael Stryker, Ian Turton

Cross-Filtering Queries
• Group (γ) data records into sets for each unique attribute value.
• Filter (φ) each set, keeping records whose attribute values match those selected in other views.
• Project/visually encode (π) each value and its filtered set.
• Select (σ) values/sets corresponding to brushed glyphs in the view.

Cross-Filtering Queries (Boat)
[Diagram: query graph with pre-filtering, grouping, and cross-filtering stages over the Encounters and Passengers tables. Pre-filtering restricts encounters to a date range (t_min <= t <= t_max). Labeled attributes: date, number of passengers, number of fatalities (casualties), vessel, resolution, ship, passenger name.]
• Two tables keyed on encounter ID (encounters, passengers)
• Seven effective drill-down dimensions
• “Variation filtering” of all dimensions on a range of dates

Cross-Filtering...
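
The group, filter, project, and select steps of a cross-filtering query, listed earlier, can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation behind the tools shown in these slides: the function names, the record fields, and the sample data are all hypothetical, and it assumes that an enabled filter keeps a record only when its value is in the source view's current selection.

```python
from collections import defaultdict

def cross_filter(records, target, selections, filters_on):
    """Return {value: filtered records} for the view showing `target`.

    selections: {attribute: set of selected values}, one entry per view
    filters_on: set of (source, target) pairs whose filter toggle is enabled
    (All names here are hypothetical, chosen for illustration.)
    """
    # Group: one list of records per unique value of the target attribute.
    groups = defaultdict(list)
    for rec in records:
        groups[rec[target]].append(rec)

    # Filter: keep records whose values match the selections in every
    # source view that currently filters this target view.
    sources = [s for (s, t) in filters_on if t == target]
    return {
        value: [r for r in recs
                if all(r[s] in selections.get(s, set()) for s in sources)]
        for value, recs in groups.items()
    }

def project(filtered):
    # Project: encode each value by the size of its filtered set.
    return {value: len(recs) for value, recs in filtered.items()}

# Hypothetical encounter records, loosely modeled on the boat example.
records = [
    {"vessel": "raft", "resolution": "landing", "ship": "A"},
    {"vessel": "raft", "resolution": "interdiction", "ship": "B"},
    {"vessel": "go-fast", "resolution": "landing", "ship": "A"},
]

# Select: the brushed values in each view; only "raft" is selected.
selections = {"vessel": {"raft"}, "resolution": set(), "ship": set()}

# The vessel view filters the resolution view; no other toggles are on.
filters_on = {("vessel", "resolution")}

resolution_counts = project(
    cross_filter(records, "resolution", selections, filters_on))
ship_counts = project(cross_filter(records, "ship", selections, filters_on))
print(resolution_counts)
print(ship_counts)
```

In this sketch a value whose filtered set becomes empty stays in the result with a count of zero. A view that hides such values would show exactly the behavior discussed next: a value can remain selected while no longer being visible.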
“I wish you wouldn’t keep appearing and vanishing so suddenly; you make one quite giddy!”
“All right,” said the Cat; and this time it vanished quite slowly, beginning with the end of the tail, and ending with the grin, which remained some time after the rest of it had gone.
“Well! I’ve often seen a cat without a grin,” thought Alice; “but a grin without a cat! It’s the most curious thing I ever saw in all my life!”
(from “Alice’s Adventures in Wonderland” by Charles Lutwidge Dodgson, 1865)

...Has A Problem
• No automatic value deselection
  – Maintain selections to preserve analyst’s ad hoc groupings
  – Avoid filter cascades that terminate at fixed points/null sets
• But selected values can become invisible!
• Unexpected common case
  – Consequence of ‘upstream’ filtering, e.g. C on B then B on A
  – Multiple ways to happen when user “changes course” analytically
• Problem: An “out of sight, out of mind” effect?
  – Easy to forget items that are selected but invisible
  – Leads to misinterpretation when using visible state of views to remember query clauses
• Solution: Preserve context by assuring visibility of selected items?

evac (cross-highlighting)
a visualization of movements of RFID-carrying health care workers and visitors
Data Source: VAST 2008 Challenge, Evacuation Mini-Challenge (synthetic)
Visualization Design: Chris Weaver and Anthony Robinson

Cross-Highlighting Queries (Evac)
[Diagram: highlight layer only]

But Crossing Doesn’t Help with...

“Who cares for you?” said Alice, (she had grown to her full size by this time.) “You’re nothing but a pack of cards!”
At this, the whole pack rose up into the air, and came flying down upon her.
(from “Alice’s Adventures in Wonderland” by Charles Lutwidge Dodgson, 1865)

...Ambiguity of Association
• Example: cross-filter ships, vessels, resolutions on passengers.
• Which passenger(s) on each ship? Using each vessel type? For each kind of resolution?
• Crossed queries are multidimensional and disjunctive.
• Visual states reflect many-to-many (to-many ...) relationships, but only show entities.
• Drill into relationships using more cross-filtering...over the set of all entity subsets? Tedious!
• Directly visualize co-occurrences?

wiki (attribute relationship graph)
a visualization of authors and language over time in a wiki edit history
Data Source: VAST 2008 Challenge, Wiki Edit Mini-Challenge (synthetic)
Visualization Design: Chris Weaver, Chi-chun Pan, Don Pellegrino, Prasenjit Mitra

Attribute Relationship Graphs

A multigraph displays attribute values and their cliques: sets of co-occurrences across pairs of data columns. Users dynamically filter nodes, edges, and packs (families of nodes in convex hull wrappers) by selecting particular columns as well as arbitrary subsets of unique values in those columns.

Design Variations

Row | KEDS | Hotels | Retrosheet | Cinegraph
Attributes
Nominal | code (event) | name (guest) | name (home team, away team) | name (movie, genre, oscar, person, role)
Temporal | date (event) | date (visit) | date & time (game) | date (release)
Spatial | region (countries) | location (hotel, residence) | location (stadium) | -
Numerical | cooperative/conflictual weight | - | capacity, attendance, temperature, wind speed | box office, rating average, rating count
Auxiliary Views
Pre-filter | list (data sources) | - | - | sliders (ratings & roles thresholds)
Post-filter | map (world) | map (Pennsylvania) | map (North America), rich drill-down table | attribute relationship graph
Detail | drill-down table, split time series | drill-down table | - | movie viewer
Nested | scatter plot (date vs. weight) | 1-D heatmap (visit count by date) | 1-D heatmap (game count by date) | histogram (rating distribution)

And If We Don’t Remember Our Queries...

Then she began looking about, and noticed that what could be seen from the old room was quite common and uninteresting, but that all the rest was as different as possible. . . . “They don’t keep this room so tidy as the other,” thought Alice to herself.
(from “Through the Looking-Glass” by Charles Lutwidge Dodgson, 1871)

...Are We Doomed to Repeat Them?
• Easy to get distracted by new questions
  – All three patterns are dangerously “effective” this way
  – Chains of interaction reveal many intriguing paths for exploration
• Easy to forget
  – past queries
  – earlier query clauses expressed in a chain of interactions
• So lots of foraging, not enough sense-making
• Need visual history
• Hmmm...interaction is simple, general, and happens at a moderate level of abstraction...

Idea: Queries to Questions
• “Who, what, where, when” output interface that provides a summary of query sequences
• Map queries and results into a pseudo natural language
  – Designer-specified rule-based text generation
  – Parallel illustration with restorable live snapshots of visual state
  – Visual highlighting of text that involves attribute values and types
  – Inline graphics for enumerating longer sets, à la sparklines
• Cross designs are particularly amenable to such mappings?
  – Highly symmetric in form
  – Medium level of abstraction
  – Simple interactions (select, toggle filter, toggle graph element)
  – Most interactions drive interesting, capturable query transitions

Harder Than It Seems?

“Let’s consider your age to begin with—how old are you?” asked the White Queen.
“I’m seven and a half exactly,” said Alice.
“You needn’t say ‘exactly’,” the Queen remarked: “I can believe it without that. Now I’ll give you something to believe. I’m just one hundred and one, five months and a day.”
“I can’t believe that!” said Alice.
(from “Through the Looking-Glass” by Charles Lutwidge Dodgson, 1871)

Thanks!
• Alan M. MacEachren
• Donna Peuquet
• Anthony Robinson
• Students/staff at NEVAC and the GeoVISTA Center
• Collaborators on various CFV/CHV/ARG applications
  – Hotels: Deryck Holdsworth & David Fyfe (Penn State/GeoVISTA)
  – REMO/Evac: Patrick Laube
  – Multiple: The VAST Challenge 2008 team from NEVAC