Visualization for Information Analysis and Exploration
John Stasko
Information Interfaces Research Group
School of Interactive Computing
Georgia Institute of Technology
Sept. 17, 2008

Exercise
• Get out pencil & paper

Data Explosion
• Society is more complex
  – There simply is more “stuff”
• Computers, internet and web give people access to an incredible amount of data
  – news, sports, financial, purchases, etc...

Data Overload
• Confound: How to make use of the data
  – How do we make sense of the data?
  – How do we harness this data in decision-making processes?
  – How do we avoid being overwhelmed?

The Challenge
• Transform the data into information (understanding, insight) thus making it useful to people

Premise of my Work
• Visualization of data helps people understand it better

Human Vision
• Highest bandwidth sense
  – ~100 MB/s
  – Parallel
  – Strong pattern recognition
  – Much done preattentively, i.e., without thought

Visualization
• Definition
  – “The use of computer-supported, interactive visual representations of data to amplify cognition.”
• From [Card, Mackinlay, Shneiderman ’98]

Visualization
• Often thought of as process of creating a graphic or an image
• Really is a cognitive process
  – Form a mental image of something
  – Internalize an understanding
• “The purpose of visualization is insight, not pictures”
  – Insight: discovery, decision making, explanation, analysis, exploration, learning

Main Idea
• Visuals help us think
  – Provide a frame of reference, a temporary storage area
• Cognition → Perception
• Pattern matching
• External cognition aid
  – Role of external world in thinking and reason
Larkin & Simon ’87; Card, Mackinlay, Shneiderman ’98

When to Apply?
• Many other techniques for data analysis
  – Data mining, DB queries, machine learning…
• Visualization most useful in exploratory data analysis
  – Don’t know what you’re looking for
  – Don’t have a priori questions
  – Want to know what questions to ask

Part of our Culture
• “I see what you’re saying”
• “Seeing is believing”
• “A picture is worth a thousand words”

Some quick (static) examples…

NYC Weather
2220 numbers
E. Tufte, Visual Display of Quant Info

London Subway
www.thetube.com

True Geography
www.kottke.org/plus/misc/images/tubegeo.gif

Easy Walking Lines Added
rodcorp.typepad.com/photos/art_2003/tube_walklines_final_lmfaint.html

Atlanta Flight Traffic
Atlanta Journal April 30, 2000

InfoVis ’07

Reinforce my point with two examples

Questions:
Which cereal has the most/least potassium?
Is there a relationship between potassium and fiber?
If so, are there any outliers?
Which manufacturer makes the healthiest cereals?

[Figure; axis labels: Fiber, Potassium]

Even Tougher?
• What if you could only see one cereal’s data at a time? (e.g. some websites)
• What if I read the data to you?

Four Data Sets
• Mean of the x values = 9.0
• Mean of the y values = 7.5
• Equation of the least-squares regression line is: y = 3 + 0.5x
• Sum of squared deviations of x (about the mean) = 110.0
• Regression sum of squares (variance accounted for by x) = 27.5
• Residual sum of squares (about the regression line) = 13.75
• Correlation coefficient = 0.82
• Coefficient of determination = 0.67
http://astro.swarthmore.edu/astro121/anscombe.html

The Data Sets

The Values
   Set 1          Set 2          Set 3          Set 4
   x      y       x      y       x      y       x      y
10.0   8.04    10.0   9.14    10.0   7.46     8.0   6.58
 8.0   6.95     8.0   8.14     8.0   6.77     8.0   5.76
13.0   7.58    13.0   8.74    13.0  12.74     8.0   7.71
 9.0   8.81     9.0   8.77     9.0   7.11     8.0   8.84
11.0   8.33    11.0   9.26    11.0   7.81     8.0   8.47
14.0   9.96    14.0   8.10    14.0   8.84     8.0   7.04
 6.0   7.24     6.0   6.13     6.0   6.08     8.0   5.25
 4.0   4.26     4.0   3.10     4.0   5.39    19.0  12.50
12.0  10.84    12.0   9.13    12.0   8.15     8.0   5.56
 7.0   4.82     7.0   7.26     7.0   6.42     8.0   7.91
 5.0   5.68     5.0   4.74     5.0   5.73     8.0   6.89

Revisit Starting Exercise
• What did you put on paper?

Two Related Disciplines
• Information Visualization
• Visual Analytics

Information Visualization
• Using interactive computer visualizations to represent and communicate abstract data
  – Statistics, databases, software, …
• Area emerged approximately 1990

Information Visualization
• Recent research trends
  – InfoVis for the Masses
  – Challenges of evaluation
  – Interaction is crucial

Visual Analytics
• Informal: Using visual representations to help make decisions
• Formal: The science of analytical reasoning facilitated by interactive visual interfaces
• InfoVis++
• Area emerged approximately 2005

Overview of the R&D Agenda
• Challenges
• Science of Analytical Reasoning
• Science of Visual Representations and Interactions
• Data Representations and Transformations
• Production, Presentation, and Dissemination
• Moving Research Into Practice
• Positioning for an Enduring Success

Visual Analytics: Beyond InfoVis
• Statistics, data representation and statistical graphics
• Geospatial and Temporal Sciences
• Applied Mathematics
• Knowledge representation, management and discovery
• Ontology, semantics, NLP, extraction, synthesis, …
• Cognitive and Perceptual Sciences
• Communications: Capture, illustrate and present a message
• Decision sciences

Academic Context
Information Visualization ~1990
Visual Analytics ~2005

IEEE VAST

IEEE InfoVis

Sensemaking
“A motivated, continuous effort to understand connections (which can be among people, places, and events) in order to anticipate their trajectories and act effectively.”
– Klein, Moon and Hoffman

Jigsaw
• Visualization for Investigative Analysis across Document Collections

The Jigsaw Team
Gennadiy Stepanov
Sarah Williams
Neel Parekh
Kanupriyah Singhal
Carsten Görg
Zhicheng Liu
Vasili Pantazopoulos
+ 4 new students

[Figure: Pirolli & Card, ICIA ’05]

Pain Points
• Cost structure of scanning and selecting items for further attention
• Analysts’ span of attention for evidence and hypotheses

Problem Addressed
• Help investigative analysts discover plans, plots and threats embedded across the individual documents in large document collections
[Figure labels: Documents/case reports, Blogs, DBs]

Example Document

Our Focus
• Entities within the documents
  – Person, place, organization, phone number, date, license plate, etc.
• Thesis: A plot/threat within the documents will involve a set of entities in coordination

Entity Identification
• Must identify and extract entities from plain text documents
  – Crucial for our work
  – Collaborate with or use tools from others
• Not our main research focus

Entities Identified

Connections
• Entities relate/connect to each other to make a larger “story”
• Connection definition:
  – Two entities are connected if they appear in a document together
  – The more documents they appear in together, the stronger the connection

“Putting the pieces together”

Jigsaw
• Multiple visualizations (views) of documents, entities, & their connections
• Views are highly interactive and coordinated
• User actions generate events that are transmitted to and (possibly) reflected in other views

System Views

The Need for Pixels

Demo

Console

Document View

List View

Graph View

Scatterplot View

Calendar View

Report Cluster View

Timeline View

Shoebox

Trial Use
• Transitioning system to real clients

Future Work
• Entity Identification
• Wikipedia & Intellipedia
• Web search & situational awareness
• Display wall?
• Deployment
• Evaluation
• Geospatial View
• Present/browse investigation history
• Scalability issues
• Collaborative version
• Connectivity search
• Enhanced evidence marshalling
• Themes/concepts
• Other types of data
• Reliability/uncertainty

Take Away Point
• Design your visualization systems and tools to facilitate analysis and exploration
  – Not to just illustrate and reconfirm existing knowledge
• Including flexible, useful interaction is one of the best ways to do this

To Learn More
• http://www.gvu.gatech.edu/ii

Acknowledgment
• Some slides in this presentation borrowed from overviews of visual analytics by Jim Thomas, NVAC Director

Acknowledgments
• Work conducted as part of the Southeastern Regional Visualization and Analytics Center, supported by DHS and NVAC
• Supported by NSF IIS-0414667

End
• Thanks for your attention!
• Questions?
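
Supplementary note on the Four Data Sets example: the summary statistics quoted on that slide can be recomputed from the listed values. The sketch below is not part of the talk; it uses the x, y values from the deck and ordinary least-squares formulas, and the names `DATA_SETS` and `summarize` are made up for illustration. It assumes the 110.0 figure on the slide is the sum of squares of x about its mean, which is what the arithmetic gives. All four sets should come out with nearly the same numbers (to about two decimals), which is why the statistics alone do not reveal how different the sets look when plotted.

```python
from math import sqrt

# x values shared by data sets 1-3, as listed in the deck.
X_SHARED = [10.0, 8.0, 13.0, 9.0, 11.0, 14.0, 6.0, 4.0, 12.0, 7.0, 5.0]

# The four data sets as (x values, y values), copied from the deck.
DATA_SETS = {
    1: (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: ([8.0] * 7 + [19.0] + [8.0] * 3,
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return the summary statistics named on the slide for one data set."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Ordinary least-squares line y = intercept + slope * x.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    regression_ss = slope * sxy          # variation in y explained by x
    residual_ss = syy - regression_ss    # variation left about the line
    r = sxy / sqrt(sxx * syy)            # correlation coefficient
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "slope": slope,
        "intercept": intercept,
        "sxx": sxx,
        "regression_ss": regression_ss,
        "residual_ss": residual_ss,
        "r": r,
        "r_squared": r * r,
    }


if __name__ == "__main__":
    for label, (xs, ys) in DATA_SETS.items():
        stats = summarize(xs, ys)
        print(label, {name: round(value, 2) for name, value in stats.items()})
```

Plotting each set as a scatterplot is the natural next step; the code above only checks the numbers.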
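
Supplementary note on the Connections slide: the connection definition (two entities are connected if they appear in a document together, and more shared documents means a stronger connection) can be sketched as a co-occurrence count. This is an illustration only, not Jigsaw's code; the function names and the toy documents are hypothetical, and entity extraction is assumed to have already happened.

```python
from collections import Counter
from itertools import combinations


def connection_strengths(doc_entities):
    """Count, for each unordered entity pair, the documents containing both.

    doc_entities maps a document id to the entities found in that document.
    """
    strengths = Counter()
    for entities in doc_entities.values():
        # set() so an entity repeated within one document counts once;
        # sorted() so each pair is stored in one canonical order.
        for a, b in combinations(sorted(set(entities)), 2):
            strengths[(a, b)] += 1
    return strengths


def strength(strengths, a, b):
    """Look up a pair regardless of argument order (0 means not connected)."""
    return strengths[tuple(sorted((a, b)))]


# Hypothetical toy collection; real input would come from entity extraction.
DOCS = {
    "report_1": ["Alice Smith", "Atlanta", "555-0100"],
    "report_2": ["Alice Smith", "Atlanta"],
    "report_3": ["Bob Jones", "Atlanta"],
}

if __name__ == "__main__":
    counts = connection_strengths(DOCS)
    for pair, n_docs in counts.most_common():
        print(pair, n_docs)
```

Counting documents rather than mentions follows the slide's wording; a weighting scheme that normalizes by how common each entity is would be a different design and is not claimed here.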
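
Supplementary note on the coordinated views: the statement that user actions generate events that are transmitted to and (possibly) reflected in other views describes a publish/subscribe arrangement. The sketch below shows one generic way to wire that up. It is an assumption-laden illustration, not Jigsaw's architecture: `EventBus`, `View`, the event fields, and the view names are all invented here.

```python
class EventBus:
    """Minimal publish/subscribe hub shared by all views (hypothetical)."""

    def __init__(self):
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def publish(self, event):
        # Every subscribed view receives every event.
        for listener in self._listeners:
            listener(event)


class View:
    """A view that broadcasts user actions and may reflect others' events."""

    def __init__(self, name, bus, handled_types):
        self.name = name
        self.bus = bus
        self.handled_types = set(handled_types)  # event types this view reflects
        self.selected = None
        bus.subscribe(self.on_event)

    def user_select(self, entity):
        """Simulate a user selecting an entity in this view."""
        self.selected = entity
        self.bus.publish({"type": "select", "entity": entity, "source": self.name})

    def on_event(self, event):
        if event["source"] == self.name:
            return  # ignore our own events
        if event["type"] in self.handled_types:
            self.selected = event["entity"]  # reflect the action here
        # Otherwise the event is received but not reflected ("possibly").


if __name__ == "__main__":
    bus = EventBus()
    list_view = View("list", bus, {"select"})
    graph_view = View("graph", bus, {"select"})
    calendar_view = View("calendar", bus, set())  # receives but ignores selects
    list_view.user_select("Atlanta")
    print(graph_view.selected, calendar_view.selected)
```

Letting each view decide which event types to reflect keeps the views loosely coupled: a new view can be added without changing the existing ones.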