Information Visualization for Knowledge Discovery

Ben Shneiderman
ben@cs.umd.edu
@benbendc
Founding Director (1983-2000), Human-Computer Interaction Lab
Professor, Department of Computer Science
Member, Institute for Advanced Computer Studies
University of Maryland
College Park, MD 20742

Interdisciplinary research community
- Computer Science & Info Studies
- Psych, Socio, Poli Sci & MITH
(www.cs.umd.edu/hcil)

Design Issues
• Input devices & strategies
  • Keyboards, pointing devices, voice
  • Direct manipulation
  • Menus, forms, commands
• Output devices & formats
  • Screens, windows, color, sound
  • Text, tables, graphics
  • Instructions, messages, help
• Collaboration & Social Media
• Help, tutorials, training
• Visualization
• Search
www.awl.com/DTUI
Fifth Edition: 2010

Information Visualization
• Visual bandwidth is enormous
  • Human perceptual skills are remarkable
  • Trend, cluster, gap, outlier...
  • Color, size, shape, proximity...
• Three challenges
  • Meaningful visual displays of massive data
  • Interaction: widgets & window coordination
  • Process models for discovery

Business takes action
• General Dynamics buys MayaViz
• Agilent buys GeneSpring
• Google buys Gapminder
• Oracle buys Hyperion
• Microsoft buys Proclarity
• InfoBuilders buys Advizor Solutions
• SAP buys (Business Objects buys Xcelsius & Inxight & Crystal Reports)
• IBM buys (Cognos buys Celequest) & ILOG
• TIBCO buys Spotfire

Spotfire: Retinol’s role in embryos & vision
http://registration.spotfire.com/eval/default_edu.asp

10M - 100M pixels
Large displays for single or multiple users

100M-pixels & more

1M-pixels & less
Small mobile devices

Information Visualization: Mantra
• Overview, zoom & filter, details-on-demand (repeated ten times on the slide)

Information Visualization: Data Types
• 1-D Linear: Document Lens, SeeSoft, Info Mural
• 2-D Map: GIS, ArcView, PageMaker, Medical imagery
• 3-D World: CAD, Medical, Molecules, Architecture
• Multi-Var: Spotfire, Tableau, GGobi, TableLens, ParCoords
• Temporal: LifeLines, TimeSearcher, Palantir, DataMontage
• Tree: Cone/Cam/Hyperbolic, SpaceTree, Treemap
• Network: Pajek, JUNG, UCINet, SocialAction, NodeXL
(Group labels on the slide: SciViz, InfoViz)

infosthetics.com
flowingdata.com
infovis.org
www.infovis.net/index.php?lang=2

Anscombe’s Quartet

       1              2              3              4
   x      y       x      y       x      y       x      y
 10.0   8.04    10.0   9.14    10.0   7.46     8.0   6.58
  8.0   6.95     8.0   8.14     8.0   6.77     8.0   5.76
 13.0   7.58    13.0   8.74    13.0  12.74     8.0   7.71
  9.0   8.81     9.0   8.77     9.0   7.11     8.0   8.84
 11.0   8.33    11.0   9.26    11.0   7.81     8.0   8.47
 14.0   9.96    14.0   8.10    14.0   8.84     8.0   7.04
  6.0   7.24     6.0   6.13     6.0   6.08     8.0   5.25
  4.0   4.26     4.0   3.10     4.0   5.39    19.0  12.50
 12.0  10.84    12.0   9.13    12.0   8.15     8.0   5.56
  7.0   4.82     7.0   7.26     7.0   6.42     8.0   7.91
  5.0   5.68     5.0   4.74     5.0   5.73     8.0   6.89

Property             Value
Mean of x            9.0
Variance of x        11.0
Mean of y            7.5
Variance of y        4.12
Correlation          0.816
Linear regression    y = 3 + 0.5x

Temporal Data: TimeSearcher 1.3
• Time series
  • Stocks
  • Weather
  • Genes
• User-specified patterns
• Rapid search

Temporal Data: TimeSearcher 2.0
• Long time series (>10,000 time points)
• Multiple variables
• Controlled precision in match (linear, offset, noise, amplitude)

LifeLines: Patient Histories
www.cs.umd.edu/hcil/lifelines

LifeLines2: Contrast+Creatine

LifeLines2: Align-Rank-Filter & Summarize

LifeFlow: Aggregation Strategy
Figure panels: Temporal Categorical Data (4 records); LifeLines2 format; Tree of Event Sequences; LifeFlow Aggregation
www.cs.umd.edu/hcil/lifeflow

LifeFlow: Interface with User Controls

Treemap: Gene Ontology
+ Space filling
+ Space limited
+ Color coding
+ Size coding
- Requires learning
(Shneiderman, ACM Trans. on Graphics, 1992 & 2003)
www.cs.umd.edu/hcil/treemap/

Treemap: Smartmoney MarketMap
www.smartmoney.com/marketmap

Market falls steeply Feb 27, 2007, with one exception

Market falls steeply Sept 22, 2011, some exceptions

Market mixed, February 8, 2008
Energy & Technology up, Financial & Health Care down

Market rises, September 1, 2010, Gold contrarians

Market rises, March 21, 2011, Sprint declines

Treemap: Newsmap (Marcos Weskamp)
newsmap.jp

Treemap: Supply Chain
www.hivegroup.com

Treemap: Spotfire Bond Portfolio Analysis
www.spotfire.com

Treemap: NY Times – Car & Truck Sales
www.cs.umd.edu/hcil/treemap/

Treemap (Voronoi): NY Times - Inflation
www.nytimes.com/interactive/2008/05/03/business/20080403_SPENDING_GRAPHIC.html

State-of-the-art network visualization

Network from Database Tables
www.centrifugesystems.com

Discovery Process: Systematic Yet Flexible
Preparation
• Own the problem & define the schedule
• Data cleaning & conditioning
• Handle missing & uncertain data
• Extract subsets & link to related information

SocialAction
• Integrates statistics & visualization
• 4 case studies, 4-8 weeks (journalist, bibliometrician, terrorism analyst, organizational analyst)
• Participants identified desired features and gave strong positive feedback about the benefits of integration
www.cs.umd.edu/hcil/socialaction
Perer & Shneiderman, CHI2008, IEEE CG&A 2009

Footprints of Human Activity
• Footprints in sand at Caesarea

NodeXL: Network Overview for Discovery & Exploration in Excel
www.codeplex.com/nodexl

NodeXL: Import Dialogs
www.codeplex.com/nodexl

Tweets at #WIN09 Conference: 2 groups

WWW2010 Twitter Community

WWW2011 Twitter Community: Grouped

CHI2010 Twitter Community
www.codeplex.com/nodexl/

Flickr clusters for “mouse”
Computer, Mickey, Animal

Flickr networks

‘GOP’ tweets, clustered (red-Republicans)

Innovation Clusters: People, Locations, Companies
Figure labels: No Location; Philadelphia; Pittsburgh Metro; Pharmaceutical/Medical; Westinghouse Electric
Legend: Patent Tech; Navy SBIR (federal); PA DCED (state); Related patent
Legend (numbered): 2: Federal agency; 3: Enterprise; 5: Inventors; 9: Universities; 10: PA DCED; 11/12: Phil/Pitt metro cnty; 13-15: Semi-rural/rural cnty; 17: Foreign countries; 19: Other states

Analyzing Social Media Networks with NodeXL
I. Getting Started with Analyzing Social Media Networks
  1. Introduction to Social Media and Social Networks
  2. Social Media: New Technologies of Collaboration
  3. Social Network Analysis
II. NodeXL Tutorial: Learning by Doing
  4. Layout, Visual Design & Labeling
  5. Calculating & Visualizing Network Metrics
  6. Preparing Data & Filtering
  7. Clustering & Grouping
III. Social Media Network Analysis Case Studies
  8. Email
  9. Threaded Networks
  10. Twitter
  11. Facebook
  12. WWW
  13. Flickr
  14. YouTube
  15. Wiki Networks
www.elsevier.com/wps/find/bookdescription.cws_home/723354/description

Social Media Research Foundation
Researchers who want to
- create open tools
- generate & host open data
- support open scholarship
Map, measure & understand social media
Support tool projects to collect, analyze & visualize social media data.
smrfoundation.org

UN Millennium Development Goals
To be achieved by 2015
• Eradicate extreme poverty and hunger
• Achieve universal primary education
• Promote gender equality and empower women
• Reduce child mortality
• Improve maternal health
• Combat HIV/AIDS, malaria and other diseases
• Ensure environmental sustainability
• Develop a global partnership for development

29th Annual Symposium
May 23-24, 2012
www.cs.umd.edu/hcil

For More Information
• Visit the HCIL website for 400 papers & info on videos: www.cs.umd.edu/hcil
• Conferences & resources: www.infovis.org
• See Chapter 14 on Info Visualization:
  Shneiderman, B. and Plaisant, C., Designing the User Interface: Strategies for Effective Human-Computer Interaction: Fifth Edition (2010) www.awl.com/DTUI
• Edited Collections:
  Card, S., Mackinlay, J., and Shneiderman, B. (1999) Readings in Information Visualization: Using Vision to Think
  Bederson, B. and Shneiderman, B. (2003) The Craft of Information Visualization: Readings and Reflections

For More Information
• Treemaps
  • HiveGroup: www.hivegroup.com
  • Smartmoney: www.smartmoney.com/marketmap
  • HCIL Treemap 4.0: www.cs.umd.edu/hcil/treemap
• Spotfire: www.spotfire.com
• TimeSearcher: www.cs.umd.edu/hcil/timesearcher
• NodeXL: nodexl.codeplex.com
• Hierarchical Clustering Explorer: www.cs.umd.edu/hcil/hce
• LifeLines2: www.cs.umd.edu/hcil/lifelines2
• Similan: www.cs.umd.edu/hcil/similan
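
Appendix sketch: checking the Anscombe’s Quartet summary values

The Anscombe’s Quartet slides list four data sets and one shared table of summary properties (mean, variance, correlation, regression line). The following is a minimal sketch, not part of the original talk, that recomputes those properties from the data as transcribed from the slide. It assumes the slide's "variance" is the sample variance (n - 1 denominator) and that the regression is ordinary least squares; the names `ANSCOMBE` and `summarize` are illustrative only.

```python
from math import sqrt

# x values shared by data sets 1-3, in the row order of the data slide.
X123 = [10.0, 8.0, 13.0, 9.0, 11.0, 14.0, 6.0, 4.0, 12.0, 7.0, 5.0]

# Data set number -> (x values, y values), transcribed from the slide.
ANSCOMBE = {
    1: (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: ([8.0] * 7 + [19.0] + [8.0] * 3,
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return the summary properties named on the slide for one data set."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # least-squares slope
    intercept = mean_y - slope * mean_x    # least-squares intercept
    return {
        "mean_x": mean_x,
        "var_x": sxx / (n - 1),            # sample variance (assumption)
        "mean_y": mean_y,
        "var_y": syy / (n - 1),
        "corr": sxy / sqrt(sxx * syy),     # Pearson correlation
        "slope": slope,
        "intercept": intercept,
    }


if __name__ == "__main__":
    for key, (xs, ys) in ANSCOMBE.items():
        s = summarize(xs, ys)
        print(key, {name: round(value, 3) for name, value in s.items()})
```

Under these assumptions the four data sets agree on every listed property to within rounding, which is the point of the quartet: the summary numbers alone cannot distinguish data sets that look very different when plotted.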