Information Visualization for Knowledge Discovery

Ben Shneiderman
ben@cs.umd.edu
@benbendc
Founding Director (1983-2000), Human-Computer Interaction Lab
Professor, Department of Computer Science
Member, Institute for Advanced Computer Studies
University of Maryland
College Park, MD 20742
Interdisciplinary research community
- Computer Science & Info Studies
- Psych, Socio, Poli Sci & MITH
(www.cs.umd.edu/hcil)
Design Issues
• Input devices & strategies
  - Keyboards, pointing devices, voice
  - Direct manipulation
  - Menus, forms, commands
• Output devices & formats
  - Screens, windows, color, sound
  - Text, tables, graphics
  - Instructions, messages, help
• Collaboration & Social Media
• Help, tutorials, training
• Visualization
• Search
www.awl.com/DTUI
Fifth Edition: 2010
Information Visualization
• Visual bandwidth is enormous
  - Human perceptual skills are remarkable
  - Trend, cluster, gap, outlier...
  - Color, size, shape, proximity...
• Three challenges
  - Meaningful visual displays of massive data
  - Interaction: widgets & window coordination
  - Process models for discovery
Business takes action
• General Dynamics buys MayaViz
• Agilent buys GeneSpring
• Google buys Gapminder
• Oracle buys Hyperion
• Microsoft buys Proclarity
• InfoBuilders buys Advizor Solutions
• SAP buys (Business Objects buys Xcelsius & Inxight & Crystal Reports)
• IBM buys (Cognos buys Celequest) & ILOG
• TIBCO buys Spotfire
Spotfire: Retinol’s role in embryos & vision
http://registration.spotfire.com/eval/default_edu.asp
10M - 100M pixels
Large displays for single or multiple users
100M-pixels & more
1M-pixels & less
Small mobile devices
Information Visualization: Mantra
• Overview, zoom & filter, details-on-demand
Information Visualization: Data Types
SciViz
• 1-D Linear: Document Lens, SeeSoft, Info Mural
• 2-D Map: GIS, ArcView, PageMaker, Medical imagery
• 3-D World: CAD, Medical, Molecules, Architecture
InfoViz
• Multi-Var: Spotfire, Tableau, GGobi, TableLens, ParCoords
• Temporal: LifeLines, TimeSearcher, Palantir, DataMontage
• Tree: Cone/Cam/Hyperbolic, SpaceTree, Treemap
• Network: Pajek, JUNG, UCINet, SocialAction, NodeXL
infosthetics.com
flowingdata.com
infovis.org
www.infovis.net/index.php?lang=2
Anscombe’s Quartet
      1             2             3             4
    x      y      x      y      x      y      x      y
 10.0   8.04   10.0   9.14   10.0   7.46    8.0   6.58
  8.0   6.95    8.0   8.14    8.0   6.77    8.0   5.76
 13.0   7.58   13.0   8.74   13.0  12.74    8.0   7.71
  9.0   8.81    9.0   8.77    9.0   7.11    8.0   8.84
 11.0   8.33   11.0   9.26   11.0   7.81    8.0   8.47
 14.0   9.96   14.0   8.10   14.0   8.84    8.0   7.04
  6.0   7.24    6.0   6.13    6.0   6.08    8.0   5.25
  4.0   4.26    4.0   3.10    4.0   5.39   19.0  12.50
 12.0  10.84   12.0   9.13   12.0   8.15    8.0   5.56
  7.0   4.82    7.0   7.26    7.0   6.42    8.0   7.91
  5.0   5.68    5.0   4.74    5.0   5.73    8.0   6.89
Property            Value
Mean of x           9.0
Variance of x       11.0
Mean of y           7.5
Variance of y       4.12
Correlation         0.816
Linear regression   y = 3 + 0.5x
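The shared summary statistics can be checked directly from the data listed above. The following is a minimal Python sketch using only the standard library; the names QUARTET, summarize and SUMMARIES are illustrative choices, not part of the original slides. It uses the sample variance (n - 1 denominator) and an ordinary least-squares fit of y on x.

```python
from math import sqrt

# Anscombe's four data sets, copied from the table above.
X123 = [10.0, 8.0, 13.0, 9.0, 11.0, 14.0, 6.0, 4.0, 12.0, 7.0, 5.0]
QUARTET = {
    1: (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: ([8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 19.0, 8.0, 8.0, 8.0],
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return means, sample variances, Pearson correlation and OLS line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "var_x": sxx / (n - 1),
        "mean_y": mean_y,
        "var_y": syy / (n - 1),
        "corr": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


SUMMARIES = {k: summarize(xs, ys) for k, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for k, s in SUMMARIES.items():
        print(k, {name: round(value, 3) for name, value in s.items()})
```

All four sets agree with the Property/Value table to the precision shown there, which is the point of the example: the numbers match, and only a plot reveals how different the four sets are.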
Temporal Data: TimeSearcher 1.3
• Time series
  - Stocks
  - Weather
  - Genes
• User-specified patterns
• Rapid search
Temporal Data: TimeSearcher 2.0
• Long time series (>10,000 time points)
• Multiple variables
• Controlled precision in match (linear, offset, noise, amplitude)
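One way to picture pattern search with controlled precision is a sliding-window comparison. The Python sketch below is an illustration only and is not TimeSearcher's actual algorithm: normalize, find_matches and the tolerance, offset and amplitude parameters are hypothetical names. Here offset=True ignores vertical shifts, amplitude=True ignores vertical scaling, and tolerance bounds the per-point deviation allowed for noise.

```python
def normalize(window, offset=False, amplitude=False):
    """Optionally remove the mean (offset) and rescale to unit peak (amplitude)."""
    values = list(window)
    if offset:
        mean = sum(values) / len(values)
        values = [v - mean for v in values]
    if amplitude:
        peak = max(abs(v) for v in values)
        if peak > 0:
            values = [v / peak for v in values]
    return values


def find_matches(series, pattern, tolerance, offset=False, amplitude=False):
    """Return start indices where the series matches the pattern point by point.

    A window matches when every normalized point lies within `tolerance`
    of the corresponding normalized pattern point.
    """
    target = normalize(pattern, offset, amplitude)
    width = len(pattern)
    hits = []
    for start in range(len(series) - width + 1):
        candidate = normalize(series[start:start + width], offset, amplitude)
        if max(abs(a - b) for a, b in zip(candidate, target)) <= tolerance:
            hits.append(start)
    return hits


# Hypothetical series: the same bump at three offsets and scales.
SERIES = [1, 2, 3, 2, 1, 11, 12, 13, 12, 11, 2, 4, 6, 4, 2]
PATTERN = [1, 2, 3, 2, 1]

if __name__ == "__main__":
    print(find_matches(SERIES, PATTERN, 0.1))
    print(find_matches(SERIES, PATTERN, 0.1, offset=True))
    print(find_matches(SERIES, PATTERN, 0.1, offset=True, amplitude=True))
```

Loosening each control widens the result set: the exact search finds only the first bump, offset invariance adds the shifted copy, and amplitude invariance adds the scaled copy.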
LifeLines: Patient Histories
www.cs.umd.edu/hcil/lifelines
LifeLines2: Contrast + Creatinine
LifeLines2: Align-Rank-Filter & Summarize
LifeFlow: Aggregation Strategy
[Figure: temporal categorical data (4 records) in LifeLines2 format; tree of event sequences; LifeFlow aggregation]
www.cs.umd.edu/hcil/lifeflow
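The aggregation step, from individual records to a tree of event sequences, can be sketched as a prefix tree with counts. This Python sketch is an assumption-laden illustration rather than LifeFlow's implementation: the four records and their event names are hypothetical, events are assumed to be already sorted by time, and the time gaps between events are left out.

```python
def new_node():
    """A tree node: how many records pass through it, plus child events."""
    return {"count": 0, "children": {}}


def build_sequence_tree(records):
    """Merge event sequences that share a prefix into one counted tree."""
    root = new_node()
    for events in records:
        node = root
        node["count"] += 1
        for event in events:
            node = node["children"].setdefault(event, new_node())
            node["count"] += 1
    return root


def outline(node, depth=0):
    """Render the tree as indented 'event (count)' lines."""
    lines = []
    for event, child in node["children"].items():
        lines.append("  " * depth + f"{event} ({child['count']})")
        lines.extend(outline(child, depth + 1))
    return lines


# Four hypothetical patient records (event names are illustrative only).
RECORDS = [
    ["Arrival", "ER", "ICU", "Exit"],
    ["Arrival", "ER", "Floor", "Exit"],
    ["Arrival", "ER", "ICU", "Floor"],
    ["Arrival", "Floor"],
]
TREE = build_sequence_tree(RECORDS)

if __name__ == "__main__":
    print("\n".join(outline(TREE)))
```

Each node's count is the number of records that share that prefix, which is the quantity an aggregated display can map to bar height.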
LifeFlow: Interface with User Controls
Treemap: Gene Ontology
+ Space filling
+ Space limited
+ Color coding
+ Size coding
- Requires learning
(Shneiderman, ACM Trans. on Graphics, 1992 & 2003)
www.cs.umd.edu/hcil/treemap/
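Space filling and size coding can be illustrated with a slice-and-dice layout, which splits a rectangle among children in proportion to their sizes and alternates the split direction at each level. The Python sketch below is a minimal version under stated assumptions: the dictionary node format ("name", "size", "children"), the function names and DEMO_TREE are hypothetical, and all leaf sizes are assumed positive. Color coding and the other layout algorithms used by treemap tools are not shown.

```python
def total(node):
    """Size of a node: its own size for a leaf, else the sum of its children."""
    children = node.get("children")
    if children:
        return sum(total(child) for child in children)
    return node["size"]


def slice_and_dice(node, x, y, w, h, depth=0, out=None):
    """Return (name, x, y, w, h) rectangles for every leaf.

    Even depths place children side by side (dividing the width);
    odd depths stack them (dividing the height).
    Assumes every leaf has a positive size.
    """
    if out is None:
        out = []
    children = node.get("children")
    if not children:
        out.append((node["name"], x, y, w, h))
        return out
    whole = total(node)
    offset = 0.0  # fraction of the parent already used
    for child in children:
        share = total(child) / whole
        if depth % 2 == 0:
            slice_and_dice(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:
            slice_and_dice(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out


# Hypothetical hierarchy: leaf A, and group B holding leaves B1 and B2.
DEMO_TREE = {
    "name": "root",
    "children": [
        {"name": "A", "size": 6},
        {"name": "B", "children": [
            {"name": "B1", "size": 3},
            {"name": "B2", "size": 1},
        ]},
    ],
}

if __name__ == "__main__":
    for rect in slice_and_dice(DEMO_TREE, 0.0, 0.0, 100.0, 50.0):
        print(rect)
```

Every leaf receives an area proportional to its size and the leaves exactly tile the parent rectangle, so no display space is left unused.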
Treemap: Smartmoney MarketMap
www.smartmoney.com/marketmap
Market falls steeply Feb 27, 2007, with one exception
Market falls steeply Sept 22, 2011, some exceptions
Market mixed, February 8, 2008
Energy & Technology up, Financial & Health Care down
Market rises, September 1, 2010, Gold contrarians
Market rises, March 21, 2011, Sprint declines
Treemap: Newsmap (Marcos Weskamp)
newsmap.jp
Treemap: Supply Chain
www.hivegroup.com
Treemap: Spotfire Bond Portfolio Analysis
www.spotfire.com
Treemap: NY Times – Car & Truck Sales
www.cs.umd.edu/hcil/treemap/
Treemap (Voronoi): NY Times - Inflation
www.nytimes.com/interactive/2008/05/03/business/20080403_SPENDING_GRAPHIC.html
State-of-the-art network visualization
Network from Database Tables
www.centrifugesystems.com
Discovery Process: Systematic Yet Flexible
Preparation
• Own the problem & define the schedule
• Data cleaning & conditioning
• Handle missing & uncertain data
• Extract subsets & link to related information
SocialAction
• Integrates statistics & visualization
• 4 case studies, 4-8 weeks (journalist, bibliometrician, terrorist analyst, organizational analyst)
• Identified desired features, gave strong positive feedback about benefits of integration
www.cs.umd.edu/hcil/socialaction
Perer & Shneiderman, CHI2008, IEEE CG&A 2009
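As a small illustration of pairing a network statistic with a display, the Python sketch below ranks nodes by degree so that a view could size, color or filter nodes by rank. It is not SocialAction's code: degree_ranking and the edge list are hypothetical, and degree is only one of many possible statistics.

```python
from collections import defaultdict


def degree_ranking(edges):
    """Rank nodes of an undirected graph by number of distinct neighbors.

    Self-loops are ignored and duplicate edges count once.
    Ties are broken alphabetically by node name.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)
    ranked = [(node, len(adjacent)) for node, adjacent in neighbors.items()]
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical edge list (names are illustrative only).
EDGES = [
    ("ann", "bob"),
    ("ann", "cat"),
    ("ann", "dan"),
    ("bob", "cat"),
    ("dan", "eve"),
]

if __name__ == "__main__":
    for node, degree in degree_ranking(EDGES):
        print(node, degree)
```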
Footprints of Human Activity
• Footprints in sand at Caesarea
NodeXL: Network Overview for Discovery & Exploration in Excel
www.codeplex.com/nodexl
NodeXL: Import Dialogs
www.codeplex.com/nodexl
Tweets at #WIN09 Conference: 2 groups
WWW2010 Twitter Community
WWW2011 Twitter Community: Grouped
CHI2010 Twitter Community
www.codeplex.com/nodexl/
Flickr clusters for “mouse”: Computer, Mickey, Animal
Flickr networks
‘GOP’ tweets, clustered (red-Republicans)
Innovation Clusters: People, Locations, Companies
[Figure labels: No Location, Philadelphia, Pittsburgh Metro, Westinghouse Electric, Pharmaceutical/Medical, Patent, Tech, Navy, SBIR (federal), PA DCED (state), Related patent]
[Figure legend: 2: Federal agency; 3: Enterprise; 5: Inventors; 9: Universities; 10: PA DCED; 11/12: Phil/Pitt metro cnty; 13-15: Semi-rural/rural cnty; 17: Foreign countries; 19: Other states]
Analyzing Social Media Networks with NodeXL
I. Getting Started with Analyzing Social Media Networks
1. Introduction to Social Media and Social Networks
2. Social Media: New Technologies of Collaboration
3. Social Network Analysis
II. NodeXL Tutorial: Learning by Doing
4. Layout, Visual Design & Labeling
5. Calculating & Visualizing Network Metrics
6. Preparing Data & Filtering
7. Clustering & Grouping
III. Social Media Network Analysis Case Studies
8. Email
9. Threaded Networks
10. Twitter
11. Facebook
12. WWW
13. Flickr
14. YouTube
15. Wiki Networks
www.elsevier.com/wps/find/bookdescription.cws_home/723354/description
Social Media Research Foundation
Researchers who want to
- create open tools
- generate & host open data
- support open scholarship
Map, measure & understand social media
Support tool projects to collect, analyze & visualize social media data.
smrfoundation.org
UN Millennium Development Goals
To be achieved by 2015
• Eradicate extreme poverty and hunger
• Achieve universal primary education
• Promote gender equality and empower women
• Reduce child mortality
• Improve maternal health
• Combat HIV/AIDS, malaria and other diseases
• Ensure environmental sustainability
• Develop a global partnership for development
29th Annual Symposium
May 23-24, 2012
www.cs.umd.edu/hcil
For More Information
• Visit the HCIL website for 400 papers & info on videos: www.cs.umd.edu/hcil
• Conferences & resources: www.infovis.org
• See Chapter 14 on Info Visualization
  Shneiderman, B. and Plaisant, C., Designing the User Interface: Strategies for Effective Human-Computer Interaction: Fifth Edition (2010) www.awl.com/DTUI
• Edited Collections:
  - Card, S., Mackinlay, J., and Shneiderman, B. (1999) Readings in Information Visualization: Using Vision to Think
  - Bederson, B. and Shneiderman, B. (2003) The Craft of Information Visualization: Readings and Reflections
For More Information
• Treemaps
  - HiveGroup: www.hivegroup.com
  - Smartmoney: www.smartmoney.com/marketmap
  - HCIL Treemap 4.0: www.cs.umd.edu/hcil/treemap
• Spotfire: www.spotfire.com
• TimeSearcher: www.cs.umd.edu/hcil/timesearcher
• NodeXL: nodexl.codeplex.com
• Hierarchical Clustering Explorer: www.cs.umd.edu/hcil/hce
• LifeLines2: www.cs.umd.edu/hcil/lifelines2
• Similan: www.cs.umd.edu/hcil/similan