Measuring and Improving the Readability of Network Visualizations Cody Dunne

cdunne@cs.umd.edu
ORNL – March 11, 2013
The Data
Problem
Why Visualization?
Anscombe’s Quartet
     I               II              III             IV
    x      y        x      y        x      y        x      y
10.00   8.04    10.00   9.14    10.00   7.46     8.00   6.58
 8.00   6.95     8.00   8.14     8.00   6.77     8.00   5.76
13.00   7.58    13.00   8.74    13.00  12.74     8.00   7.71
 9.00   8.81     9.00   8.77     9.00   7.11     8.00   8.84
11.00   8.33    11.00   9.26    11.00   7.81     8.00   8.47
14.00   9.96    14.00   8.10    14.00   8.84     8.00   7.04
 6.00   7.24     6.00   6.13     6.00   6.08     8.00   5.25
 4.00   4.26     4.00   3.10     4.00   5.39    19.00  12.50
12.00  10.84    12.00   9.13    12.00   8.15     8.00   5.56
 7.00   4.82     7.00   7.26     7.00   6.42     8.00   7.91
 5.00   5.68     5.00   4.74     5.00   5.73     8.00   6.89
Anscombe’s Quartet - Statistics
Property                                    Value               Equality
Mean of x in each case                      9                   Exact
Variance of x in each case                  11                  Exact
Mean of y in each case                      7.50                To 2 decimal places
Variance of y in each case                  4.122 or 4.127      To 3 decimal places
Correlation between x and y in each case    0.816               To 3 decimal places
Linear regression line in each case         y = 3.00 + 0.500x   To 2 and 3 decimal places, respectively
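The summary statistics for the quartet can be checked with a few lines of Python. This is a minimal sketch using only the standard library. The names `QUARTET` and `summarize` are illustrative and do not come from the talk. The data are the four x/y series from the Anscombe's Quartet slide.

```python
from math import sqrt

# The four datasets from the quartet slide; I, II and III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return means, sample variances, Pearson correlation and the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "var_x": sxx / (n - 1),
        "mean_y": mean_y,
        "var_y": syy / (n - 1),
        "corr": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


for name, (xs, ys) in QUARTET.items():
    stats = summarize(xs, ys)
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

All four datasets print nearly identical numbers, which is the point of the example: the differences only become visible once the data are plotted.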
Anscombe’s Quartet - Scatterplots
Networks!
Edge List
Node 1   Node 2
Alice    Bob
Alice    Cathy
Cathy    Alice

Adjacency Matrix
         Alice   Bob   Cathy
Alice    0       1     1
Bob      0       0     0
Cathy    1       0     0
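The two representations on this slide carry the same information. As a minimal sketch, the conversion from an edge list to an adjacency matrix can be written as follows. The function name `adjacency_matrix` is illustrative. Edges are treated as directed (row = Node 1, column = Node 2), which is an assumption based on the asymmetric matrix on the slide.

```python
def adjacency_matrix(edges, nodes=None):
    """Build a 0/1 adjacency matrix from a list of (source, target) pairs.

    Rows are sources and columns are targets. If `nodes` is not given,
    the node order is the sorted set of names that appear in `edges`.
    """
    if nodes is None:
        nodes = sorted({name for edge in edges for name in edge})
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
    return nodes, matrix


edges = [("Alice", "Bob"), ("Alice", "Cathy"), ("Cathy", "Alice")]
nodes, matrix = adjacency_matrix(edges)
for name, row in zip(nodes, matrix):
    print(f"{name:<6}", row)
```

The edge list grows with the number of edges, while the matrix grows with the square of the number of nodes, so the edge list is usually the more compact form for sparse networks.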
Tweets of the #Win09 Workshop
#     User 1            User 2
1     20andlife         barrywellman
2     20andlife         BrianDavidson
3     barrywellman      elizabethmdaly
4     barrywellman      informor
5     BrianDavidson     hcraygliangjie
6     BrianDavidson     informor
7     BrianDavidson     NetSciWestPoint
8     byaber            barrywellman
9     byaber            danielequercia
10    byaber            mcscharf
11    chrisnordyke      RebeccaBadger
12    danevans87        barrywellman
13    danevans87        BrianDavidson
14    danevans87        drewconway
15    danevans87        informor
16    danevans87        NetSciWestPoint
17    danielequercia    BrianDavidson
18    danielequercia    drewconway
19    danielequercia    ipeirotis
20    danielequercia    johnflurry
21    danielequercia    loyan
22    danielequercia    loyan
23    danielequercia    mcscharf
24    danielequercia    NetSciWestPoint
…     …                 …
106   sechrest          Japportreport
107   sechrest          loyan
108   sechrest          RebeccaBadger
Tweets of the #Win09 Workshop
Who Uses Network Analysis
Sociology
Scientometrics
Biology
Urban Planning
Politics
Archaeology
WWW
Some of my work…
NodeXL (Smith et al., 2009; Dunne & Shneiderman, 2013; +5)
GraphTrail (Dunne et al., 2012; Riche et al., 2011)
STICK (Shneiderman et al., 2011; Gove et al., 2011)
Action Science Explorer (Dunne et al., 2012; Gove et al., 2011)
NetGrok (Blue et al., 2008)
smrfoundation.org
NodeXL
Collect data, Excel analysis, statistics, visualization, layout algorithms, filtering, clustering, attribute mapping…
NodeXL Graph Gallery
NodeXL as a Teaching Tool
I. Getting Started with Analyzing Social Media Networks
1. Introduction to Social Media and Social Networks
2. Social Media: New Technologies of Collaboration
3. Social Network Analysis
II. NodeXL Tutorial: Learning by Doing
4. Layout, Visual Design & Labeling
5. Calculating & Visualizing Network Metrics
6. Preparing Data & Filtering
7. Clustering & Grouping
III. Social Media Network Analysis Case Studies
8. Email
9. Threaded Networks
10. Twitter
11. Facebook
12. WWW
13. Flickr
14. YouTube
15. Wiki Networks
http://www.elsevier.com/wps/find/bookdescription.cws_home/723354/description
NodeXL as a Research Tool
Bonsignore EM, Dunne C, Rotman D, Smith M, Capone T, Hansen DL and Shneiderman B (2009), "First steps to NetViz Nirvana: Evaluating social network analysis with NodeXL", In CSE '09. pp. 332-339. DOI:10.1109/CSE.2009.120
Mohammad S, Dunne C and Dorr B (2009), "Generating high-coverage semantic orientation lexicons from overtly marked words and a thesaurus", In EMNLP '09. pp. 599-608.
Smith M, Shneiderman B, Milic-Frayling N, Rodrigues EM, Barash V, Dunne C, Capone T, Perer A and Gleave E (2009), "Analyzing (social media) networks with NodeXL", In C&T '09. pp. 255-264. DOI:10.1145/1556460.1556497
Research in NodeXL
Node-Link Visualization is Hard
Alternate visualizations...
Gove et al., 2011
Henry & Fekete, 2006
Freire et al., 2010
Dunne et al., 2012
Wattenberg, 2006
Better Layouts…
Hachul & Jünger, 2006
Plan of attack
Readability metrics
• Global/local
• Taxonomy/layout aids
Motif simplifications
Meta-layouts
Evaluations
• Readability metrics
• User studies
Readability Metrics
Why measure readability?
Lee et al., 2003
Measuring Readability
Simple rules or heuristics
Davidson & Harel, 1996
User performance
Huang et al., 2007
Global readability metrics
Purchase, 2002
Source: Sugiyama, 2002, p. 14
Global Readability Metrics
• How understandable is the network drawing?
• Example: a journal may suggest
  • 0% node occlusion
  • <2% edge tunneling
  • <5% edge crossing
E.g., Node Overlap
Global readability metric: a value in [0,1], where
0 = Complete overlap
1 = No overlap
Node readability metric: ratio of node area that overlaps other nodes
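One way to make the node overlap definitions concrete is sketched below. This is a simplified illustration, not the formula used in NodeXL. Nodes are assumed to be axis-aligned rectangles. The per-node value sums pairwise intersection areas and caps the ratio at 1, so regions covered by several nodes are counted more than once. The global value is taken as 1 minus the mean per-node ratio. The names `Box`, `local_node_overlap` and `global_node_overlap` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned node rectangle: lower-left corner (x, y), width w, height h."""
    x: float
    y: float
    w: float
    h: float


def intersection_area(a, b):
    """Area shared by two boxes, or 0.0 if they do not overlap."""
    dx = min(a.x + a.w, b.x + b.w) - max(a.x, b.x)
    dy = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    return dx * dy if dx > 0 and dy > 0 else 0.0


def local_node_overlap(i, boxes):
    """Approximate ratio of node i's area that overlaps other nodes, in [0, 1]."""
    me = boxes[i]
    covered = sum(intersection_area(me, other)
                  for j, other in enumerate(boxes) if j != i)
    return min(1.0, covered / (me.w * me.h))


def global_node_overlap(boxes):
    """Global score in [0, 1]: 0 = complete overlap, 1 = no overlap (assumed form)."""
    if not boxes:
        return 1.0
    mean_ratio = sum(local_node_overlap(i, boxes)
                     for i in range(len(boxes))) / len(boxes)
    return 1.0 - mean_ratio


layout = [Box(0, 0, 2, 2), Box(1, 0, 2, 2), Box(10, 10, 1, 1)]
print([local_node_overlap(i, layout) for i in range(len(layout))])
print(global_node_overlap(layout))
```

The per-node values point to where the drawing needs work, while the global value summarizes the whole drawing in a single number.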
My metrics
New
Local
Node overlap
Edge tunnel
Drawing space used
Group overlap
Edge crossing
Angular resolution
Edge crossing angle
Existing metrics
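Of the metrics listed here, edge crossing is the easiest to sketch. The example below counts crossings between straight-line edges and normalizes the count to [0,1]. It assumes a simple graph (no duplicate edges or self-loops) drawn with straight edges. The normalization, which divides by the number of edge pairs that share no endpoint, is an assumption in the spirit of the global metrics cited in the talk rather than the exact published formula. All function names are illustrative.

```python
from itertools import combinations


def _orient(p, q, r):
    """Twice the signed area of triangle pqr; the sign gives the turn direction."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_cross(a, b, c, d):
    """True if segments ab and cd cross at a single interior point."""
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)


def edge_crossing_metric(pos, edges):
    """Return (crossings, score), with score in [0, 1] and 1 = no crossings.

    Edge pairs that share an endpoint cannot cross in a straight-line
    drawing, so they are excluded from the count and the denominator.
    """
    crossings = 0
    candidates = 0
    for (u, v), (s, t) in combinations(edges, 2):
        if len({u, v, s, t}) < 4:
            continue
        candidates += 1
        if segments_cross(pos[u], pos[v], pos[s], pos[t]):
            crossings += 1
    score = 1.0 if candidates == 0 else 1.0 - crossings / candidates
    return crossings, score


pos = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (0, 1)}
print(edge_crossing_metric(pos, [("a", "c"), ("b", "d")]))  # the two diagonals of a square
print(edge_crossing_metric(pos, [("a", "b"), ("c", "d")]))  # two opposite sides
```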
Assisted Manipulation
• Real-time ranking & coloring by metrics
14 edge tunnels
0 edge tunnels
Images: Cody Dunne
Discussion
• Raise awareness of readability issues
• Localized identification of where improvement is needed
• Optimization recommendations for tasks
• Interactive optimization
• Future optimization plans
Dunne C and Shneiderman B (2009), "Improving graph drawing readability by incorporating readability metrics: A software tool for network analysts". University of Maryland. Human-Computer Interaction Lab Tech Report No. HCIL-2009-13.
Motif Simplification
Lostpedia articles
Observations
1: There are repeating patterns in networks (motifs)
2: Motifs often dominate the visualization
3: Motif members can be functionally equivalent
Graph Summarization…
Navlakha et al., 2008
Motif Simplification
Fan Motif
2-Connector Motif
Lostpedia articles
Lostpedia articles
Glyph Design: Fan
Glyph Design: Connector
Cliques too!
Interactivity
Fan motif: 133 leaf vertices with head vertex “Theory”
Senate Co-Voting: 65% Agreement
Senate Co-Voting: 70% Agreement
Senate Co-Voting: 80% Agreement
Senate Co-Voting: 85% Agreement
Voson Web Crawl
Voson Web Crawl
Voson Web Crawl
User Impressions
“I’m overwhelmed, … this is like one of those vision tests at the eye doctor”
“Now I can see the central pages…[and] pairwise connections”
Controlled Experiment - General
• Maximal motifs:
  • 21s faster***, 68% more accurate***, 28% less size error (0%)***
• Estimating node count
  • 22s slower***, 39% less error***
• Finding plain labeled nodes
  • 20s faster**, 83% more accurate**
• Finding simplified labeled nodes
  • 15s slower*
Controlled Experiment - Topology
• Cut points
  • 35% more accurate**
• Path length
  • 10s slower*, 15% less error**/19% more error*
• Neighbor compare
  • 10s slower**, 19% less accurate*
• Shared neighbor count
  • 18s slower***, 11% more error*
Motif Detection Algorithms
• Fans
  • Straightforward
  • O(N * avg. neighbor count)
• Connectors
  • Handle overlapping connectors
  • O(N * avg. neighbor count)
• Cliques
  • Traditional clique-finding algorithms
  • Choice heuristics
  • O(3^(N/3))
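The fan and connector cases can be sketched in a few lines. This is a minimal illustration and not the NodeXL implementation. It assumes a fan is a head vertex with at least two neighbors of degree 1, and a connector is a set of at least two vertices that share exactly the same neighbor set of two or more anchors. The function names and the `min_leaves` and `min_span` thresholds are hypothetical. Both passes touch each vertex's neighbor list once, in line with the O(N * avg. neighbor count) bound above. Clique detection is left out.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets from (u, v) pairs; self-loops are ignored."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def find_fans(adj, min_leaves=2):
    """Map each head vertex to its leaf neighbors (neighbors of degree 1)."""
    fans = {}
    for head, neighbors in adj.items():
        leaves = {n for n in neighbors if len(adj[n]) == 1}
        if len(leaves) >= min_leaves:
            fans[head] = leaves
    return fans


def find_connectors(adj, min_span=2):
    """Map each anchor set to the span vertices whose neighbors are exactly that set."""
    groups = defaultdict(set)
    for vertex, neighbors in adj.items():
        if len(neighbors) >= 2:
            groups[frozenset(neighbors)].add(vertex)
    return {anchors: span for anchors, span in groups.items()
            if len(span) >= min_span}


edges = [
    ("hub", "a"), ("hub", "b"), ("hub", "c"),            # fan around "hub"
    ("p", "s1"), ("q", "s1"), ("p", "s2"), ("q", "s2"),  # s1 and s2 both link p and q
    ("p", "hub"),
]
adj = build_adjacency(edges)
print(find_fans(adj))
print(find_connectors(adj))
```

Grouping by identical neighbor sets means each vertex lands in at most one connector. Connectors whose anchor sets only partly overlap would need the extra handling that the "overlapping connectors" bullet refers to.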
Discussion
• Motif simplification effective for
  • Reducing complexity
  • Understanding larger or hidden relationships
• However
  • Frequent motifs may not be covered
  • Glyph design has tradeoffs
  • May be challenging at first for some tasks
• Available now in NodeXL: nodexl.codeplex.com
Dunne C and Shneiderman B (2013), "Motif simplification: improving network visualization readability with fan, connector, and clique glyphs", In CHI '13.
Shneiderman B and Dunne C (2012), "Interactive network exploration to derive insights: Filtering, clustering, grouping, and simplification", In Graph Drawing '12. pp. 2-18. DOI:10.1007/978-3-642-36763-2_2
Meta-Layouts
Analogy: Clusters Are Occluded
Hard to count nodes, clusters
Separate Clusters Are More Comprehensible
Meta-Layouts
• Layout using groupings
  • Attributes
  • Topology
  • Manual
• Good for
  • Large or high density networks
  • Highlighting hidden relationships
  • Recursive nesting
Group-in-a-Box Meta-Layouts
• Squarified Treemap
  • See topology poorly, space-filling
• Fitted Rectangles
  • See topology better, slight space increase
• Force-Directed
  • See topology well, at the cost of space
Risk Movements - Plain Layout with Clusters
Risk Movements - GiB Treemap
GiB Fitted Rectangles: The Donut
GiB Fitted Rectangles: The Croissant
Risk Movements - GiB Fitted Rectangles (Croissant)
GiB Force-Directed
Risk Movements - GiB Force-Directed
Pennsylvania Innovation
Pennsylvania Innovation - GiB Treemap
Pennsylvania Innovation - GiB Fitted Rectangles
Pennsylvania Innovation - GiB Force-Directed
GiB Force-Directed: Algorithm
• Start with initial area usage (20%-50%)
• Generate initial positions
  • Harel & Koren, 2002
  • Better to use meta-edge weights
• Remove overlaps
  • Gansner & Hu, 2009
  • Minimize space used
  • Retain layout structure
• Scale the new layout to fit
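The cited layout and overlap-removal algorithms are too involved for a short example, but the final step, scaling the layout to fit, can be sketched briefly. The code below assumes group boxes are given as `(x, y, w, h)` tuples and that a uniform scale is wanted so that relative positions and aspect ratios are preserved. The function name `scale_to_fit` and its parameters are hypothetical.

```python
def scale_to_fit(boxes, width, height):
    """Uniformly scale and translate (x, y, w, h) boxes into a width x height canvas.

    The bounding box of all input boxes is moved to the origin and scaled
    by a single factor, so no box is distorted and no new overlaps appear.
    """
    if not boxes:
        return []
    min_x = min(x for x, _, _, _ in boxes)
    min_y = min(y for _, y, _, _ in boxes)
    max_x = max(x + w for x, _, w, _ in boxes)
    max_y = max(y + h for _, y, _, h in boxes)
    span_x = max_x - min_x
    span_y = max_y - min_y
    if span_x <= 0 or span_y <= 0:
        raise ValueError("boxes must cover a non-empty area")
    scale = min(width / span_x, height / span_y)
    return [((x - min_x) * scale, (y - min_y) * scale, w * scale, h * scale)
            for x, y, w, h in boxes]


group_boxes = [(0, 0, 10, 10), (30, 10, 10, 10)]
print(scale_to_fit(group_boxes, 100, 100))
```

A single scale factor keeps the structure produced by the earlier steps intact. The tradeoff is that one canvas dimension may be left partly unused.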
Force-Directed GiB - Box Initial Positions
Force-Directed GiB - Overlap Removal, 20% Originally Filled
Force-Directed GiB - Overlap Removal, 50% Originally Filled
Putting It All Together
Layout depends on task requirements: space-filling vs. showing relationships
• Treemap
• Fitted Rectangles
• Force-directed
Automatic choices:
• Disconnected components
  • Treemap outer layout
  • Nested GiB layouts
• Two groups: Treemap
• Fitted rectangles
  • Donut for a few large groups
  • Croissant for more evenly distributed groups
Empirical Evaluation
• Compare techniques on 3564+ Twitter networks
• Measure readability metrics
• Edges crossing boxes unnecessarily
• Forthcoming results…
Discussion
• Three Group-in-a-Box layouts for dissecting networks
• Improved group and overview visualization
• Tradeoffs: Filling space vs. showing relationships
• Available in NodeXL: nodexl.codeplex.com
  • Treemap: Available now!
  • Force-Directed & Fitted Rectangles: ~6 weeks
• Real-world application
Shneiderman B and Dunne C (2012), "Interactive network exploration to derive insights: Filtering, clustering, grouping, and simplification", In Graph Drawing '12. pp. 2-18. DOI:10.1007/978-3-642-36763-2_2
Dunne C, Chaturvedi S, Ashktorab Z, Zacharia R, and Shneiderman B (2013), "Fitted rectangles and force-directed group-in-a-box layouts for clustered network visualization", In preparation.
Rodrigues EM, Milic-Frayling N, Smith M, Shneiderman B, and Hansen DL (2011), "Group-in-a-Box layout for multi-faceted analysis of communities", In SocialCom '11. pp. 354-361. DOI:10.1109/PASSAT/SocialCom.2011.139
Better Node-Link Visualizations
Readability metrics
• Global/local
• Taxonomy/layout aids
Motif simplifications
Meta-layouts
Evaluations
• Readability metrics
• User studies
Some of my work…
NodeXL (Smith et al., 2009; Dunne & Shneiderman, 2013; +5)
GraphTrail (Dunne et al., 2012; Riche et al., 2011)
STICK (Shneiderman et al., 2011; Gove et al., 2011)
Action Science Explorer (Dunne et al., 2012; Gove et al., 2011)
NetGrok (Blue et al., 2008)
Future Plans
Future of Readability Metrics:
Multi-Criteria Optimization
• User-defined energy function
  • Interactive view of task-by-metric taxonomy
• Simulated annealing
  • Metropolis et al., 1953; Kirkpatrick et al., 1983
  • Searches layout space
  • Hill climbing
• Expensive, but valuable especially for static images
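The idea of searching layout space against a user-defined energy function can be sketched as a generic simulated annealing loop. This is only an illustration of the technique, not a planned NodeXL feature. The function `anneal_layout`, its parameters, and the toy `edge_length_energy` function (which penalizes edges whose length differs from 1) are all hypothetical. In practice the energy would be a weighted combination of readability metrics.

```python
import math
import random


def anneal_layout(pos, energy, steps=2000, t0=1.0, cooling=0.995,
                  step_size=0.1, seed=0):
    """Minimize energy(pos) by moving one node at a time.

    Worse moves are accepted with probability exp(-delta / temperature),
    which lets the search escape local minima early on. Returns the best
    layout seen and its energy; the input dict is not modified.
    """
    rng = random.Random(seed)
    current = dict(pos)
    nodes = list(current)
    current_energy = energy(current)
    best, best_energy = dict(current), current_energy
    temperature = t0
    if not nodes:
        return best, best_energy
    for _ in range(steps):
        node = rng.choice(nodes)
        old = current[node]
        current[node] = (old[0] + rng.uniform(-step_size, step_size),
                         old[1] + rng.uniform(-step_size, step_size))
        new_energy = energy(current)
        delta = new_energy - current_energy
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current_energy = new_energy
            if current_energy < best_energy:
                best, best_energy = dict(current), current_energy
        else:
            current[node] = old  # reject the move
        temperature *= cooling
    return best, best_energy


EDGES = [("a", "b"), ("b", "c"), ("a", "c")]


def edge_length_energy(pos):
    """Toy energy: squared deviation of every edge length from 1."""
    return sum((math.dist(pos[u], pos[v]) - 1.0) ** 2 for u, v in EDGES)


start = {"a": (0.0, 0.0), "b": (0.1, 0.0), "c": (0.0, 0.1)}
best, best_energy = anneal_layout(start, edge_length_energy)
print(edge_length_energy(start), best_energy)
```

Every step re-evaluates the full energy function, which is why this kind of search is expensive and better suited to static images than to interactive use.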
Future: Network Overviews
• Identify high-level structures
  • Motifs
  • Clusters
  • Network backbone
• Ease display, especially online
  • Semantic zooming
  • Interactivity
  • Glyphs
Wong et al., 2008
Future: Network Evolution
• Line charts
• Dynamic filters
• Time bins
• Heatmap slices
Future: Medical Records
• Connections between patients and concepts
• Fast, approximate analyses
• Continuous, random stream of records
• Clustering
• Uncertainty visualization
Funders & Collaborators
Funding
• NSF grants SBE 0915645, IIS 0705832, IIS 0968521
• HHS SHARP grant 10510592
• Social Media Research Foundation, Connected Action Consulting Group,
Microsoft External Research, Microsoft Research, National Cancer
Institute
Co-Authors
• Ben Shneiderman, Marc Smith, Snigdha Chaturvedi, Zahra Ashktorab,
Rajan Zacharia, Tony Capone, Eduarda Mendes Rodrigues, Natasa Milic-Frayling, Nathalie Riche, Bongshin Lee, Ron Metoyer, George Robertson,
Robert Gove, Bonnie Dorr, Judith Klavans, Saif Mohammad, Puneet
Sharma, Ping Wang, Awalin Sopan, Nick Gramsky, Rose Kirby, Emre Sefer,
Meirav Taieb-Maimon, Vladimir Barash, Adam Perer, Eric Gleave, Derek
Hansen, Elizabeth Bonsignore, Dana Rotman, Ryan Blue, Adam Fuchs, Kyle
King, and Aaron Schulman
Collaborators
• Catherine Plaisant, Jon Froehlich, Leah Findlater, Yiyan Liu
Take Away Messages
Create effective node-link visualizations in NodeXL:
• Readability metrics to guide improvements
• Motif simplification to reduce complexity
• Meta-layouts to more clearly show ties and groups
Cody Dunne
cdunne@cs.umd.edu
www.cs.umd.edu/~cdunne/