Expanding and Evaluating Network Motif Simplification

Cody Dunne and Ben Shneiderman
{cdunne, ben}@cs.umd.edu
30th Annual Human-Computer Interaction Lab Symposium, May 22–23, 2013, College Park, MD
Who Uses Network Analysis
Sociology
Scientometrics
Biology
Urban Planning
Politics
Archaeology
WWW
Network Analysis is Hard
• Better Layouts: Hachul & Jünger, 2006
• Alternate Visualizations: Gove et al., 2011; Henry & Fekete, 2006; Dunne et al., 2012; Freire et al., 2010; Wattenberg, 2006
• Graph Summarization: Navlakha et al., 2008
Lostpedia articles
Observations
1: There are repeating patterns in networks (motifs)
2: Motifs often dominate the visualization
3: Motif members can be functionally equivalent
Motif Simplification
Fan Motif
2-Connector Motif
Lostpedia articles
Lostpedia articles
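To make the simplification step concrete, here is a minimal sketch of collapsing a fan into a single glyph vertex. The adjacency-dict representation, the `collapse_fan` name, and the glyph label format are assumptions for illustration only; this is not NodeXL's implementation.

```python
def collapse_fan(adj, head, leaves):
    """Return a new adjacency dict in which the fan's leaves become one glyph vertex."""
    leaves = set(leaves)
    glyph = f"fan({head}, n={len(leaves)})"  # hypothetical glyph label
    simplified = {}
    for v, nbrs in adj.items():
        if v in leaves:
            continue  # leaves are absorbed into the glyph
        simplified[v] = {n for n in nbrs if n not in leaves}
    # The glyph stands in for all leaves and connects only to the head.
    simplified[head].add(glyph)
    simplified[glyph] = {head}
    return simplified


# Toy network: "Theory" is the head of a three-leaf fan and also links to "x".
adj = {
    "Theory": {"a", "b", "c", "x"},
    "a": {"Theory"}, "b": {"Theory"}, "c": {"Theory"},
    "x": {"Theory", "y"},
    "y": {"x"},
}
simplified = collapse_fan(adj, "Theory", ["a", "b", "c"])
```

The six-vertex toy network shrinks to four vertices, and the glyph label keeps the leaf count so the size of the hidden fan is still visible.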
Glyph Design: Fan
Glyph Design: Connector
Glyph Design: General D-Connector
Connector Variants
Cliques too!
Senate Co-Voting: 65% Agreement
Senate Co-Voting: 70% Agreement
Senate Co-Voting: 80% Agreement
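The three co-voting views differ only in the agreement threshold. The sketch below assumes an edge links two senators whose votes agree at least the stated fraction of the time; the pair names and agreement rates are made up for illustration and are not the Senate data.

```python
def threshold_edges(agreement, threshold):
    """Keep only the pairs whose agreement rate is at least the threshold."""
    return {pair for pair, rate in agreement.items() if rate >= threshold}


# Hypothetical agreement rates between pairs of senators.
agreement = {
    ("A", "B"): 0.92,
    ("A", "C"): 0.68,
    ("B", "C"): 0.74,
    ("C", "D"): 0.81,
}

for t in (0.65, 0.70, 0.80):
    print(f"{t:.0%}: {sorted(threshold_edges(agreement, t))}")
```

Raising the threshold removes weaker ties, so the network gets sparser and tightly agreeing groups stand out.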
Interactivity
Fan motif: 133 leaf vertices with head vertex “Theory”
Interactivity in NodeXL
Motif Detection Algorithms
• Fans
• O(N)
• Connectors
• Handle overlapping connectors
• O(N * avg. neighbor count)
• Cliques
• Overlapping clique-finding algorithms
• Tomita et al., 2006; Eppstein & Strash, 2011
• Choice heuristics
• O(3^(N/3))
Controlled Experiment
[Figure: bar charts comparing the Plain and MS (motif simplification) conditions on the Wiki, Senate, and Web networks (Web1 and Web2 for some tasks). Vertical axes show Time (s), Accuracy (%), or Error (%); asterisks (*, **, ***) annotate individual comparisons. Panels: Fan Time, Fan Acc., Fan Error; Connector Time, Connector Acc., Connector Error; Clique Time, Clique Acc., Clique Error; Node Count Time, Node Count Error; Cutpoint Time, Cutpoint Accuracy; Plain Label Time, Plain Label Accuracy; Hidden Label Time, Hidden Label Accuracy; Path Time, Path Error; Nbr. Compare Time, Nbr. Compare Accuracy; Shared Nbr. Count Time, Shared Nbr. Count Error; Shared Nbr. Comp. Time, Shared Nbr. Comp. Accuracy.]
Motif Simplification Discussion
• Motif simplification effective for
• Reducing complexity
• Understanding larger or hidden relationships
• However
• Frequent motifs may not be covered
• Glyph design has tradeoffs
• May be challenging at first for some tasks
• Available now in NodeXL: nodexl.codeplex.com
Funders
• Social Media Research Foundation
• Connected Action Consulting Group
• National Science Foundation grant SBE 0915645
Motif Simplification Papers
• Dunne C & Shneiderman B (2013). Motif simplification: improving network visualization readability with fan, connector, and clique glyphs. In CHI '13.
• Shneiderman B & Dunne C (2012). Interactive network exploration to derive insights: Filtering, clustering, grouping, and simplification. In Graph Drawing '12.
Fan Motif Detection
• O(n)
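A single pass over the vertices is enough to find fans, which is consistent with the linear bound. The sketch below assumes an undirected network stored as a dict of neighbor sets and treats a fan as a head vertex with at least `min_leaves` degree-1 neighbors; the function name and the `min_leaves` default are illustrative assumptions, not the NodeXL code.

```python
from collections import defaultdict


def find_fans(adj, min_leaves=2):
    """Map each head vertex to its sorted list of leaf (degree-1) neighbors."""
    leaves_by_head = defaultdict(list)
    for v, nbrs in adj.items():
        if len(nbrs) == 1:          # v is a leaf
            (head,) = nbrs          # its only neighbor is the candidate head
            leaves_by_head[head].append(v)
    # Keep only heads with enough leaves to count as a fan.
    return {h: sorted(ls) for h, ls in leaves_by_head.items() if len(ls) >= min_leaves}


adj = {
    "h": {"a", "b", "c", "x"},
    "a": {"h"}, "b": {"h"}, "c": {"h"},
    "x": {"h", "y"},
    "y": {"x"},
}
fans = find_fans(adj)
```

Here `y` is a leaf of `x`, but one leaf does not meet the assumed minimum, so only the fan headed by `h` is reported.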
Connector Detection
• O(n * avg. degree)
D-min: 2
D-max: ∞
found:
• a, b -> c
• a, b -> c, d
• a, b -> c, d, e
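One way to reach the stated bound is to group vertices by their exact neighbor set, since hashing a neighbor set costs time proportional to the vertex's degree. The sketch below makes that assumption: the shared neighbor set plays the role of the anchors, the vertices sharing it form the span, and `d_min`/`d_max` bound the number of anchors. The names and the `min_span` parameter are hypothetical, and overlapping connectors are not resolved here.

```python
from collections import defaultdict


def find_connectors(adj, d_min=2, d_max=float("inf"), min_span=2):
    """Map each anchor set to the sorted span vertices that link exactly those anchors."""
    spans_by_anchors = defaultdict(list)
    for v, nbrs in adj.items():
        if d_min <= len(nbrs) <= d_max:
            spans_by_anchors[frozenset(nbrs)].append(v)
    return {
        anchors: sorted(span)
        for anchors, span in spans_by_anchors.items()
        if len(span) >= min_span
    }


# c, d, and e each link exactly the anchors a and b.
adj = {
    "a": {"c", "d", "e", "p"},
    "b": {"c", "d", "e", "q"},
    "c": {"a", "b"}, "d": {"a", "b"}, "e": {"a", "b"},
    "p": {"a"}, "q": {"b"},
}
connectors = find_connectors(adj)
```

If `a` and `b` had no other neighbors, this grouping would also report them as a span with anchors `c`, `d`, `e`, so a fuller implementation needs a rule for choosing between the two readings.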
Connector Problems
D-min: 2
D-max: ∞
found:
• a, b -> d
D-min: 2
D-max: ∞
found:
• a, b -> c, d, e
• c, d, e -> a, b
Connector Motif Detection (2)
Clique Motif Detection
• Find cliques using Tomita et al., 2006
• O(3^(N/3)) time
• High memory cost for matrix multiplication
• Alternatively use Eppstein & Strash, 2011
• Faster for large, sparse networks
• Slower for dense networks within constant factor
• Linear memory cost
• Greedy heuristic
• Choose largest non-overlapping cliques
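The two stages above can be sketched as follows: enumerate maximal cliques, then greedily keep the largest ones that share no vertices. The enumeration uses Bron-Kerbosch with pivoting in the spirit of Tomita et al., not their exact implementation, and the `min_size=4` cutoff and all function names are assumptions for illustration.

```python
from itertools import combinations


def maximal_cliques(adj):
    """Enumerate maximal cliques with Bron-Kerbosch and pivoting."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(r)  # r cannot be extended, so it is maximal
            return
        # Pivot on the vertex with the most neighbors among the candidates.
        pivot = max(p | x, key=lambda u: len(p & adj[u]))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def choose_cliques(cliques, min_size=4):
    """Greedily pick the largest cliques that do not overlap earlier picks."""
    chosen, used = [], set()
    for c in sorted(cliques, key=lambda c: (-len(c), sorted(c))):
        if len(c) >= min_size and not (c & used):
            chosen.append(c)
            used |= c
    return chosen


def build(groups):
    """Helper: build an adjacency dict in which each group is fully connected."""
    adj = {}
    for g in groups:
        for u, v in combinations(g, 2):
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return adj


# A 5-clique and a 4-clique overlap at "e"; a second 4-clique is separate.
adj = build(["abcde", "efgh", "ijkl"])
chosen = choose_cliques(maximal_cliques(adj))
```

The greedy pass takes the 5-clique first, skips the 4-clique that shares `e` with it, and keeps the separate 4-clique.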
Controlled Experiment
• Participants: 2 pilot, 36 main
• Data: The Wiki, Senate, and Web networks
• Two groups: control and motif simplification
• 31 questions
• 45 minutes
Controlled Experiment - Tasks
Based on Lee et al. 2006 taxonomy:
1. Node count: About how many nodes are in the network?
2. Cut point: Which individual node would we remove to disconnect the most nodes from the main network?
3. Largest motif & size: Which is the largest (fan | connector | clique) motif and how many nodes does it contain?
4. Labels: Which node has the label “XXX”?
5. Shortest path: What is the length of the shortest path between the two highlighted nodes?
6. Neighbors: Which of the two highlighted nodes has more neighbors?
7. Common Neighbors: How many common neighbors are shared by the two highlighted nodes?
8. Common Neighbors: Which of these two pairs of nodes has more common neighbors?
Controlled Experiment - Results
Task
Time
Accuracy
Error
Cliques
-20.8s***
0% vs -28%**(Senate) 0 vs. 3
Fan
-7.8s***
Connectors
-17.1s***
21.8s***
(Wiki,Web)
92% vs. 24%***
95% vs. 35%***
(Web)
79% vs. 6%***(Web)
Node Count
Label Plain
-19.9s**
Label Simp.
Cut point
15.3s*(Senate)
Path
10.1s*(Web)
More neighbors
Count shared neighbors
Compare shared neighbors
Skipped
2% vs -62%***
-5% vs. -17%*(Wiki)
-8% vs. -47%***
0 vs. 12 Wiki,
3 vs. 18 Web
98% vs. 15%**(Wiki,Web)
53% vs. 18%**(Web)
-7% vs. 22%**(Senate),
20% vs. 1%*(Web)
10.9s**(Wiki),
60% vs. 79%*(Senate,Web)
9.26*(Senate)
17.7s***
-21% vs. -10%* (Wiki)
(Web)
Quantifying Effectiveness
Senate Co-Voting: 85% Agreement
Senate Co-Voting: 90% Agreement
Senate Co-Voting: 95% Agreement
Voson Web Crawl