Watson Innovations Cognitive Visualization Lab

Readability metric feedback for aiding node-link visualization designers
Cody Dunne
August 10, 2015
Graph Summit 2015
cdunne@us.ibm.com
ibm.biz/cogvislab

Cody Dunne, PhD – Cognitive Visualization RSM
Web: ibm.biz/codydunne
Email: cdunne@us.ibm.com
Research areas:
• epidemiology/dynamic networks
• aggregation techniques
• literature exploration
• news term occurrence
• computer network traffic
• layout readability
• exploration provenance
• network type overviews
• group/set visualization

Watson Graph Readability

The team
• Daniel Weidele, Research Intern, CVL
• Mauro Martino, Manager, CVL
• Steve Ross, STSM, CVL
• Ben Shneiderman, Professor, UMD

Why Visualization? Anscombe’s quartet – Table

| I: x  | I: y  | II: x | II: y | III: x | III: y | IV: x | IV: y |
|-------|-------|-------|-------|--------|--------|-------|-------|
| 10.00 | 8.04  | 10.00 | 9.14  | 10.00  | 7.46   | 8.00  | 6.58  |
| 8.00  | 6.95  | 8.00  | 8.14  | 8.00   | 6.77   | 8.00  | 5.76  |
| 13.00 | 7.58  | 13.00 | 8.74  | 13.00  | 12.74  | 8.00  | 7.71  |
| 9.00  | 8.81  | 9.00  | 8.77  | 9.00   | 7.11   | 8.00  | 8.84  |
| 11.00 | 8.33  | 11.00 | 9.26  | 11.00  | 7.81   | 8.00  | 8.47  |
| 14.00 | 9.96  | 14.00 | 8.10  | 14.00  | 8.84   | 8.00  | 7.04  |
| 6.00  | 7.24  | 6.00  | 6.13  | 6.00   | 6.08   | 8.00  | 5.25  |
| 4.00  | 4.26  | 4.00  | 3.10  | 4.00   | 5.39   | 19.00 | 12.50 |
| 12.00 | 10.84 | 12.00 | 9.13  | 12.00  | 8.15   | 8.00  | 5.56  |
| 7.00  | 4.82  | 7.00  | 7.26  | 7.00   | 6.42   | 8.00  | 7.91  |
| 5.00  | 5.68  | 5.00  | 4.74  | 5.00   | 5.73   | 8.00  | 6.89  |

Why Visualization? Anscombe’s quartet – Statistics & Visualization

| Property in Each Case     | Value             | Equality                              |
|---------------------------|-------------------|---------------------------------------|
| Mean of x                 | 9                 | Exact                                 |
| Variance of x             | 11                | Exact                                 |
| Mean of y                 | 7.50              | 2 decimal places                      |
| Variance of y             | 4.122 or 4.127    | 3 decimal places                      |
| Correlation between x & y | 0.816             | 3 decimal places                      |
| Linear regression line    | y = 3.00 + 0.500x | 2 and 3 decimal places, respectively  |

Why Visualization? Tukey

No catalogue of techniques can convey a willingness to look for what can be seen, whether or not anticipated. Yet this is at the heart of exploratory data analysis. ... the picture-examining eye is the best finder we have of the wholly unanticipated.
– Tukey, 1980

Node-Link Graph Visualization

General
• Graph ≈ Network
• Node ≈ Vertex ≈ Entity
• Edge ≈ Link ≈ Relationship ≈ Tie

| Node 1 | Node 2 |
|--------|--------|
| Alice  | Bob    |
| Alice  | Cathy  |
| Cathy  | Alice  |

Comparing two popular layout algorithms
• D3.js Force Layout
• GraphViz SFDP

Immense variation in layout readability and speed
(Hachul & Jünger, 2006)

Evaluate, compare, and improve layouts
How much of the underlying network structure can you understand from a given layout?
[Figure panels: Node Overlap, Edge Crossings, Edge Crossing Angle]

Measuring Readability
• Simple rules or heuristics (Davidson & Harel, 1996)
• Global readability metrics (Purchase, 2002)
• User performance (Huang et al., 2007, etc.)
Source: Sugiyama, 2002, p. 14

Our metrics
[Diagram labels: New, Local, Existing metrics]
• Node overlap
• Edge tunnel
• Drawing space used
• Group overlap
• Distance Coherence
• Edge crossing
• Angular resolution
• Edge crossing angle
• Stress

Node Overlap RM

Global readability metric in [0,1], where 0 = complete overlap and 1 = no overlap:

\[ a = \operatorname{area}\Bigl(\bigcup_{n_j \in N} \operatorname{bounds}(n_j)\Bigr) \]
\[ a_{max} = \sum_{n_j \in N} \operatorname{area}\bigl(\operatorname{bounds}(n_j)\bigr) \]
\[ a_{min} = \max_{n_j \in N} \operatorname{area}\bigl(\operatorname{bounds}(n_j)\bigr) \]
\[ \mathrm{RM}_n = \operatorname{divOrZero}(a - a_{min},\ a_{max} - a_{min}) \]

Node readability metric, based on the ratio of a node's area that overlaps other nodes:

\[ a(n_j) = \operatorname{area}\Bigl(\operatorname{bounds}(n_j) \cap \bigcup_{n_k \in N,\ k \neq j} \operatorname{bounds}(n_k)\Bigr) \]
\[ a_{max}(n_j) = \operatorname{area}\bigl(\operatorname{bounds}(n_j)\bigr) \]
\[ \mathrm{RM}_n(n_j) = 1 - \operatorname{divOrZero}\bigl(a(n_j),\ a_{max}(n_j)\bigr) \]

Edge Crossing RM

Global readability metric in [0,1], where 0 = all possible crossings and 1 = no crossings. Here m is the number of edges and c is the number of crossings in the layout:

\[ c_{all} = \frac{m(m-1)}{2} \]
\[ c_{impossible} = \frac{1}{2} \sum_{n_j \in N} \deg(n_j)\bigl(\deg(n_j) - 1\bigr) \]
\[ c_{max} = c_{all} - c_{impossible} \]
\[ \mathrm{RM}_c = 1 - \operatorname{divOrZero}(c,\ c_{max}) \]

Edge readability metric, defined just like the global RM:

\[ c_{all}(e_i) = m - 1 \]
\[ c_{impossible}(e_i) = \deg(\operatorname{src}(e_i)) + \deg(\operatorname{tar}(e_i)) - 2 \]
\[ c_{max}(e_i) = c_{all}(e_i) - c_{impossible}(e_i) = m - \deg(\operatorname{src}(e_i)) - \deg(\operatorname{tar}(e_i)) + 1 \]
\[ \mathrm{RM}_c(e_i) = 1 - \operatorname{divOrZero}\bigl(c(e_i),\ c_{max}(e_i)\bigr) \]

Edge Crossing RM (continued)

Node readability metric in [0,1], where 0 = all possible crossings and 1 = no crossings:

\[ c(n_j) = \sum_{e_i \in \operatorname{edges}(n_j)} c(e_i) \]
\[ c_{max}(n_j) = \sum_{e_i \in \operatorname{edges}(n_j)} c_{max}(e_i) \]
\[ = \sum_{e_i \in \operatorname{edges}(n_j)} \bigl(m + 1 - \deg(\operatorname{src}(e_i)) - \deg(\operatorname{tar}(e_i))\bigr) \]
\[ = \sum_{e_i \in \operatorname{edges}(n_j)} \bigl(m + 1 - \deg(n_j) - \deg(\operatorname{adj}(n_j, e_i))\bigr) \]
\[ = \deg(n_j)\bigl(m + 1 - \deg(n_j)\bigr) - \sum_{e_i \in \operatorname{edges}(n_j)} \deg(\operatorname{adj}(n_j, e_i)) \]
\[ \mathrm{RM}_c(n_j) = 1 - \operatorname{divOrZero}\bigl(c(n_j),\ c_{max}(n_j)\bigr) \]

Goal
Evaluate, compare, and improve layouts
• Layout algorithm heuristics and parameters
• User-generated or user-modified layouts
• Manual layout suggestions a la snap-to-grid
• Fully automatic layouts
• Recommend layouts and parameterizations

Layout algorithm & design comparison interface
[Figure: comparison interface screenshot]

Machine learning to identify best layout
• Train a model M(G, S(G), L, P(L)) → (RM, UO)
• Graph G with statistics S(G), layout algorithm L, and parameters P(L)
• Readability metrics RM for L on G with P(L)
• argmax_{L, P(L) | (G?), S(G), RM'} M, with RM' ⊆ RM, as the optimal layout

Use in practice
• Need an interface on top of your graph store
  – Data cleaning, process sanity check
  – Exploration
• Must be able to evaluate effectiveness
• Works with aggregate views

Discussion
• Raise awareness of readability issues
• Localized identification of where improvement is needed
• Optimization recommendations for tasks
• Interactive optimization
• Future optimization plans

Dunne C, Ross SI, Shneiderman B, and Martino M (2015), "Readability metric feedback for aiding node-link visualization designers". IBM Journal of Research and Development.
Dunne C and Shneiderman B (2009), "Improving graph drawing readability by incorporating readability metrics: A software tool for network analysts". University of Maryland, Human-Computer Interaction Lab, Tech Report HCIL-2009-13.

Project Information
• Organization: Watson
• Team Size: 4
• Path to Market: Research Asset
• IBM S/W: SoftLayer, Bluemix
• Open Source: Java Topology Suite and/or JavaScript Topology Suite, Java EE, MySQL, EmpireDB, JUNG, JQuery
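
As an illustration of the Edge Crossing RM slides, the global and per-edge metrics can be sketched in a few lines of Python. This is a minimal sketch, not the implementation behind the talk. It assumes straight-line edges, a simple graph without self-loops, and that edges sharing an endpoint are never counted as crossing. The names `div_or_zero`, `segments_cross`, and `edge_crossing_metrics` are hypothetical, as is the choice of Python; the quantities c, c_all, c_impossible, and c_max follow the slide definitions.

```python
from itertools import combinations


def div_or_zero(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def _orient(p, q, r):
    """Signed area test: > 0 if r is left of the line p->q, < 0 if right."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_cross(p1, p2, p3, p4):
    """True if segments p1-p2 and p3-p4 cross at an interior point."""
    d1, d2 = _orient(p3, p4, p1), _orient(p3, p4, p2)
    d3, d4 = _orient(p1, p2, p3), _orient(p1, p2, p4)
    return d1 * d2 < 0 and d3 * d4 < 0


def edge_crossing_metrics(pos, edges):
    """Return (global metric, list of per-edge metrics), each in [0, 1].

    pos maps node -> (x, y); edges is a list of (source, target) pairs.
    Assumption: straight-line edges in a simple graph.
    """
    m = len(edges)
    deg = {n: 0 for n in pos}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1

    c = 0                    # total crossings in the layout
    c_edge = [0] * m         # crossings per edge
    for i, j in combinations(range(m), 2):
        (a, b), (s, t) = edges[i], edges[j]
        if {a, b} & {s, t}:
            continue         # adjacent edges are "impossible" crossings
        if segments_cross(pos[a], pos[b], pos[s], pos[t]):
            c += 1
            c_edge[i] += 1
            c_edge[j] += 1

    c_all = m * (m - 1) / 2
    c_impossible = sum(d * (d - 1) for d in deg.values()) / 2
    global_rm = 1 - div_or_zero(c, c_all - c_impossible)

    edge_rm = []
    for i, (u, v) in enumerate(edges):
        c_max_e = (m - 1) - (deg[u] + deg[v] - 2)
        edge_rm.append(1 - div_or_zero(c_edge[i], c_max_e))
    return global_rm, edge_rm


# A 4-cycle drawn so that two of its edges form a crossing "X".
pos = {"a": (0, 0), "b": (1, 1), "c": (1, 0), "d": (0, 1)}
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
global_rm, edge_rm = edge_crossing_metrics(pos, edges)
print(global_rm)  # 0.5
print(edge_rm)    # [0.0, 1.0, 0.0, 1.0]
```

In this example only one of the two possible crossings occurs, so the global metric is 0.5, while the per-edge values single out the two crossing edges. That is the kind of localized feedback the Discussion slide describes. The pairwise loop is quadratic in the number of edges; a sweep-line approach would likely be needed for large graphs.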