Visualization of Graph Data
CS 4390/5390 Data Visualization
Shirley Moore, Instructor
October 6, 2014

Graphs and Trees
• Graph: a set of nodes (vertices) connected by links (edges)
• Links can be directed or undirected.
• Two nodes are adjacent if they are connected by a link.
• Two edges are adjacent if they share a common node.
• Nodes and links can both have attributes.
• A path from node a to node b is a sequence of adjacent edges from a to b.
• A cycle is a path that begins and ends at the same node.
• A graph is connected if there exists a path between any two nodes.
• A tree is a connected acyclic graph.
• If there are n nodes, what is the maximum number of links
  – in a directed graph?
  – in an undirected graph?
  – in a directed tree?
  – in an undirected tree?

Graph Analytics
Slide courtesy of John Feo, PNNL

Scientific Grids vs. Data Informatics Graphs
Slide courtesy of John Feo, PNNL

Slide courtesy of Mathieu Bastian

National Security
Slide courtesy of Mathieu Bastian

Public Health

Small Graphs

Medium Graphs

Large Graphs
• http://snap.stanford.edu/data/index.html

Implicit vs. Explicit

Graph Analytics

Idiom Choices

Triangular-vertical node-link layout
• What: Tree dataset
• Why: Hierarchical relationships, topology analysis tasks
• How: Vertical spatial position shows depth in tree; horizontal spatial position is an artifact of the layout algorithm
• Scale: A few dozen nodes

Spline-radial Layout
• What: Tree dataset
• How:
  – Depth encoded as distance from center of circle
  – Links drawn as smoothly curving splines
  – Reingold-Tilford layout algorithm
• Scale: A few hundred nodes
• Example written in D3:
  – http://bl.ocks.org/mbostock/4063550

D3 Tree Layout
• http://www.d3noob.org/2014/01/tree-diagrams-ind3js_11.html
• https://github.com/mbostock/d3/wiki/TreeLayout
• Representative of the D3 hierarchy layout
  – https://github.com/mbostock/d3/wiki/HierarchyLayout
• Produces node-link diagrams of trees using the Reingold-Tilford “tidy” algorithm
• Can input data that is in JSON (JavaScript Object Notation) format

Brainstorming Exercise 1
• How could we scale tree layouts to more than a few hundred nodes?
  – Possible strategy: use 3D
• Why or why not?

Collapsible Tree Layout
• Example in D3
  – http://bl.ocks.org/mbostock/4339083

Treemap
Examples:
• http://bl.ocks.org/mbostock/4063582
• http://bost.ocks.org/mike/treemap/

General Graph Layouts
• Also called network layouts
• Do not directly use spatial position to encode attribute values
• Layout algorithms try to minimize the number of edge crossings and node overlaps.
• May use size and color encodings for node and link attributes

Force-Directed Placement
• Widely used for node-link network layout
• Positions network elements according to a simulation of physical forces, e.g.,
  – Nodes push away from each other
  – Links act like springs that draw their endpoints closer
• Can start by placing nodes randomly and iterating to gradually improve the layout
• Disadvantages
  – Clusters may be artifacts of the algorithm
  – Layout may be nondeterministic
  – May get stuck in a local minimum energy configuration
  – Doesn’t scale past a few hundred nodes

What-Why-How for Force-Directed Placement

Scalable Force-Directed Placement (sfdp)
• Multilevel approach that transforms the network into a hierarchy of successively simpler networks
• Algorithm: Lay out the coarsest network first, then improve the layout with more and more complex versions
• Examples: http://yifanhu.net/GALLERY/GRAPHS/index.html
• Graphviz software: http://www.graphviz.org/

Adjacency Matrix View
Example: http://bost.ocks.org/mike/miserables/

Characteristic Patterns in Node-link and Matrix Views

Brainstorming Exercise 2
• Which graph analysis tasks are better supported by the node-link view, and which are better supported by the matrix view?
• How does the above answer change with increasing size of the graph?

Graph Visualization Tools
• Sigma.js JavaScript library
• Gephi open source graph viz platform
• Many more!

Preparation for Next Class
• Keep working with D3; use the tutorials on the D3.js wiki
• Implement interaction in your parallel coordinates visualization for Lab 3
• Decide which datasets to use for Lab 3
• Grad students and extra credit for undergrads: Research k-means clustering

Looking Ahead
• Quest (Quiz/Test) on Wed, Oct. 15
• Course exam on Wed, Nov. 19
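
Sketch: Force-Directed Placement in Code
The force-directed placement idea from the earlier slide (nodes push away from each other, links act like springs, start from random positions and iterate) can be illustrated with a minimal, library-free Python sketch. This is not D3's force layout and not sfdp. The function name, the force formulas (chosen in the style of the Fruchterman-Reingold spring embedder), the ideal link length k, the step size, and the cooling rate are all assumptions made for illustration.

```python
import math
import random


def force_directed_layout(nodes, links, iterations=200, seed=0):
    """Return {node: (x, y)} after a simple spring-embedder simulation.

    Hypothetical helper for illustration only; all constants are assumptions.
    """
    rng = random.Random(seed)            # fixed seed makes the run repeatable
    pos = {n: [rng.random(), rng.random()] for n in nodes}  # random start
    k = 1.0 / math.sqrt(len(nodes))      # assumed ideal link length
    step = 0.1                           # assumed maximum move per iteration

    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}

        # Repulsion: every pair of nodes pushes apart (O(n^2) pairs,
        # one reason the basic approach scales poorly).
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                disp[a][0] += dx / dist * force
                disp[a][1] += dy / dist * force
                disp[b][0] -= dx / dist * force
                disp[b][1] -= dy / dist * force

        # Attraction: each link pulls its two endpoints together.
        for a, b in links:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            disp[a][0] -= dx / dist * force
            disp[a][1] -= dy / dist * force
            disp[b][0] += dx / dist * force
            disp[b][1] += dy / dist * force

        # Move each node along its net force, capped at the current step.
        for n in nodes:
            dx, dy = disp[n]
            length = math.hypot(dx, dy) or 1e-9
            move = min(length, step)
            pos[n][0] += dx / length * move
            pos[n][1] += dy / length * move

        step *= 0.97                     # cool down so the layout settles

    return {n: (p[0], p[1]) for n, p in pos.items()}


# Small example: a path graph a - b - c - d.
nodes = ["a", "b", "c", "d"]
links = [("a", "b"), ("b", "c"), ("c", "d")]
layout = force_directed_layout(nodes, links)
for name, (x, y) in layout.items():
    print(f"{name}: ({x:.2f}, {y:.2f})")
```

Changing the seed changes the random starting positions and so, in general, the final picture. That is the nondeterminism listed among the disadvantages. The all-pairs repulsion loop does quadratic work per iteration, which is one reason the basic method struggles beyond a few hundred nodes.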