CS8803-NS
Network Science
Fall 2013
Instructor: Constantine Dovrolis constantine@gatech.edu
http://www.cc.gatech.edu/~dovrolis/Courses/NetSci/
The following slides include only the figures or videos that we use in class; they do not include detailed explanations, derivations or descriptions covered in class.
Many of the following figures are copied from open sources on the Web. I do not claim any intellectual property for the following material.
• What does “network community” mean?
• Community detection versus graph partitioning versus hierarchical clustering
• Graph partitioning algorithms
– Spectral partitioning (Fiedler’s method based on graph Laplacian)
• Modularity metric for community detection
– Spectral-based modularity optimization
– Other methods for modularity optimization
• Community detection methods that do not rely on modularity metric
– Betweenness-Centrality method
– Radicchi et al. method
• Hierarchical agglomerative clustering
• Variations of the community detection problem
– Overlapping communities
– Dynamic communities
– Link-based communities
• Properties of real-world network communities
• Applications of community detection
– In social networks
– In biological networks
– In brain networks
– In ecological networks
– In climate networks
• What does “network community” mean?
• Community detection versus graph partitioning versus hierarchical clustering
• Modularity metric for community detection
– Spectral-based modularity optimization
– Other methods for modularity optimization
• Community detection methods that do not rely on modularity metric
– Betweenness-Centrality method
– Radicchi et al. method
• Hierarchical agglomerative clustering
• Graph partitioning algorithms
– Spectral partitioning (Fiedler’s method based on graph Laplacian)
Graph partitioning vs Community detection
• In graph partitioning, the desired number and size of the partitions are given
– E.g., graph bisection into two equal-sized partitions
– NP-hard
• In community detection, the number of communities (and their size) results from the method itself
– It is a property of the network
• The community detection problem is less well-defined than the graph partitioning problem
Spectral bisection method for graph partitioning
(see last few slides for more details)
Graph partitioning vs Hierarchical clustering
Community detection vs Hierarchical clustering
• Hierarchical clustering comes in two forms:
– Divisive algorithms: top-down
– Agglomerative: bottom-up
• Key points:
– Need a similarity metric for any two nodes
• Which metric to use?
• How to examine similarity of groups of nodes?
– Which horizontal cut of the dendrogram gives more insight?
– Some clusters are artificial; not “real communities”
– Fundamentally, many networks are NOT hierarchical
• Modularity: fraction of edges between pairs of nodes that belong to the same community
MINUS
• Expected fraction of such edges if edges were placed randomly (but in a degree-preserving manner)
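In symbols, the comparison above is commonly written as Q = (1/2m) Σ_ij [A_ij − k_i k_j / (2m)] δ(c_i, c_j), where m is the number of edges, k_i is the degree of node i, and c_i is its community. A minimal Python sketch of this computation, assuming an undirected, unweighted, simple graph given as an edge list; the function name `modularity` and the two-triangle example graph are illustrative and not taken from the slides:

```python
from collections import defaultdict

def modularity(edges, community):
    """Modularity Q in its per-community form:

        Q = sum over communities c of  e_c / m  -  (d_c / (2m))**2

    e_c: number of edges with both endpoints in community c
    d_c: total degree of the nodes in community c
    m:   total number of edges
    This is algebraically the same as the double sum over node pairs.
    """
    m = len(edges)
    internal = defaultdict(int)     # e_c
    degree_sum = defaultdict(int)   # d_c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Hypothetical example: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}

print(modularity(edges, two_groups))  # 5/14, about 0.357
print(modularity(edges, one_group))   # 0.0: a single community never beats the null model
```

Putting every node in one community always gives Q = 0, which is why Q > 0 is read as "more internal edges than expected by chance".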
Spectral maximization of modularity (Newman, 2006)
Spectral maximization of modularity
(see class notes for detailed derivations)
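As a rough illustration of the spectral approach, the sketch below splits a graph in two using the sign pattern of the leading eigenvector of the modularity matrix B, with B_ij = A_ij − k_i k_j / (2m). It is a sketch under assumptions, not the derivation from the class notes: the function name `leading_eigenvector_split`, the shifted power iteration, the fixed start vector, and the example graph are all choices made here for a small, dependency-free example.

```python
def leading_eigenvector_split(adj, iterations=1000):
    """Bisect a graph by the signs of the leading eigenvector of B.

    adj: symmetric 0/1 adjacency matrix (list of lists), at least one edge.
    Returns (s, q): s[i] is +1 or -1, q is the modularity of that split.
    """
    n = len(adj)
    k = [sum(row) for row in adj]
    two_m = sum(k)
    B = [[adj[i][j] - k[i] * k[j] / two_m for j in range(n)] for i in range(n)]

    # Power iteration finds the eigenvalue of largest magnitude, so shift B
    # by a bound on its spectral radius to make the most positive one dominate.
    shift = max(sum(abs(b) for b in row) for row in B)
    x = [float(i + 1) for i in range(n)]          # arbitrary start vector
    for _ in range(iterations):
        y = [sum(B[i][j] * x[j] for j in range(n)) + shift * x[i]
             for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]

    s = [1 if v > 0 else -1 for v in x]
    # Q = (1 / 4m) * s^T B s for a division into two groups.
    q = sum(B[i][j] * s[i] * s[j]
            for i in range(n) for j in range(n)) / (2 * two_m)
    return s, q

# Hypothetical example: two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

s, q = leading_eigenvector_split(adj)
print(s, q)  # nodes 0-2 share one sign, nodes 3-5 the other; q is about 0.357
```

If all entries of `s` come out with the same sign, B has no positive eigenvalue and the graph is treated as indivisible by this method.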
Dividing a community into smaller communities
Greedy optimization of modularity (2004)
Complexity of Clauset et al.’s method
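The greedy idea can be sketched as follows: start with every node in its own community and repeatedly merge the pair of connected communities with the largest modularity gain, ΔQ = E_ab / m − d_a d_b / (2 m^2), where E_ab is the number of edges between communities a and b and d_a, d_b are their total degrees. This is a deliberately naive version for illustration: it rescans all candidate pairs at every step, whereas Clauset et al. get their running time, O(m d log n) with d the depth of the dendrogram, from max-heaps and sparse data structures that are not reproduced here. The function name `greedy_modularity` and the stopping rule (stop when no merge increases Q) are assumptions of this sketch.

```python
def greedy_modularity(edges):
    """Naive greedy agglomeration on a simple undirected graph (no self-loops)."""
    m = len(edges)
    nodes = sorted({u for e in edges for u in e})
    members = {u: {u} for u in nodes}   # community id -> set of nodes
    degree = {u: 0 for u in nodes}      # community id -> total degree
    between = {}                        # frozenset({a, b}) -> edges between a and b
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        key = frozenset((u, v))
        between[key] = between.get(key, 0) + 1

    def gain(pair):
        a, b = tuple(pair)
        return between[pair] / m - degree[a] * degree[b] / (2 * m * m)

    while between:
        best = max(between, key=gain)
        if gain(best) <= 0:             # no merge improves modularity
            break
        a, b = tuple(best)
        members[a] |= members.pop(b)    # merge community b into a
        degree[a] += degree.pop(b)
        merged = {}
        for pair, count in between.items():
            if pair == best:
                continue                # these edges are now internal to a
            renamed = frozenset(a if c == b else c for c in pair)
            merged[renamed] = merged.get(renamed, 0) + count
        between = merged
    return list(members.values())

# Hypothetical example: two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(sorted(sorted(c) for c in greedy_modularity(edges)))
# [[0, 1, 2], [3, 4, 5]]
```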
The algorithm of Girvan-Newman
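A compact sketch of the Girvan-Newman idea: compute the betweenness of every edge, remove the edge with the highest value, recompute, and repeat until the graph breaks apart. The code below assumes an unweighted, undirected, simple graph and stops at the first split; the function names (`edge_betweenness`, `components`, `girvan_newman_first_split`) and the example graph are illustrative choices, and the betweenness computation follows the standard breadth-first accumulation rather than anything specific to the slides.

```python
from collections import deque

def edge_betweenness(adj):
    """Shortest-path edge betweenness; adj maps node -> set of neighbours."""
    bet = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, sigma, preds, order = {s: 0}, {s: 1.0}, {s: []}, []
        queue = deque([s])
        while queue:                          # BFS, counting shortest paths
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v], sigma[v], preds[v] = dist[u] + 1, 0.0, []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {v: 0.0 for v in order}
        for v in reversed(order):             # push path credit back to s
            for u in preds[v]:
                credit = sigma[u] / sigma[v] * (1.0 + delta[v])
                bet[frozenset((u, v))] += credit
                delta[u] += credit
    return {e: b / 2.0 for e, b in bet.items()}  # each pair was counted twice

def components(adj):
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        seen.add(s)
        while stack:
            u = stack.pop()
            comp.add(u)
            for v in adj[u] - seen:
                seen.add(v)
                stack.append(v)
        comps.append(comp)
    return comps

def girvan_newman_first_split(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    initial, removed = len(components(adj)), []
    while len(components(adj)) == initial:
        bet = edge_betweenness(adj)           # recomputed after every removal
        if not bet:
            break
        u, v = tuple(max(bet, key=bet.get))
        adj[u].discard(v)
        adj[v].discard(u)
        removed.append(frozenset((u, v)))
    return components(adj), removed

# Hypothetical example: two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
comps, removed = girvan_newman_first_split(edges)
print(sorted(sorted(c) for c in comps), removed)
```

Continuing the removal loop past the first split produces the full dendrogram; recomputing betweenness after each removal is what makes the method expensive on large graphs.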
The algorithm of Radicchi et al.
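Radicchi et al. replace betweenness with a local quantity, the edge clustering coefficient: C_ij = (z_ij + 1) / min(k_i − 1, k_j − 1), where z_ij is the number of triangles that contain edge (i, j). Edges between communities tend to sit in few triangles, so the edge with the smallest C_ij is removed first. A minimal sketch of the coefficient, assuming an undirected simple graph stored as neighbour sets; the function name `edge_clustering` and the example graph are illustrative:

```python
import math

def edge_clustering(adj, u, v):
    """Edge clustering coefficient of edge (u, v); adj maps node -> neighbour set."""
    triangles = len(adj[u] & adj[v])                  # common neighbours close a triangle
    possible = min(len(adj[u]) - 1, len(adj[v]) - 1)  # most triangles the edge could be in
    if possible == 0:
        return math.inf                               # an endpoint of degree 1: never removed
    return (triangles + 1) / possible

# Hypothetical example: two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

scores = {e: edge_clustering(adj, *e) for e in edges}
weakest = min(scores, key=scores.get)
print(weakest, scores[weakest])  # (2, 3) 0.5
```

As in the betweenness-based method, the scores are recomputed after each removal, but only in the neighbourhood of the removed edge, which is what makes this approach cheaper.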
Hierarchical agglomerative clustering http://condor.depaul.edu/ntomuro/courses/578/notes/notes-Clustering.html
http://mines.humanoriented.com/classes/2010/fall/csci568/portfolio_exports/mvoget/cluster/cluster.html
Cluster similarity – 3 approaches
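The figure for this slide is not reproduced in these notes. Assuming the three approaches are the usual single, complete, and average linkage, they can be sketched as three ways of turning pairwise distances between members into one distance between clusters. The function names and the one-dimensional example are illustrative; with a similarity score instead of a distance, the roles of `min` and `max` swap.

```python
from itertools import product

def single_linkage(a, b, dist):
    """Distance between the closest pair of members."""
    return min(dist(x, y) for x, y in product(a, b))

def complete_linkage(a, b, dist):
    """Distance between the farthest pair of members."""
    return max(dist(x, y) for x, y in product(a, b))

def average_linkage(a, b, dist):
    """Mean distance over all cross-cluster pairs."""
    return sum(dist(x, y) for x, y in product(a, b)) / (len(a) * len(b))

# Hypothetical example: points on a line, distance = absolute difference.
cluster_a, cluster_b = [0, 1], [4, 6]
d = lambda x, y: abs(x - y)

print(single_linkage(cluster_a, cluster_b, d))    # 3
print(complete_linkage(cluster_a, cluster_b, d))  # 6
print(average_linkage(cluster_a, cluster_b, d))   # 4.5
```

Agglomerative clustering merges, at each step, the two clusters that are closest under whichever of these rules is chosen, so the choice directly shapes the dendrogram.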
Key points
(See class notes for detailed derivations)
• Define Laplacian of an (undirected, unweighted) graph
– Show that all eigenvalues of Laplacian are non-negative
– Show that Laplacian has at least one zero eigenvalue
– The number of zero eigenvalues is equal to the number of connected components in the graph
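– The second-smallest eigenvalue is called “algebraic connectivity”; it is non-zero if and only if the graph is connected, and in the relaxed bisection problem the size of the graph’s min cut set is proportional to it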
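The first few properties can be checked numerically on a small graph without an eigenvalue solver. The sketch below builds L = D − A and uses two facts: x^T L x equals the sum of (x_i − x_j)^2 over edges, so it is never negative, and the indicator vector of each connected component is mapped to zero by L. The helper names and the example graph (a triangle plus a separate edge, so two components) are illustrative assumptions.

```python
def laplacian(n, edges):
    """Graph Laplacian L = D - A for an undirected, unweighted graph on nodes 0..n-1."""
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    return L

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def quadratic_form(M, x):
    return sum(xi * yi for xi, yi in zip(x, matvec(M, x)))

# Hypothetical example: triangle {0, 1, 2} plus the separate edge (3, 4).
edges = [(0, 1), (1, 2), (0, 2), (3, 4)]
L = laplacian(5, edges)

# x^T L x = sum over edges of (x_u - x_v)^2 >= 0, so no eigenvalue is negative.
x = [3, -1, 2, 0, 5]
print(quadratic_form(L, x), sum((x[u] - x[v]) ** 2 for u, v in edges))  # 51 51

# One zero eigenvalue per connected component: each indicator vector gives L x = 0.
print(matvec(L, [1, 1, 1, 0, 0]))  # [0, 0, 0, 0, 0]
print(matvec(L, [0, 0, 0, 1, 1]))  # [0, 0, 0, 0, 0]
```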
Key points (cont’d)
(See class notes for detailed derivations)
• Formulate graph bisection problem as a constrained optimization problem
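• Show that, once the ±1 constraint on the partition vector is relaxed, the min cut size is proportional to algebraic connectivity (the second-smallest eigenvalue of the Laplacian, which is its min non-zero eigenvalue for a connected graph)
• Compute corresponding eigenvector (appropriately normalized)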
• And determine graph partitions based on the values of that eigenvector
• For a sparse graph, this method is O(n^2)
– If the second eigenvector is computed using orthogonalization or the Lanczos method, which is O(m*n), and m = O(n) for a sparse graph
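The steps above can be sketched end to end on a toy graph. The code approximates the second eigenvector (the Fiedler vector) by power iteration on c·I − L while projecting out the constant vector at every step, then puts the half of the nodes with the smallest entries in one partition. This is an illustration under assumptions, not the Lanczos method: the function name `fiedler_bisection`, the shift c = 2·(max degree), the start vector, and the iteration count are choices made here, and the graph is assumed connected with at least one edge. Each iteration touches every edge once, so it costs O(m).

```python
def fiedler_bisection(n, edges, iterations=500):
    """Bisect nodes 0..n-1 using an approximate Fiedler vector.

    Returns (part_a, part_b, cut_size, lambda2_estimate).
    """
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    c = 2 * max(deg)                      # upper bound on the largest eigenvalue of L

    x = [float(i) for i in range(n)]      # arbitrary start vector
    for _ in range(iterations):
        mean = sum(x) / n
        x = [xi - mean for xi in x]       # remove the component along (1, ..., 1)
        # y = (c I - L) x, using (L x)_i = deg_i * x_i - sum of x_j over neighbours j
        y = [(c - deg[i]) * x[i] for i in range(n)]
        for u, v in edges:
            y[u] += x[v]
            y[v] += x[u]
        norm = sum(yi * yi for yi in y) ** 0.5
        x = [yi / norm for yi in y]

    # Rayleigh quotient x^T L x / x^T x, with x of unit length.
    lambda2 = sum((x[u] - x[v]) ** 2 for u, v in edges)

    order = sorted(range(n), key=lambda i: x[i])
    part_a, part_b = set(order[: n // 2]), set(order[n // 2:])
    cut = sum(1 for u, v in edges if (u in part_a) != (v in part_a))
    return part_a, part_b, cut, lambda2

# Hypothetical example: two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
part_a, part_b, cut, lambda2 = fiedler_bisection(6, edges)
print(sorted(part_a), sorted(part_b), cut, lambda2)
# the two triangles, cut size 1, lambda2 close to (5 - sqrt(17)) / 2
```

For unequal target sizes n1 and n2, the same sorted order is used, and it is common to try both the n1 smallest and the n1 largest entries and keep the split with the smaller cut.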