Lecture 1-3
Modularity Maximization
Weili Wu
Ding-Zhu Du
University of Texas at Dallas
lidong.wu@utdallas.edu

Model-Based Detections
• Connection-based detection
• Modularity maximization
• Influence-based detection
• Overlapping community detection
• Hierarchical community detection

Model-Based Detection
Modularity maximization is the most popular of these models.

Popularity
• M. E. J. Newman: Modularity and community structure in networks, Proceedings of the National Academy of Sciences, vol 103, no 23 (2006) pp. 8577-8582. (cited by 3550)
• M. E. J. Newman, M. Girvan: Finding and evaluating community structure in networks, Physical Review E (2004). (cited by 6505)

Outline
• Modularity Function
• Greedy
• Spectral Method and MP

Modularity Function (Newman and Girvan 2004)
Consider a graph $G = (V, E)$ with adjacency matrix $(a_{ij})$. Given a partition $C$ of $V$, define
$$Q = \frac{1}{2|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) \delta_{C(i),C(j)}$$
where $k_i$ is the degree of node $i$, $\delta_{C(i),C(j)}$ is the Kronecker delta symbol, and $C(i)$ is the community that contains node $i$.

Modularity Function
Consider a graph $G = (V, E)$ with adjacency matrix $(a_{ij})$. Given a partition $C = (C_1, \ldots, C_K)$ of $V$, define
$$\begin{aligned}
Q &= \frac{1}{2|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) \delta_{C(i),C(j)} \\
  &= \frac{1}{2|E|} \sum_{C(i) = C(j)} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) \\
  &= \frac{1}{2|E|} \sum_{C_k} \left( 2|E^{in}_{C_k}| - \frac{\left( 2|E^{in}_{C_k}| + |E^{out}_{C_k}| \right)^2}{2|E|} \right) \\
  &= \sum_{C_k} \left[ \frac{|E^{in}_{C_k}|}{|E|} - \left( \frac{2|E^{in}_{C_k}| + |E^{out}_{C_k}|}{2|E|} \right)^2 \right]
\end{aligned}$$

Modularity Function
Consider a graph $G = (V, E)$ with adjacency matrix $(a_{ij})$. Given a partition $C = (C_1, \ldots, C_K)$ of $V$, define
$$Q = \frac{1}{2|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) \delta_{C(i),C(j)} = \sum_{C_k} \left[ \frac{|E^{in}_{C_k}|}{|E|} - \left( \frac{2|E^{in}_{C_k}| + |E^{out}_{C_k}|}{2|E|} \right)^2 \right]$$
Q sums, over all communities, the fraction of edges that lie within the community minus the expected value of that fraction if edges were distributed at random.

Modularity Function (Newman and Girvan 2004)
Consider a graph $G = (V, E)$ with adjacency matrix $(a_{ij})$.
Given a partition $(C_1, C_2, \ldots, C_K)$ of $V$, define
$$Q = \sum_{s=1}^{K} \left[ \frac{L(C_s, C_s)}{L(V, V)} - \left( \frac{L(C_s, C_s) + L(C_s, \bar{C}_s)}{L(V, V)} \right)^2 \right]$$
where $L(U, W) = \sum_{i \in U,\, j \in W} a_{ij}$ and $\bar{C}_s = V \setminus C_s$.

Modularity Function (digraph)
Consider a directed graph $G = (V, E)$ with adjacency matrix $(a_{ij})$. Given a partition $C$ of $V$, define
$$Q = \frac{1}{2|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i^{in} k_j^{out}}{2|E|} \right) \delta_{C(i),C(j)}$$
where $k_i^{in}$ and $k_i^{out}$ are the in-degree and out-degree of node $i$, and $\delta_{C(i),C(j)}$ is the Kronecker delta symbol.
Q sums, over all communities, the fraction of edges that lie within the community minus the expected value of that fraction if edges were distributed at random.

Why Is It Called Modularity?
• Module = community in some complex networks
• The function describes the quality of the modules.

Modularity Maximization Is NP-hard
• U. Brandes, D. Delling, M. Gaertler, R. Görke, M. Hoefer, Z. Nikoloski, and D. Wagner: On modularity clustering, IEEE Transactions on Knowledge and Data Engineering (TKDE), vol 20, no 2 (2008) pp. 172-188

Increment
Consider a graph $G = (V, E)$ with adjacency matrix $(a_{ij})$. Given a partition $C$ of $V$, the modularity function is
$$Q = \sum_{C_i} \left[ \frac{|E^{in}_{C_i}|}{|E|} - \left( \frac{2|E^{in}_{C_i}| + |E^{out}_{C_i}|}{2|E|} \right)^2 \right]$$
When communities $C_i$ and $C_j$ are merged, the increment of $Q$ is
$$\Delta Q_{C_i C_j} = 2 \left( \frac{|E_{C_i,C_j}|}{2|E|} - \frac{|E_{C_i}|\,|E_{C_j}|}{4|E|^2} \right)$$
where $E_{C_i,C_j}$ is the set of edges between $C_i$ and $C_j$, and $|E_{C_i}| = 2|E^{in}_{C_i}| + |E^{out}_{C_i}|$ is the total degree of the nodes in $C_i$.

Greedy Algorithm
input a graph $G = (V, E)$ with $n = |V|$;
$U_1 \leftarrow \{\{v\} \mid v \in V\}$;
for $k = 1$ to $n - 1$ do
    choose $C_i$ and $C_j$ from $U_k$ to maximize $\Delta Q_{C_i C_j}$ and
    set $U_{k+1} \leftarrow (U_k \setminus \{C_i, C_j\}) \cup \{C_i \cup C_j\}$;
$k^* \leftarrow \arg\max_{1 \le k \le n} Q(U_k)$;
output $U_{k^*}$.

Qualified Cut
Given a graph $G = (V, E)$, find a subset $S$ of $V$ to maximize $Q(S, \bar{S})$, the modularity of the two-part partition $(S, \bar{S})$ with $\bar{S} = V \setminus S$.

Community Partition
Apply the Qualified Cut to each part of the current partition until the value of $Q$ cannot be increased.
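
Sketch: Greedy Algorithm in Python
The Increment and Greedy Algorithm slides above can be tried out with a short script. This is a minimal sketch, not the authors' implementation. It assumes an undirected simple graph stored as a dict that maps each node to the set of its neighbors. The function names (modularity, merge_gain, greedy_modularity) are hypothetical, and every pair of communities is rescanned at each step, so it only suits small graphs.

```python
from itertools import combinations


def modularity(adj, communities):
    """Q = sum over communities of |E_in|/|E| - ((2|E_in| + |E_out|) / (2|E|))^2."""
    two_m = sum(len(nbrs) for nbrs in adj.values())  # 2|E|
    q = 0.0
    for comm in communities:
        degree_sum = sum(len(adj[v]) for v in comm)  # 2|E_in| + |E_out|
        internal_ends = sum(1 for v in comm for u in adj[v] if u in comm)  # 2|E_in|
        q += internal_ends / two_m - (degree_sum / two_m) ** 2
    return q


def merge_gain(adj, ci, cj, two_m):
    """Increment of Q when communities ci and cj are merged."""
    between = sum(1 for v in ci for u in adj[v] if u in cj)  # edges between ci and cj
    di = sum(len(adj[v]) for v in ci)  # total degree of ci
    dj = sum(len(adj[v]) for v in cj)  # total degree of cj
    return 2 * (between / two_m - di * dj / two_m ** 2)


def greedy_modularity(adj):
    """Merge the best pair n-1 times and return the best partition seen."""
    two_m = sum(len(nbrs) for nbrs in adj.values())
    partition = [frozenset([v]) for v in adj]  # U_1: all singletons
    best_q, best_partition = modularity(adj, partition), list(partition)
    while len(partition) > 1:
        ci, cj = max(combinations(partition, 2),
                     key=lambda pair: merge_gain(adj, pair[0], pair[1], two_m))
        partition = [c for c in partition if c not in (ci, cj)] + [ci | cj]
        q = modularity(adj, partition)
        if q > best_q:
            best_q, best_partition = q, list(partition)
    return best_partition, best_q


# Example graph (an assumption for illustration): two triangles joined by edge 2-3.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
           3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
communities, q = greedy_modularity(example)
print(sorted(sorted(c) for c in communities), round(q, 4))
```

On this example the best partition found is the two triangles {0, 1, 2} and {3, 4, 5}, with Q = 5/14. Tracking the best partition during the loop plays the role of the final arg max over $Q(U_k)$ in the pseudocode.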
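
Quadratic Form
For a partition into two groups, let
$$s_i = \begin{cases} 1 & \text{if } i \text{ is in group 1} \\ -1 & \text{if } i \text{ is in group 2} \end{cases}$$
so that $\delta_{C_i,C_j} = \frac{1}{2}(s_i s_j + 1)$. Then
$$\begin{aligned}
Q &= \frac{1}{2|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) \delta_{C_i,C_j} \\
  &= \frac{1}{4|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) (s_i s_j + 1) \\
  &= \frac{1}{4|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) s_i s_j \\
  &= \frac{1}{4|E|} s^T B s
\end{aligned}$$
where $B = (B_{ij})$ with $B_{ij} = a_{ij} - \frac{k_i k_j}{2|E|}$. The constant term vanishes because $\sum_{i,j} B_{ij} = 0$.

Spectral Method
$$Q = \frac{1}{4|E|} s^T B s$$
Over real vectors $s$ of fixed length, $Q$ achieves its maximum when $s$ is parallel to the eigenvector of the largest eigenvalue of $B$. Since each $s_i$ must be $\pm 1$, $s_i$ is taken as the sign of the $i$-th component of that eigenvector.

Linear Program
$$\begin{aligned}
\max \quad & \frac{1}{2|E|} \sum_{i,j} B_{ij} (1 - x_{ij}) \\
\text{s.t.} \quad & x_{ik} \le x_{ij} + x_{jk} \quad \text{for all } i, j, k \\
& x_{ij} \in \{0, 1\} \quad \text{for all } i, j
\end{aligned}$$
where
$$x_{ij} = \begin{cases} 0 & \text{if } i \text{ and } j \text{ are in the same community} \\ 1 & \text{if } i \text{ and } j \text{ are in different communities} \end{cases}$$

Vector Program
$$\begin{aligned}
\max \quad & \frac{1}{4|E|} \sum_{i,j} B_{ij} (1 + s_i \cdot s_j) \\
\text{s.t.} \quad & s_i^2 = 1 \quad \text{for all } i
\end{aligned}$$
Semi-definite Program: when each $s_i$ is a unit vector, the vector program can be written as a semi-definite program.

Remark 1
How should a method for finding communities be evaluated?

Clustering

Community Detection

Remark 2
How can overlapping community detection be done?
How can hierarchical community detection be done?

Survey
• Introductory review: Communities in networks by M. A. Porter, J.-P. Onnela, and P. J. Mucha, Notices of the American Mathematical Society 56, 1082 (2009)
• Comprehensive review: Community detection in graphs by Santo Fortunato, Physics Reports 486, 75 (2010)

THANK YOU!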
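
Supplementary sketch: Spectral Method in Python
The Quadratic Form and Spectral Method slides can be illustrated with a small script. This is a sketch under stated assumptions, not the authors' code. It assumes an undirected simple graph stored as a dict of neighbor sets, and it uses the normalization $Q = s^T B s / (4|E|)$ that follows from $\delta = (s_i s_j + 1)/2$. The leading eigenvector is approximated by shifted power iteration in pure Python, a swapped-in technique chosen to avoid extra libraries. All function names are hypothetical.

```python
import math


def modularity_matrix(adj):
    """Return (nodes, B, 2|E|) with B_ij = a_ij - k_i k_j / (2|E|)."""
    nodes = sorted(adj)
    two_m = sum(len(adj[v]) for v in nodes)
    B = [[(1.0 if u in adj[v] else 0.0) - len(adj[v]) * len(adj[u]) / two_m
          for u in nodes] for v in nodes]
    return nodes, B, two_m


def leading_eigenvector(B, iterations=500):
    """Approximate the eigenvector of the largest eigenvalue of symmetric B.

    Power iteration on B + shift*I, where shift is the largest absolute row sum,
    so every shifted eigenvalue is nonnegative. Assumes the start vector is not
    orthogonal to the leading eigenvector.
    """
    n = len(B)
    shift = max(sum(abs(x) for x in row) for row in B)
    x = [1.0 + 0.01 * i for i in range(n)]  # deterministic, non-constant start
    for _ in range(iterations):
        y = [sum(B[i][j] * x[j] for j in range(n)) + shift * x[i] for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return x


def spectral_bisection(adj):
    """Split the nodes by the signs of the leading eigenvector of B."""
    nodes, B, two_m = modularity_matrix(adj)
    v = leading_eigenvector(B)
    s = [1 if x >= 0 else -1 for x in v]
    n = len(nodes)
    # Q = s^T B s / (4|E|), and 4|E| = 2 * two_m
    q = sum(B[i][j] * s[i] * s[j] for i in range(n) for j in range(n)) / (2 * two_m)
    group1 = {nodes[i] for i in range(n) if s[i] == 1}
    group2 = {nodes[i] for i in range(n) if s[i] == -1}
    return group1, group2, q


# Example graph (an assumption for illustration): two triangles joined by edge 2-3.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
           3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
g1, g2, q = spectral_bisection(example)
print(sorted(g1), sorted(g2), round(q, 4))
```

On this example the sign pattern separates the two triangles, and Q = 5/14. Which triangle is labeled group 1 depends on the sign of the computed eigenvector, so only the split itself is meaningful.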