day1-3 - The University of Texas at Dallas

Lecture 1-3
Modularity Maximization
Weili Wu Ding-Zhu Du
University of Texas at Dallas
lidong.wu@utdallas.edu
Model-Based Detections
• Connection-based detection
• Modularity maximization
• Influence-based detection
• Overlapping community detection
• Hierarchical community detection
Model-Based Detection
Modularity maximization is the most popular one.
Popularity
• M. E. J. Newman: Modularity and community structure in networks, Proceedings of the National Academy of Sciences, vol. 103, no. 23 (2006), pp. 8577-8582. (cited by 3550)
• M. E. J. Newman, M. Girvan: Finding and evaluating community structure in networks, Physical Review E (2004). (cited by 6505)
Outline
• Modularity Function
• Greedy
• Spectral Method and MP
Modularity Function
(Newman and Girvan 2004)
Consider a graph $G = (V, E)$ with adjacency matrix $(a_{ij})$.
Given a partition $C$ of $V$, define
$$Q = \frac{1}{2|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) \delta_{C(i),C(j)}$$
where $k_i$ is the degree of node $i$,
$\delta_{C(i),C(j)}$ is the Kronecker delta symbol, and
$C(i)$ is the community in which node $i$ is located.
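As a minimal sketch of this definition, the following Python function evaluates $Q$ directly from an adjacency matrix and one community label per node. The name `modularity`, the list-of-lists representation, and the example graph (two triangles joined by a single edge) are illustrative assumptions, not part of the lecture.

```python
def modularity(adj, community):
    """Evaluate Q from the definition for an undirected graph.

    adj       -- symmetric 0/1 adjacency matrix (list of lists), assumed loop-free
    community -- community[i] is the label C(i) of node i
    """
    n = len(adj)
    degree = [sum(row) for row in adj]   # k_i
    two_m = sum(degree)                  # 2|E|
    total = 0.0
    for i in range(n):
        for j in range(n):
            if community[i] == community[j]:   # Kronecker delta
                total += adj[i][j] - degree[i] * degree[j] / two_m
    return total / two_m


# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

q_two_triangles = modularity(adj, [0, 0, 0, 1, 1, 1])
q_single_block = modularity(adj, [0, 0, 0, 0, 0, 0])
print(q_two_triangles, q_single_block)
```

On this example the two-triangle partition gives $Q = 5/14$, while placing every node in one community gives $Q = 0$ (up to floating-point rounding).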
Modularity Function
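Consider a graph $G = (V, E)$ with adjacency matrix $(a_{ij})$.
Given a partition $C = (C_1, \ldots, C_K)$ of $V$, define
$$\begin{aligned}
Q &= \frac{1}{2|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) \delta_{C(i),C(j)} \\
  &= \frac{1}{2|E|} \sum_{C(i) = C(j)} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) \\
  &= \frac{1}{2|E|} \sum_{C_k} \left( 2|E^{in}_{C_k}| - \frac{\left( 2|E^{in}_{C_k}| + |E^{out}_{C_k}| \right)^2}{2|E|} \right) \\
  &= \sum_{C_k} \left( \frac{|E^{in}_{C_k}|}{|E|} - \left( \frac{2|E^{in}_{C_k}| + |E^{out}_{C_k}|}{2|E|} \right)^2 \right)
\end{aligned}$$
where $E^{in}_{C_k}$ is the set of edges with both endpoints in $C_k$ and $E^{out}_{C_k}$ is the set of edges with exactly one endpoint in $C_k$.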
Modularity Function
Consider a graph $G = (V, E)$ with adjacency matrix $(a_{ij})$.
Given a partition $C = (C_1, \ldots, C_K)$ of $V$, define
$$\begin{aligned}
Q &= \frac{1}{2|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) \delta_{C(i),C(j)} \\
  &= \sum_{C_k} \left( \frac{|E^{in}_{C_k}|}{|E|} - \left( \frac{2|E^{in}_{C_k}| + |E^{out}_{C_k}|}{2|E|} \right)^2 \right)
\end{aligned}$$
This is the sum, over all communities, of the fraction of edges that fall within the community minus the expected value of that fraction if edges were distributed at random.
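The per-community form can be sketched in Python as follows. The helper name `modularity_terms`, the edge-list input, and the example graph are assumptions made for illustration; the graph is taken to be simple and undirected.

```python
def modularity_terms(edges, community):
    """Return one term per community: |E_in|/|E| - ((2|E_in| + |E_out|) / (2|E|))^2."""
    m = len(edges)                       # |E|
    labels = set(community)
    e_in = {c: 0 for c in labels}        # edges with both endpoints in the community
    e_out = {c: 0 for c in labels}       # edges with exactly one endpoint in it
    for u, v in edges:
        cu, cv = community[u], community[v]
        if cu == cv:
            e_in[cu] += 1
        else:
            e_out[cu] += 1
            e_out[cv] += 1
    return {c: e_in[c] / m - ((2 * e_in[c] + e_out[c]) / (2 * m)) ** 2
            for c in labels}


# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
terms = modularity_terms(edges, [0, 0, 0, 1, 1, 1])
q = sum(terms.values())
print(terms, q)
```

Each triangle has 3 internal edges and 1 outgoing edge out of 7, so each contributes $3/7 - (7/14)^2 = 5/28$, and $Q = 5/14$.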
Modularity Function
(Newman and Girvan 2004)
Consider a graph $G = (V, E)$ with adjacency matrix $(a_{ij})$.
Given a partition $(C_1, C_2, \ldots, C_K)$ of $V$, define
$$Q = \sum_{s=1}^{K} \left[ \frac{L(C_s, C_s)}{L(V, V)} - \left( \frac{L(C_s, C_s) + L(C_s, \bar{C}_s)}{L(V, V)} \right)^2 \right]$$
where $L(U, W) = \sum_{i \in U,\, j \in W} a_{ij}$ and $\bar{C}_s = V \setminus C_s$.
Modularity Function
(digraph)
Consider a directed graph $G = (V, E)$ with adjacency matrix $(a_{ij})$.
Given a partition $C$ of $V$, define
$$Q = \frac{1}{2|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i^{in} k_j^{out}}{2|E|} \right) \delta_{C(i),C(j)}$$
where $k_i^{in}$ and $k_i^{out}$ are the in-degree and out-degree of node $i$, and $\delta_{C(i),C(j)}$ is the Kronecker delta symbol.
This is the sum, over all communities, of the fraction of edges that fall within the community minus the expected value of that fraction if edges were distributed at random.
Why Is It Called Modularity?
• Module = community in some complex networks
• The function describes the quality of modules.
Modularity Maximization Is NP-hard
• U. Brandes, D. Delling, M. Gaertler, R. Görke, M. Hoefer, Z. Nikoloski, and D. Wagner: On modularity clustering, IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 20, no. 2 (2008), pp. 172-188.
Outline
• Modularity Function
• Greedy
• Spectral Method
Increment
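Consider a graph $G = (V, E)$ with adjacency matrix $(a_{ij})$.
Given a partition $C$ of $V$, the modularity function is
$$Q = \sum_{C_i} \left( \frac{|E^{in}_{C_i}|}{|E|} - \left( \frac{2|E^{in}_{C_i}| + |E^{out}_{C_i}|}{2|E|} \right)^2 \right)$$
When communities $C_i$ and $C_j$ are merged, the increment of $Q$ is
$$\Delta_{C_i C_j} Q = 2 \left( \frac{|E_{C_i,C_j}|}{2|E|} - \frac{|E_{C_i}|\,|E_{C_j}|}{4|E|^2} \right)$$
where $E_{C_i,C_j}$ is the set of edges between $C_i$ and $C_j$, and $|E_{C_i}| = 2|E^{in}_{C_i}| + |E^{out}_{C_i}|$ is the total degree of the nodes in $C_i$.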
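The increment formula can be checked numerically. The sketch below assumes a simple undirected graph given as an edge list; the names `modularity` and `merge_gain` and the example partition are hypothetical. It reads the community sizes in the formula as total degrees, that is, twice the internal edges plus the outgoing edges.

```python
def modularity(edges, community):
    """Q in its per-community form."""
    m = len(edges)
    q = 0.0
    for c in set(community):
        e_in = sum(1 for u, v in edges if community[u] == c and community[v] == c)
        # Total degree of the community: internal edges count twice, outgoing once.
        deg = sum(1 for u, v in edges for w in (u, v) if community[w] == c)
        q += e_in / m - (deg / (2 * m)) ** 2
    return q


def merge_gain(edges, community, ci, cj):
    """Increment of Q when the distinct communities ci and cj are merged."""
    m = len(edges)
    e_between = sum(1 for u, v in edges if {community[u], community[v]} == {ci, cj})
    d_i = sum(1 for u, v in edges for w in (u, v) if community[w] == ci)
    d_j = sum(1 for u, v in edges for w in (u, v) if community[w] == cj)
    return 2 * (e_between / (2 * m) - d_i * d_j / (4 * m * m))


# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
before = [0, 1, 2, 3, 3, 3]          # nodes 0, 1, 2 are singletons
after = [0, 0, 2, 3, 3, 3]           # communities 0 and 1 merged

gain = merge_gain(edges, before, 0, 1)
difference = modularity(edges, after) - modularity(edges, before)
print(gain, difference)
```

Both values equal $10/98$ here, so the closed-form increment matches the direct difference without recomputing $Q$ from scratch.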
Greedy Algorithm
input: a graph $G = (V, E)$;
$U_1 \leftarrow \{\{v\} \mid v \in V\}$;
for $k = 1$ to $n - 1$ do
    choose $C_i$ and $C_j$ from $U_k$ to maximize $\Delta_{C_i C_j} Q$, and set
    $U_{k+1} \leftarrow (U_k \setminus \{C_i, C_j\}) \cup \{C_i \cup C_j\}$;
$k^* \leftarrow \arg\max_{1 \le k \le n} Q(U_k)$;
output $U_{k^*}$
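A straightforward, unoptimized Python sketch of this greedy procedure is given below. It assumes a simple undirected graph with at least one edge and nodes numbered `0..n-1`; the function name `greedy_modularity` and the first-found tie-breaking are illustrative choices, not prescribed by the slides.

```python
from itertools import combinations


def greedy_modularity(n, edges):
    """Merge communities greedily and return the best partition seen and its Q."""
    m = len(edges)
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    def gain(a, b):
        # Increment of Q when communities a and b are merged.
        e_ab = sum(1 for u, v in edges
                   if (u in a and v in b) or (u in b and v in a))
        d_a = sum(degree[v] for v in a)
        d_b = sum(degree[v] for v in b)
        return e_ab / m - d_a * d_b / (2 * m * m)

    # U_1: every node is its own community; no internal edges yet.
    partition = [frozenset([v]) for v in range(n)]
    q = -sum(d * d for d in degree) / (4 * m * m)
    best_q, best_partition = q, list(partition)

    for _ in range(n - 1):
        # Pick the pair with the largest increment (first pair wins ties).
        a, b = max(combinations(partition, 2), key=lambda pair: gain(*pair))
        q += gain(a, b)
        partition = [c for c in partition if c not in (a, b)] + [a | b]
        if q > best_q:
            best_q, best_partition = q, list(partition)
    return best_partition, best_q


# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
best_partition, best_q = greedy_modularity(6, edges)
print(sorted(sorted(c) for c in best_partition), best_q)
```

On this example the best partition in the merge sequence is the pair of triangles, with $Q = 5/14$. Recomputing every pairwise gain in each round keeps the sketch short but is far slower than implementations that maintain the gains incrementally.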
Outline
• Modularity Function
• Greedy
• Spectral Method and MP
Qualified Cut
Given a graph $G = (V, E)$, find a subset $S$ of $V$ to maximize $Q(S, \bar{S})$, where $\bar{S} = V \setminus S$.
Community Partition
Apply the Qualified Cut to each part of the current partition until the value of $Q$ cannot be increased.
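Quadratic Form
For a partition into two groups, let
$$s_i = \begin{cases} 1 & \text{if } i \text{ is in group 1} \\ -1 & \text{if } i \text{ is in group 2} \end{cases}$$
so that $\delta_{C(i),C(j)} = \frac{1}{2}(s_i s_j + 1)$. Then
$$\begin{aligned}
Q &= \frac{1}{2|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) \delta_{C(i),C(j)} \\
  &= \frac{1}{4|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) (s_i s_j + 1) \\
  &= \frac{1}{4|E|} \sum_{i,j \in V} \left( a_{ij} - \frac{k_i k_j}{2|E|} \right) s_i s_j \\
  &= \frac{1}{4|E|} s^T B s
\end{aligned}$$
where $B = (B_{ij})$ with $B_{ij} = a_{ij} - \frac{k_i k_j}{2|E|}$. The third line uses $\sum_{i,j \in V} B_{ij} = 0$.
Spectral Method
$$Q = \frac{1}{4|E|} s^T B s$$
When $s$ is relaxed to a real vector of fixed length, $Q$ achieves its maximum when $s$ is parallel to the eigenvector of $B$ with the largest eigenvalue. The entries of $s$ are then set to the signs of the corresponding eigenvector entries.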
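A minimal sketch of one spectral bisection step follows. It writes $B_{ij} = a_{ij} - k_i k_j / (2|E|)$ and uses the identity $\delta_{C(i),C(j)} = (1 + s_i s_j)/2$, which gives $Q = s^T B s / (4|E|)$ for a two-group split. To stay free of external libraries it approximates the leading eigenvector by power iteration on a shifted matrix; that choice, the function name `spectral_bisection`, the starting vector, and the iteration count are all assumptions made for illustration.

```python
def spectral_bisection(adj, iterations=500):
    """Split the nodes by the signs of the leading eigenvector of B."""
    n = len(adj)
    degree = [sum(row) for row in adj]
    two_m = sum(degree)                                   # 2|E|
    B = [[adj[i][j] - degree[i] * degree[j] / two_m for j in range(n)]
         for i in range(n)]

    # Shift by a bound on the spectral radius so that B + shift*I has no
    # negative eigenvalues; power iteration then converges to the eigenvector
    # of the largest (not the largest-magnitude) eigenvalue of B.
    shift = max(sum(abs(x) for x in row) for row in B)

    x = [float(i + 1) for i in range(n)]                  # arbitrary start vector
    for _ in range(iterations):
        y = [sum(B[i][j] * x[j] for j in range(n)) + shift * x[i]
             for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]

    s = [1 if v >= 0 else -1 for v in x]                  # round to +/-1
    q = sum(B[i][j] * s[i] * s[j]
            for i in range(n) for j in range(n)) / (2 * two_m)
    return s, q


# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

s, q = spectral_bisection(adj)
groups = {frozenset(i for i in range(6) if s[i] == sign) for sign in (1, -1)}
print(s, q)
```

On this example the sign pattern separates the two triangles and the resulting value is $Q = 5/14$. In practice a numerical library eigen-solver would replace the power iteration.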
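Linear Program
$$\begin{aligned}
\max \quad & \frac{1}{2|E|} \sum_{i,j} B_{ij} (1 - x_{ij}) \\
\text{s.t.} \quad & x_{ik} \le x_{ij} + x_{jk} \quad \text{for all } i, j, k \\
& x_{ij} \in \{0, 1\} \quad \text{for all } i, j
\end{aligned}$$
where $B_{ij} = a_{ij} - \frac{k_i k_j}{2|E|}$ and
$$x_{ij} = \begin{cases} 0 & \text{if } i \text{ and } j \text{ are in the same community} \\ 1 & \text{if } i \text{ and } j \text{ are in different communities} \end{cases}$$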
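To illustrate how the 0/1 variables encode a partition, the sketch below builds $x$ from community labels, checks the triangle constraints, and evaluates the objective, which should coincide with $Q$. It does not solve the program. The names `objective` and `satisfies_triangle_constraints` and the example data are hypothetical.

```python
from itertools import product


def objective(adj, x):
    """(1 / 2|E|) * sum_ij B_ij * (1 - x_ij), with B_ij = a_ij - k_i k_j / 2|E|."""
    n = len(adj)
    degree = [sum(row) for row in adj]
    two_m = sum(degree)
    return sum((adj[i][j] - degree[i] * degree[j] / two_m) * (1 - x[i][j])
               for i in range(n) for j in range(n)) / two_m


def satisfies_triangle_constraints(x):
    """Check x_ik <= x_ij + x_jk for all i, j, k."""
    n = len(x)
    return all(x[i][k] <= x[i][j] + x[j][k]
               for i, j, k in product(range(n), repeat=3))


# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

labels = [0, 0, 0, 1, 1, 1]
x = [[0 if labels[i] == labels[j] else 1 for j in range(6)] for i in range(6)]
feasible = satisfies_triangle_constraints(x)
value = objective(adj, x)

# A 0/1 matrix that no partition induces: 0~1 and 1~2 are "same", 0~2 is "different".
bad = [[0 if i == j else 1 for j in range(6)] for i in range(6)]
bad[0][1] = bad[1][0] = 0
bad[1][2] = bad[2][1] = 0
bad_feasible = satisfies_triangle_constraints(bad)
print(feasible, value, bad_feasible)
```

The partition-induced matrix is feasible and its objective value is $5/14$, the modularity of that partition. The second matrix violates $x_{02} \le x_{01} + x_{12}$, which is how the triangle constraints enforce transitivity of "same community".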
Vector Program
$$\begin{aligned}
\max \quad & \frac{1}{4|E|} \sum_{i,j} B_{ij} (1 + s_i s_j) \\
\text{s.t.} \quad & s_i^2 = 1 \quad \text{for all } i
\end{aligned}$$
Semi-definite Program
Remark 1
How to evaluate the method
for finding a community?
Clustering
Community Detection
Remark 2
How to do overlapping community detection?
How to do hierarchical community detection?
Survey
• Introductory review: Communities in
networks by M. A. Porter, J.-P. Onnela, and P. J.
Mucha, Notices of the American Mathematical
Society 56, 1082 (2009)
• Comprehensive review: Community
detection in graphs by Santo Fortunato, Physics
Reports 486, 75 (2010)
THANK YOU!