Rich Block: Rich Club Analysis with the Erdos-Renyi Mixture Model

advertisement
Rich Block: Rich Club Analysis with the Erdos-Renyi Mixture Model
Soroosh Afyouni1, Dragana Pavlovic1, Emma Towlson2, Petra Vertes3, Theo Arvanitis1, Edward Bullmore3,4, and Thomas Nichols1
1University
of Warwick
2
Northeastern University
1. Introduction
3
4
University of Cambridge
GlaxoSmithKline
4. Evaluation and Validation
en
BI
d
y
tit
BI
d
0.5
0.5
tit
B
en
BI
d
R
R
N
BI
d
od
en
al
R
tit
B
al
R
od
y
0
y
0
0.8
0.3
0.2
Emprical RC
ERMM-RC (q=10)
ERMM-RC (q=20)
ERMM-RC (q=40)
ERMM-RC (q=70)
ERMM-RC (q=100)
0.1
0
0
10
20
30
40
50
60
70
80
90
100
de
Degree/Lambda
Figure3
5. Application: Resting State fMRI Connectivity in
Schizophrenia
Rich Block was used to detect the differenceresting fMRI connectivity between controls
(n=13) and schizophrenic subjects (n=12) (Lynall etal 2010). Wavelet correlation transformation (WCT) (scale 2) was used to form a network, after binarisation to give 10% edge
density, GL-SBM with patient/control, age and movement covariates found Q=31 latent
blocks.
Fig4 Illustrates degree distribution of estimated models for control and patient versus degree distribution of their null models, sorted by descending λ. Despite detectable difference
between degree distribution of estimated models and null models in earlier RBs, they significantly overlaps in further RB until they become identical in the last RB (degree distribution
of last RB (RB31) is degree distribution of whole network. For sake of visualisation, we just
showed 16 RBs below.)
RB 1
RB 2
0.3
RB 3
0.2
RB 4
0.2
0.1
0
0
5
10
RB 7
0.15
0
0
10
0.1
20
RB 8
0.15
0
RB 5
0.15
RB 6
0.15
0.1
0.1
0.05
0.05
0
0
0.1
0
10
20
RB 9
0.1
0
0
20
40
RB 10
0.1
0
50
RB 11
0.1
0
50
RB 12
0.1
0.1
0.05
0.05
0
0
0
50
100
RB 13
0.1
0
50
0.05
0.05
0.05
0.05
0
0
0
0
100
RB 14
0.1
0
50
100
RB 15
0.1
0
0.05
0.05
0.05
0
0
0
0
100
0
50
100
0
100
200
0
100
200
0
100
200
0
100
200
RB 16
0.1
0.05
50
Control RB
Control Null RB
Patient RB
Patient Null RB
14
12
BI
Fig 3 shows an example with the C.Elegans
connectome (N=279 nodes, E=2287 edges,
[Towlson et al, 2013]), with RC’s Φ with thick
black line, and RB’s Φ for various Q’s. There
is close agreement that improves with Q.
0.4
300
250
R
0.5
0.1
Simulated Network
nt
it
B
0.6
0.5
Matrix of Parameters
od
al
R
Figure2
N
Rich Club Coefficient
0.7
0.2
Simulated networks were formed from synthetic probability matrices, Πs (Fig1B). Regardless of network size, 25% of each probability matrix were assigned higher probabilites with
their intra-connections also inflated to ensure that Rich Club/Block will be formed. Twelve
different types of networks were used to conduct the simulations: Networks of size 60, 120
and 300 were selected. For each category, four Q numbers (6,15,30,60) were investigated.
All blocks assumed to have the same size. Three categories were discarded due to low (≥
4) ratio of node to block
Example for N=300, Q=15
y
0
1
3. Simulation Methods
Q= 60
1
0.5
Erdos Renyi Mixture Model (ERMM):
The ERMM decomposes a N × N binary P
network G(N, E) into Q latent blocks. Prevalence
of block ` ∈ {1, ..., Q} is α` , such that ` α` = 1. Edges occur between block q and `
as a Bernoulli random variable with parameter πq` . where Π = ((πql))1≤q,`≤Q . The degree
distribution is then approximately Poisson mixture with
parameter
λ.
Mixing
weights
αandλ
PQ
gives the expected degree of each block:λq = (n − 1) l=1 αlπql
Rich Block (RB):
We propose defining a new version of Rich Club in terms of the ERMM, describing how
a set of Erdos-Renyi blocks with expected degree of k or larger interact with each other.
Fundamental to RB is how the ERMM groups nodes with common a pattern of connectivity.
Hence, all nodes in a given block have the same fitted degree, and thus we treat these ’clubs’
as the basic units of the RC. The ERMM implies the statistical distribution of RB degree
(again, a Binomial mixture); with the GL-SBM, Π and α can be found for group data, not
just single subjects.
normalisation and inference of RB coefficients is done with a null model. In contrast to RC,
we can estimate null Π while holding fixed the node-to-block assignments fixed. This allows
computationally efficient sampling from the null, as the Binomial mixture can be directly
sampled from the fit to one random sample.
Q= 30
R
1
0.9
Generalised Linear Stochastic Block Model (GL-SBM):
Fitting the ERMM to multi subjects will produce different block structures in each subject.
In [Poster 3861 WT-PM] we introduce the GL-SBM that finds a common set of Q blocks,
allowing subject connectivity to differ as per covariates in a logistic regression.
Q= 15
R
en
en
BI
d
od
N
1
Non-normalised Rich Club of Hermaphrodite CElegans - 1k restars for each Q
1
Rich Club (RC):
Rich Club measures how densly nodes of degree k are connected to other nodes with degree
k or larger. Normalised Rich Club coefficient, Φk measures the density of club k divided by
the average density of a null network’s club k. Null samples also permit computation of Pvalues and selection of significant RCs, and are obtained by random but degree-preserving
rewiring (Maslove & Sneppen, 2002).
Q= 6
y
tit
B
al
R
tit
B
al
R
tit
en
BI
d
R
od
en
R
0
R
0
B
0
0.5
al
R
0.5
y
0.5
1
y
1
B
BI
d
od
N
BI
d
R
1
al
R
tit
B
al
R
tit
en
al
R
od
Results suggest that although distance between Nodal RB and Nodal RC remain respectively the same over different number of
nodes and blocks, RB/RC identity values become closer to each other as number of blocks
increase.
0
N
0
od
0
0.5
N
0.5
y
0.5
N= 300
1
y
1
N
• Node-wise RB/RC labelling (Identity). Node i is given a value
of 1 if it belongs to the largest RB/RC found significant at level
5%, 0 otherwise. We compare RB Identity to RC Identity with
Adjusted Rand Index (ARI).
2. Methods
1
B
• Node-wise RB/RC coefficient (Nodal RB/RC). The nodal coefficient for node i is the Φk of the smallest RC or RB containing
i. We compare NodalRB to NodalRC with R 2.
N= 120
N
• Forming clubs (block) that not only share the same degree but also share the same pattern
of connectivity.
We use two approaches to compare RB with
traditional RC:
od
• Provide an approach to multi-subject analysis of Rich Club.
N= 60
N
Rich Club (RC) analysis is one way to gain insight to the structure of the human connectome.
One impediment to general use of the RC is difficulty with multi-subject data. In other work
we have shown that stochastic block models like the Erdos-Renyi Mixture Model (ERMM)
are useful for capturing of biological relevant group structure,in single networks (Pavlovic
et al 2014), and in recent work we have extended this to find common network structure
in multiple subjects (General Linearised Stochastic Block Model), GL-SBM, (Pavlovic etal,
2015 - See poster OHBM2015 3861 WT-PM). Here we use the GL-SBM to develop a new
metric called Rich Block (RB), a stochastic block model implementation of RC, with two main
goals:
0
100
200
200
10
Figure 4
8
150
4
50
2
2
4
6
8
10
12
14
50
Monte Carlo Test
100
150
200
250
300
Kullback-Leibler Divergence
5
14
RB
Detected RB
4.5
12
4
10
Distance from Null Model
-log10 p-values
3.5
3
2.5
2
1.5
10
15
0
0
5
0
5
10
15
20
25
30
Blocks
60
40
20
0
0
5
10
15
Blocks
1.2
1
Blocks
Patient Expected Degrees
80
1.4
20
25
30
2
1.8
1.6
1.4
1.2
1
Blocks
Figure 6
10
15
Rich Blocks
Conclusions
References
Heuvel and Sporns. Journal of Neuroscience, 2011 — Lynall et al, Journal of Neuroscience, 2010 — Maslov and Sneppen, Science,
2002 — Pavlovic et al, PloS One, 2014 — Towlson et al, Journal of Neuroscience, 2013
S.Afyouni@Warwick.ac.uk
0
1.6
Fig6 shows block-wise RB coefficients, for patients and controls. A Monte Carlo (parametric Boostrap) test was used to compare Φ between populations. The blocks which are
significantly larger was assigned by a red circle at the top of it.
2
Rich Blocks
20
1.8
6
0.5
5
40
Figure 5
4
0
Rich Block
60
2
8
1
0
Control Expected Degrees
80
Control Block-wise RB
100
Patient Block-wise RB
6
Expected Degree Expected Degree
Fig5 shows detected RBs in red on unsorted expected degree bar plots for patients and
controls.
http://warwick.ac.uk/tenichols/ohbm
In this study we have proposed a new metric called Rich Block which is combination of
stochastic block model and empirical rich block. This approach simultaneously identifies a
simplified or ’compressed’ network structure while identifying the ’richly connected’ components of this structure, and facilitating group comparisons.
Download