Appendix - Proceedings of the Royal Society B

Degree Dependence in Rates of Transcription Factor Evolution Explains the
Unusual Structure of Transcription Networks – Supporting Material
Derivation of Equilibrium Degree Distributions
Alexander J Stewart^{1,2,3}, Robert M Seymour^{1,4} and Andrew Pomiankowski^{1,2}
1. CoMPLEX, UCL, Physics Building, Gower Street, London, WC1E 6BT, U.K.
2. The Galton Laboratory, Research Department of Genetics, Evolution and Environment, UCL, 4 Stephenson Way, London NW1 2HE, U.K.
3. Department of Mathematics, UCL, Gower Street, London, WC1E 6BT, U.K.
4. Correspondence should be addressed to alex.stewart@ucl.ac.uk
Model
Networks in the model consist of transcription factors (TFs) and target genes (TGs).
TFs regulate other genes and may be regulated by other TFs. TGs are only regulated,
and therefore have only incoming edges. The model includes four distinct types of
mutation – trans mutation, cis mutation, gene duplication and gene deletion. Each
type of mutation is associated with a rate. The mutation rate is the rate at which each
type of mutation becomes fixed in the network (not the rate at which new mutations
actually occur, which may differ considerably from the rate at which they are fixed).
The rates at which different mutations become fixed are referred to as the rates of
evolution. Rates of evolution may be the same for all genes (either TFs or TGs), or
else they may vary with the connectivity of a gene. Initially we consider constant rates
of evolution.
The network is updated in time increments $\delta t$. These increments are taken to be
sufficiently small that at most one mutation is fixed in the network within each
increment. We choose a time scale such that $\delta t = 1$. The rates of evolution in the
model are therefore the probabilities that a mutation will occur (and be fixed) in a
time interval. Mutations at trans and cis elements affect edges but do not affect nodes
in the network. Gene duplication and deletion affect both nodes and edges. When a
gene is duplicated the network is increased in size by one. When a gene is deleted, the
network is decreased in size by one. Rates of gene duplication and deletion are taken
to be equal. Therefore the expected size of the network remains constant. The actual
size of the network undergoes random variation about some mean size. In our
simulations we place upper and lower limits on the absolute size of the network. At
the upper boundary, gene duplication events are forbidden. At the lower boundary
gene deletions are forbidden. The rates of duplication and deletion may differ between
TFs and TGs.
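The size dynamics described above can be sketched as a bounded random walk. This is a minimal illustration only: the function name, the parameter values and the choice to treat `dup_rate` as a per-step probability for the whole network are assumptions made here, not details taken from the main text.

```python
import random

def simulate_network_size(n0, n_min, n_max, dup_rate, steps, seed=0):
    """Random walk of network size with equal duplication and deletion rates.

    Hypothetical helper: at most one event is fixed per time step.
    Duplication is forbidden at n_max and deletion is forbidden at n_min.
    dup_rate must not exceed 0.5 so the two events are mutually exclusive.
    """
    rng = random.Random(seed)
    n = n0
    sizes = [n]
    for _ in range(steps):
        u = rng.random()
        if u < dup_rate:              # a duplication is fixed this step
            if n < n_max:
                n += 1
        elif u < 2.0 * dup_rate:      # a deletion is fixed this step (same rate)
            if n > n_min:
                n -= 1
        sizes.append(n)
    return sizes

sizes = simulate_network_size(n0=50, n_min=40, n_max=60, dup_rate=0.3, steps=5000, seed=1)
```

Because the two rates are equal, the walk has no drift away from the boundaries, which is the sense in which the expected size stays constant.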
We calculate the expected equilibrium degree distributions for the in- and out- degree
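of networks. The evolution of the in-degree distribution, $n_{\mathrm{in}}(k,t)$, in the mean field
approximation is given by Eq. 4 of the main text:
\[
\begin{aligned}
\frac{\partial n_{\mathrm{in}}(k,t)}{\partial t} ={}& \left[\lambda^{\mathrm{in}}_{\mathrm{TG}+}(k-1) + \lambda^{\mathrm{in}}_{\mathrm{TF}+}(k-1)\right] n_{\mathrm{in}}(k-1,t) \\
&+ \left[\lambda^{\mathrm{in}}_{\mathrm{TG}-}(k+1) + \lambda^{\mathrm{in}}_{\mathrm{TF}-}(k+1)\right] n_{\mathrm{in}}(k+1,t) \\
&- \left[\lambda^{\mathrm{in}}_{\mathrm{TG}+}(k) + \lambda^{\mathrm{in}}_{\mathrm{TF}+}(k) + \lambda^{\mathrm{in}}_{\mathrm{TG}-}(k) + \lambda^{\mathrm{in}}_{\mathrm{TF}-}(k)\right] n_{\mathrm{in}}(k,t),
\end{aligned}
\tag{1A}
\]
where $\lambda^{\mathrm{in}}_{\mathrm{TG}+}(k)$ and $\lambda^{\mathrm{in}}_{\mathrm{TG}-}(k)$ are the probabilities of a gene with in-degree k gaining or
losing an edge through a mutation at the gene, and $\lambda^{\mathrm{in}}_{\mathrm{TF}+}(k)$ and $\lambda^{\mathrm{in}}_{\mathrm{TF}-}(k)$ are the
probabilities of a gene with in-degree k gaining or losing an edge through trans
mutation, duplication or deletion at a TF in the time interval $\delta t$. The time evolution of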
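the out-degree distribution $n_{\mathrm{out}}(k,t)$ is given by Eq. 5 of the main text:
\[
\begin{aligned}
\frac{\partial n_{\mathrm{out}}(k,t)}{\partial t} ={}& \left[\lambda^{\mathrm{out}}_{\mathrm{TG}+}(k-1) + \lambda^{\mathrm{out}}_{\mathrm{TF}+}(k-1)\right] n_{\mathrm{out}}(k-1,t) + \lambda^{\mathrm{out}}_{\mathrm{TG}-}(k+1)\, n_{\mathrm{out}}(k+1,t) \\
&- \left[\lambda^{\mathrm{out}}_{\mathrm{TG}+}(k) + \lambda^{\mathrm{out}}_{\mathrm{TG}-}(k) + \lambda^{\mathrm{out}}_{\mathrm{TF}+}(k)\right] n_{\mathrm{out}}(k,t) \\
&+ \sum_{j=k}^{N} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(j,k)\, n_{\mathrm{out}}(j,t) - \sum_{j=0}^{k} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(k,j)\, n_{\mathrm{out}}(k,t),
\end{aligned}
\tag{2A}
\]
where N is the total number of genes (TFs and TGs) in the network, $\lambda^{\mathrm{out}}_{\mathrm{TG}+}(k)$ and
$\lambda^{\mathrm{out}}_{\mathrm{TG}-}(k)$ are the probabilities of a gene with out-degree k gaining or losing an edge
through mutation at one of its targets, $\lambda^{\mathrm{out}}_{\mathrm{TF}+}(k)$ is the probability that a TF with
out-degree k gains a target through mutation at the TF, and $\lambda^{\mathrm{out}}_{\mathrm{TF}-}(j,k)$ is the probability
that a TF with out-degree $j \ge k$ loses interactions to become a TF with out-degree k
due to mutation at the TF.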
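Setting the left hand side of Eq. 1A to zero, this has equilibrium solution satisfying
\[
\left[\lambda^{\mathrm{in}}_{\mathrm{TF}-}(k+1) + \lambda^{\mathrm{in}}_{\mathrm{TG}-}(k+1)\right] n_{\mathrm{in}}(k+1) = \left[\lambda^{\mathrm{in}}_{\mathrm{TF}+}(k) + \lambda^{\mathrm{in}}_{\mathrm{TG}+}(k)\right] n_{\mathrm{in}}(k).
\tag{3A}
\]
To solve Eq. 2A we make an approximation for the term
\[
\sum_{j=k}^{N} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(j,k)\, n_{\mathrm{out}}(j,t) - \sum_{j=0}^{k} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(k,j)\, n_{\mathrm{out}}(k,t),
\]
which describes loss of interactions through trans evolution. For the model without
degree dependence in the rate of trans evolution (Model 1 and Model 3 in the main
text), $\lambda^{\mathrm{out}}_{\mathrm{TF}-}(j,k)$ is given by
\[
\lambda^{\mathrm{out}}_{\mathrm{TF}-}(j,k) = \mu_{\mathrm{trans}} \frac{j!}{k!\,(j-k)!}\, m^{j-k} (1-m)^{k}.
\tag{4A}
\]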
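Eq. 4A is a binomial kernel scaled by the trans rate: each of the j existing interactions is lost independently with probability m. A quick numerical check, with illustrative parameter values and a hypothetical function name, confirms that summing the kernel over all final out-degrees returns the trans rate itself, which is the identity used later in the derivation.

```python
from math import comb

def trans_loss_kernel(j, k, m, mu_trans):
    """Eq. 4A: probability that a TF with out-degree j ends with out-degree k.

    Each of the j edges is lost independently with probability m.
    """
    return mu_trans * comb(j, k) * m ** (j - k) * (1.0 - m) ** k

# Summing over every possible final out-degree k = 0..j gives mu_trans.
total = sum(trans_loss_kernel(12, k, 0.3, 0.05) for k in range(13))
```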
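By assuming a solution of the form $n_{\mathrm{out}}(k) = A_{\mathrm{out}} k^{-\gamma}$, where $A_{\mathrm{out}}$ is a normalization
constant for the out-degree distribution, we can use the approximation
\[
\begin{aligned}
\sum_{j=k}^{N} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(j,k)\, n_{\mathrm{out}}(j) - \sum_{j=0}^{k} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(k,j)\, n_{\mathrm{out}}(k)
&= \mu_{\mathrm{trans}}\, n_{\mathrm{out}}(k)(1-m)^{\gamma-1} - \mu_{\mathrm{trans}}\, n_{\mathrm{out}}(k) + O\!\left(k^{-(1+\gamma)}\right) \\
&= \frac{\mu_{\mathrm{trans}}}{2(\gamma-1)} \left[1 - (1-m)^{\gamma-1}\right] \left[(k+1)\, n_{\mathrm{out}}(k+1) - (k-1)\, n_{\mathrm{out}}(k-1)\right] + O\!\left(k^{-(1+\gamma)}\right).
\end{aligned}
\tag{5A}
\]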
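Observe that, when $\gamma = 1$, the right hand side of Eq. 5A is zero. To derive this, first
note from Eq. 4A that $\sum_{j=0}^{k} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(k,j)\, n_{\mathrm{out}}(k) = \mu_{\mathrm{trans}}\, n_{\mathrm{out}}(k)$. We use Lemma 2 of Chung
et al. (2003) to show that $\sum_{j=k}^{N} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(j,k)\, n_{\mathrm{out}}(j) = \mu_{\mathrm{trans}}\, n_{\mathrm{out}}(k)(1-m)^{\gamma-1} + O\!\left(k^{-(1+\gamma)}\right)$. To
see this, we have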
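\[
\begin{aligned}
\sum_{j=k}^{N} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(j,k)\, n_{\mathrm{out}}(j)
&= A_{\mathrm{out}}\, \mu_{\mathrm{trans}} \sum_{j=k}^{N} \binom{j}{j-k} m^{j-k} (1-m)^{k} j^{-\gamma} \\
&= A_{\mathrm{out}}\, \mu_{\mathrm{trans}}\, k^{-\gamma} (1-m)^{k} \sum_{j=k}^{N} \binom{j}{j-k} m^{j-k} \left(\frac{j}{k}\right)^{-\gamma} \\
&= A_{\mathrm{out}}\, \mu_{\mathrm{trans}}\, k^{-\gamma} (1-m)^{k} \left(1 + O(k^{-1})\right) \sum_{j=k}^{N} \binom{j-\gamma}{j-k} m^{j-k} \\
&= A_{\mathrm{out}}\, \mu_{\mathrm{trans}}\, k^{-\gamma} (1-m)^{k} \left(1 + O(k^{-1})\right) \sum_{i=0}^{N-k} \binom{i+k-\gamma}{i} m^{i} \\
&= A_{\mathrm{out}}\, \mu_{\mathrm{trans}}\, k^{-\gamma} (1-m)^{k} \left(1 + O(k^{-1})\right) (1-m)^{\gamma-k-1} \\
&= \mu_{\mathrm{trans}}\, n_{\mathrm{out}}(k) (1-m)^{\gamma-1} \left(1 + O(k^{-1})\right),
\end{aligned}
\tag{6A}
\]
valid for $N \gg k$ (in fact, exactly valid in the limit $N \to \infty$). We now have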
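\[
\sum_{j=k}^{N} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(j,k)\, n_{\mathrm{out}}(j) - \sum_{j=0}^{k} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(k,j)\, n_{\mathrm{out}}(k) = \mu_{\mathrm{trans}}\, n_{\mathrm{out}}(k)(1-m)^{\gamma-1} - \mu_{\mathrm{trans}}\, n_{\mathrm{out}}(k) + O\!\left(k^{-(1+\gamma)}\right).
\tag{7A}
\]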
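The leading-order result for the gain sum can be checked numerically. The sketch below drops the common factors (the trans rate and the normalization constant) and compares the truncated sum over $j \ge k$ of $\binom{j}{k} m^{j-k}(1-m)^{k} j^{-\gamma}$ with $k^{-\gamma}(1-m)^{\gamma-1}$. Function names, the truncation length and the parameter values are illustrative assumptions.

```python
from math import lgamma, log, exp

def gain_sum(k, gamma, m, n_terms=5000):
    """Sum over j >= k of C(j,k) m^(j-k) (1-m)^k j^(-gamma), truncated after n_terms."""
    total = 0.0
    for i in range(n_terms):
        j = k + i
        # Work in log space so the binomial coefficient cannot overflow.
        log_term = (lgamma(j + 1) - lgamma(k + 1) - lgamma(i + 1)
                    + i * log(m) + k * log(1.0 - m) - gamma * log(j))
        total += exp(log_term)
    return total

def gain_sum_approx(k, gamma, m):
    """Leading-order form: k^(-gamma) (1-m)^(gamma-1)."""
    return k ** (-gamma) * (1.0 - m) ** (gamma - 1.0)

exact = gain_sum(200, 2.0, 0.3)
approx = gain_sum_approx(200, 2.0, 0.3)
```

For an exponent of exactly 1 the two expressions agree without any correction term, which matches the observation that the loss term vanishes in that case.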
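Using our assumed solution form, we can also write
\[
\left[(k+1)\, n_{\mathrm{out}}(k+1) - k\, n_{\mathrm{out}}(k)\right] + \left[k\, n_{\mathrm{out}}(k) - (k-1)\, n_{\mathrm{out}}(k-1)\right] = 2(1-\gamma)\, n_{\mathrm{out}}(k) + O\!\left(k^{-(\gamma+1)}\right).
\tag{8A}
\]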
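Eq. 7A and Eq. 8A combine to give Eq. 5A. For large k we can neglect terms
$O\!\left(k^{-(1+\gamma)}\right)$ and define
\[
K(\gamma, m) = \frac{1}{2(\gamma-1)} \left[1 - (1-m)^{\gamma-1}\right].
\]
This allows us to write
\[
\sum_{j=k}^{N} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(j,k)\, n_{\mathrm{out}}(j) - \sum_{j=0}^{k} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(k,j)\, n_{\mathrm{out}}(k) \approx \mu_{\mathrm{trans}}\, K(\gamma, m) \left[(k+1)\, n_{\mathrm{out}}(k+1) - (k-1)\, n_{\mathrm{out}}(k-1)\right],
\tag{9A}
\]
which is the form used in Eqs. 7 of the main text.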
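A short sketch of the two ingredients combined here, assuming the definition $K(\gamma,m) = [1-(1-m)^{\gamma-1}]/[2(\gamma-1)]$ and a pure power law with the normalization constant set to 1. The function names are hypothetical, and the limiting value used when the exponent equals 1 is the analytic limit of the same expression.

```python
from math import log

def K(gamma, m):
    """K(gamma, m) = [1 - (1-m)^(gamma-1)] / (2 (gamma-1)), with its gamma -> 1 limit."""
    if abs(gamma - 1.0) < 1e-9:
        return -0.5 * log(1.0 - m)   # analytic limit as gamma -> 1
    return (1.0 - (1.0 - m) ** (gamma - 1.0)) / (2.0 * (gamma - 1.0))

def centred_difference(k, gamma):
    """(k+1) n(k+1) - (k-1) n(k-1) for the power law n(k) = k^(-gamma)."""
    def n(x):
        return x ** (-gamma)
    return (k + 1) * n(k + 1) - (k - 1) * n(k - 1)

# Eq. 8A predicts centred_difference(k, gamma) ~ 2 (1 - gamma) k^(-gamma) for large k.
cd = centred_difference(100, 2.0)
```

With this definition K is positive on both sides of an exponent of 1, since the numerator and denominator change sign together.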
For the model with degree dependence in the rate of trans evolution (Model 2 and
Model 4 in the main text), $\lambda^{\mathrm{out}}_{\mathrm{TF}-}(j,k)$ is given by
\[
\lambda^{\mathrm{out}}_{\mathrm{TF}-}(j,k) = \frac{\mu_{\mathrm{trans}}}{j}\, \frac{j!}{k!\,(j-k)!}\, m^{j-k} (1-m)^{k}.
\tag{10A}
\]
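Once again assuming a solution of the form $n_{\mathrm{out}}(k) = A_{\mathrm{out}} k^{-\gamma}$, we can use the
approximation
\[
\begin{aligned}
\sum_{j=k}^{N} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(j,k)\, n_{\mathrm{out}}(j) - \sum_{j=0}^{k} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(k,j)\, n_{\mathrm{out}}(k)
&= \mu_{\mathrm{trans}} \frac{n_{\mathrm{out}}(k)}{k} (1-m)^{\gamma} - \mu_{\mathrm{trans}} \frac{n_{\mathrm{out}}(k)}{k} + O\!\left(k^{-(2+\gamma)}\right) \\
&= \frac{\mu_{\mathrm{trans}}}{2\gamma} \left[1 - (1-m)^{\gamma}\right] \left[n_{\mathrm{out}}(k+1) - n_{\mathrm{out}}(k-1)\right] + O\!\left(k^{-(2+\gamma)}\right).
\end{aligned}
\tag{11A}
\]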
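To derive this, first note that in this case $\sum_{j=0}^{k} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(k,j)\, n_{\mathrm{out}}(k) = \mu_{\mathrm{trans}} \frac{n_{\mathrm{out}}(k)}{k}$. Again we
use Lemma 2 of (Chung et al. 2003) to show that
$\sum_{j=k}^{N} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(j,k)\, n_{\mathrm{out}}(j) = \mu_{\mathrm{trans}} \frac{n_{\mathrm{out}}(k)}{k} (1-m)^{\gamma} \left(1 + O(k^{-1})\right)$. To see this, we have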
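\[
\begin{aligned}
\sum_{j=k}^{N} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(j,k)\, n_{\mathrm{out}}(j)
&= A_{\mathrm{out}}\, \mu_{\mathrm{trans}} \sum_{j=k}^{N} \binom{j}{j-k} m^{j-k} (1-m)^{k} j^{-(1+\gamma)} \\
&= A_{\mathrm{out}}\, \mu_{\mathrm{trans}}\, k^{-(1+\gamma)} (1-m)^{k} \sum_{j=k}^{N} \binom{j}{j-k} m^{j-k} \left(\frac{j}{k}\right)^{-(1+\gamma)} \\
&= A_{\mathrm{out}}\, \mu_{\mathrm{trans}}\, k^{-(1+\gamma)} (1-m)^{k} \left(1 + O(k^{-1})\right) \sum_{j=k}^{N} \binom{j-1-\gamma}{j-k} m^{j-k} \\
&= A_{\mathrm{out}}\, \mu_{\mathrm{trans}}\, k^{-(1+\gamma)} (1-m)^{k} \left(1 + O(k^{-1})\right) \sum_{i=0}^{N-k} \binom{i+k-\gamma-1}{i} m^{i} \\
&= A_{\mathrm{out}}\, \mu_{\mathrm{trans}}\, k^{-(1+\gamma)} (1-m)^{k} \left(1 + O(k^{-1})\right) (1-m)^{\gamma-k} \\
&= \mu_{\mathrm{trans}} \frac{n_{\mathrm{out}}(k)}{k} (1-m)^{\gamma} \left(1 + O(k^{-1})\right),
\end{aligned}
\tag{12A}
\]
valid for $N \gg k$ (in fact, exactly valid in the limit $N \to \infty$). We now have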
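\[
\sum_{j=k}^{N} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(j,k)\, n_{\mathrm{out}}(j) - \sum_{j=0}^{k} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(k,j)\, n_{\mathrm{out}}(k) = \mu_{\mathrm{trans}} \frac{n_{\mathrm{out}}(k)}{k} (1-m)^{\gamma} - \mu_{\mathrm{trans}} \frac{n_{\mathrm{out}}(k)}{k} + O\!\left(k^{-(2+\gamma)}\right).
\tag{13A}
\]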
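Using our assumed solution form, we can also write
\[
\left[n_{\mathrm{out}}(k+1) - n_{\mathrm{out}}(k)\right] + \left[n_{\mathrm{out}}(k) - n_{\mathrm{out}}(k-1)\right] = -2\gamma \frac{n_{\mathrm{out}}(k)}{k} + O\!\left(k^{-(2+\gamma)}\right).
\tag{14A}
\]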
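Eq. 13A and Eq. 14A combine to give Eq. 11A. For large k we can neglect terms
$O\!\left(k^{-(2+\gamma)}\right)$ and higher, and define
\[
K(\gamma, m) = \frac{1}{2\gamma} \left[1 - (1-m)^{\gamma}\right].
\]
This allows us to write
\[
\sum_{j=k}^{N} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(j,k)\, n_{\mathrm{out}}(j) - \sum_{j=0}^{k} \lambda^{\mathrm{out}}_{\mathrm{TF}-}(k,j)\, n_{\mathrm{out}}(k) \approx \mu_{\mathrm{trans}}\, K(\gamma, m) \left[n_{\mathrm{out}}(k+1) - n_{\mathrm{out}}(k-1)\right],
\tag{15A}
\]
which is the form used in Eqs. 7 of the main text.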
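Using Eq. 9A, the equilibrium solution to Eq. 2A for the out-degree distribution satisfies
\[
\left[\lambda^{\mathrm{out}}_{\mathrm{TG}-}(k+1) + (k+1)\, \mu_{\mathrm{trans}}\, K(\gamma, m)\right] n_{\mathrm{out}}(k+1) = \left[\lambda^{\mathrm{out}}_{\mathrm{TG}+}(k) + \lambda^{\mathrm{out}}_{\mathrm{TF}+}(k) - k\, \mu_{\mathrm{trans}}\, K(\gamma, m)\right] n_{\mathrm{out}}(k),
\tag{16A}
\]
for the model excluding degree dependence in the rate of trans evolution. And using
Eq. 15A gives
\[
\left[\lambda^{\mathrm{out}}_{\mathrm{TG}-}(k+1) + \mu_{\mathrm{trans}}\, K(\gamma, m)\right] n_{\mathrm{out}}(k+1) = \left[\lambda^{\mathrm{out}}_{\mathrm{TG}+}(k) + \lambda^{\mathrm{out}}_{\mathrm{TF}+}(k) - \mu_{\mathrm{trans}}\, K(\gamma, m)\right] n_{\mathrm{out}}(k),
\tag{17A}
\]
for the model including degree dependence in the rate of trans evolution.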
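Equilibrium conditions of this kind, in which an effective loss rate at degree k+1 balances an effective gain rate at degree k, can be solved by a one-step recursion. The sketch below is generic: `gain` and `loss` are hypothetical callables standing in for whichever bracketed rate expressions apply, and the constant-gain, linear-loss example is chosen only because its answer (a Poisson distribution) is known.

```python
def equilibrium_from_rates(gain, loss, k_max):
    """Solve loss(k+1) * n(k+1) = gain(k) * n(k) for k = 0..k_max-1, then normalise.

    gain(k): effective rate of moving from degree k to k+1.
    loss(k): effective rate of moving from degree k to k-1.
    """
    n = [1.0]
    for k in range(k_max):
        n.append(n[-1] * gain(k) / loss(k + 1))
    total = sum(n)
    return [x / total for x in n]

# Illustrative check: constant gain 2 and loss proportional to k give Poisson(2).
n = equilibrium_from_rates(lambda k: 2.0, lambda k: float(k), 60)
```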
Solution to Model 1 – No Connectivity Dependence
The solution to Eq. 3A for the in-degree distribution, using the incoming edge event
probabilities from Table 2 (main text), gives an equilibrium degree distribution
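\[
n_{\mathrm{in}}(k) = A_{\mathrm{in}}\, \frac{\Gamma\!\left(k + \frac{\mu_{\mathrm{cis}} + \mu_{\mathrm{trans}}}{D}\right)}{\Gamma(1+k)} \left(\frac{D}{D + \mu_{\mathrm{cis}} + m\,\mu_{\mathrm{trans}}}\right)^{k},
\tag{18A}
\]
where $A_{\mathrm{in}}$ is a normalization constant. Following (Chung et al. 2003) we can write
\[
\frac{\Gamma(x+c)}{\Gamma(x)} = \left(1 + O\!\left(\tfrac{1}{x}\right)\right) x^{c}.
\tag{19A}
\]
For large x, the terms $O\!\left(\tfrac{1}{x}\right)$ can be neglected, and hence Eq. 18A gives
$n_{\mathrm{in}}(k) \approx k^{-\gamma} e^{-\alpha k}$, where
\[
\alpha = \ln\!\left(1 + \frac{\mu_{\mathrm{cis}} + m\,\mu_{\mathrm{trans}}}{D}\right)
\quad\text{and}\quad
\gamma = 1 - \frac{\mu_{\mathrm{cis}} + \mu_{\mathrm{trans}}}{D}.
\tag{20A}
\]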
This is Eq. 8 of the main text.
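The quality of the large-k approximation can be checked directly. The sketch assumes the exact solution has the form of a gamma-function ratio times a geometric factor, $\Gamma(k+a)/\Gamma(k+1)\cdot r^{k}$ with $a = (\mu_{cis}+\mu_{trans})/D$ and $r = D/(D+\mu_{cis}+m\mu_{trans})$, and compares its logarithm with that of $k^{-\gamma}e^{-\alpha k}$. Function names and parameter values are illustrative.

```python
from math import lgamma, log

def log_n_in_exact(k, mu_cis, mu_trans, m, D):
    """Log of the unnormalised in-degree solution (gamma-function form)."""
    a = (mu_cis + mu_trans) / D
    r = D / (D + mu_cis + m * mu_trans)
    return lgamma(k + a) - lgamma(k + 1) + k * log(r)

def alpha_gamma(mu_cis, mu_trans, m, D):
    """Exponential cut-off alpha and power-law exponent gamma of the approximation."""
    alpha = log(1.0 + (mu_cis + m * mu_trans) / D)
    gamma = 1.0 - (mu_cis + mu_trans) / D
    return alpha, gamma

def log_n_in_approx(k, alpha, gamma):
    """Log of k^(-gamma) exp(-alpha k)."""
    return -gamma * log(k) - alpha * k

alpha, gamma = alpha_gamma(0.1, 0.2, 0.5, 1.0)
# The two logs should differ by a k-independent constant once k is large.
d500 = log_n_in_exact(500, 0.1, 0.2, 0.5, 1.0) - log_n_in_approx(500, alpha, gamma)
d1000 = log_n_in_exact(1000, 0.1, 0.2, 0.5, 1.0) - log_n_in_approx(1000, alpha, gamma)
```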
The solution to Eq. 16A for the out-degree distribution, using the outgoing edge event
probabilities from Table 2 (main text), gives an equilibrium degree distribution
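\[
n_{\mathrm{out}}(k) = A_{\mathrm{out}}\, \frac{\Gamma\!\left(k + \frac{\mu_{\mathrm{cis}} + \mu_{\mathrm{trans}}}{D - \mu_{\mathrm{trans}} K(\gamma, m)}\right)}{\Gamma(1+k)} \left(\frac{D - \mu_{\mathrm{trans}} K(\gamma, m)}{D + \mu_{\mathrm{cis}} + \mu_{\mathrm{trans}} K(\gamma, m)}\right)^{k}.
\tag{21A}
\]
Using Eq. 19A, this can be approximated to
$n_{\mathrm{out}}(k) \approx k^{-\gamma} e^{-\alpha k}$, where
\[
\alpha = \ln\!\left(\frac{\mu_{\mathrm{cis}} + D + \mu_{\mathrm{trans}} K(\gamma, m)}{D - \mu_{\mathrm{trans}} K(\gamma, m)}\right)
\quad\text{and}\quad
\gamma = 1 - \frac{\mu_{\mathrm{cis}} + \mu_{\mathrm{trans}}}{D - \mu_{\mathrm{trans}} K(\gamma, m)}.
\tag{22A}
\]
This is Eq. 9 of the main text. This result is only consistent with the assumption that
$n_{\mathrm{out}}(k) = A_{\mathrm{out}} k^{-\gamma}$ when $\alpha = 0$.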
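Because the exponent appears on both sides of its own definition through K, it has to be found self-consistently. The sketch below assumes the relations read $\gamma = 1 - (\mu_{cis}+\mu_{trans})/(D-\mu_{trans}K)$ and $\alpha = \ln[(\mu_{cis}+D+\mu_{trans}K)/(D-\mu_{trans}K)]$ with $K = [1-(1-m)^{\gamma-1}]/[2(\gamma-1)]$, and solves for the exponent by bisection. The function names, the bracketing interval and the parameter values are assumptions for illustration.

```python
from math import log

def K(gamma, m):
    """K(gamma, m) = [1 - (1-m)^(gamma-1)] / (2 (gamma-1)), with its gamma -> 1 limit."""
    if abs(gamma - 1.0) < 1e-9:
        return -0.5 * log(1.0 - m)
    return (1.0 - (1.0 - m) ** (gamma - 1.0)) / (2.0 * (gamma - 1.0))

def residual(gamma, mu_cis, mu_trans, m, D):
    """Zero when gamma = 1 - (mu_cis + mu_trans) / (D - mu_trans K(gamma, m))."""
    return gamma - 1.0 + (mu_cis + mu_trans) / (D - mu_trans * K(gamma, m))

def solve_gamma(mu_cis, mu_trans, m, D, lo=0.0, hi=1.0, iters=80):
    """Bisection; assumes the residual changes sign between lo and hi."""
    f_lo = residual(lo, mu_cis, mu_trans, m, D)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = residual(mid, mu_cis, mu_trans, m, D)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def alpha_out(gamma, mu_cis, mu_trans, m, D):
    """Exponential cut-off implied by the self-consistent gamma."""
    k_term = mu_trans * K(gamma, m)
    return log((mu_cis + D + k_term) / (D - k_term))

g = solve_gamma(0.1, 0.2, 0.5, 1.0)
a = alpha_out(g, 0.1, 0.2, 0.5, 1.0)
```

For these illustrative rates the cut-off is clearly non-zero, so the pure power-law assumption is only approximately met; it shrinks as the cis and trans rates become small relative to D.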
Solution to Model 2 – Degree Dependence in the Rate of trans Evolution
The solution to Eq. 3A for the in-degree distribution, using the incoming edge event
probabilities from Table 2 (main text), gives an equilibrium degree distribution
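\[
n_{\mathrm{in}}(k) = A_{\mathrm{in}}\, \frac{\Gamma\!\left(k + \frac{\mu_{\mathrm{cis}} + \mu_{\mathrm{trans}}}{D}\right)}{\Gamma(1+k)} \left(\frac{D}{D + \mu_{\mathrm{cis}} + m \left\langle \frac{1}{k} \right\rangle \mu_{\mathrm{trans}}}\right)^{k},
\tag{23A}
\]
where $\left\langle \frac{1}{k} \right\rangle = \sum_{j=1}^{N} \frac{n_{\mathrm{out}}(j)}{j}$ determines the mean rate of trans evolution across the network.
Using Eq. 19A this is approximately
$n_{\mathrm{in}}(k) \approx k^{-\gamma} e^{-\alpha k}$, where
\[
\alpha = \ln\!\left(1 + \frac{\mu_{\mathrm{cis}} + m \left\langle \frac{1}{k} \right\rangle \mu_{\mathrm{trans}}}{D}\right)
\quad\text{and}\quad
\gamma = 1 - \frac{\mu_{\mathrm{cis}} + \mu_{\mathrm{trans}}}{D}.
\tag{24A}
\]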
This is Eq. 10 of the main text.
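The network-wide average of the inverse out-degree is a plain weighted sum. A minimal sketch, assuming the out-degree distribution is supplied as a normalised mapping from out-degree to the fraction of TFs with that out-degree (a hypothetical data layout chosen for illustration):

```python
def mean_inverse_out_degree(n_out):
    """<1/k> = sum over j >= 1 of n_out(j) / j.

    n_out: dict {out-degree j: fraction of TFs with out-degree j}.
    Entries with j = 0 are skipped, matching a sum that starts at j = 1.
    """
    return sum(p / j for j, p in n_out.items() if j >= 1)

example = mean_inverse_out_degree({1: 0.5, 2: 0.5})
```

A network dominated by high out-degree TFs has a small average and hence a slow mean rate of trans evolution under this model.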
The solution to Eq. 17A for the out-degree distribution, using the outgoing edge event
probabilities from Table 2 (main text), gives an equilibrium degree distribution
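\[
n_{\mathrm{out}}(k) = A_{\mathrm{out}}\, \frac{\Gamma\!\left(k + \frac{\mu_{\mathrm{cis}} + \mu_{\mathrm{trans}} - \mu_{\mathrm{trans}} K(\gamma, m)}{D}\right)}{\Gamma\!\left(1 + \frac{\mu_{\mathrm{trans}} K(\gamma, m)}{D + \mu_{\mathrm{cis}}} + k\right)} \left(\frac{D}{D + \mu_{\mathrm{cis}}}\right)^{k}.
\tag{25A}
\]
Using Eq. 19A, this can be approximated to
$n_{\mathrm{out}}(k) \approx k^{-\gamma} e^{-\alpha k}$, where
\[
\alpha = \ln\!\left(1 + \frac{\mu_{\mathrm{cis}}}{D}\right),
\quad\text{and}\quad
D(\gamma - 1) + \mu_{\mathrm{cis}} + \mu_{\mathrm{trans}} - \mu_{\mathrm{trans}} K(\gamma, m) \left(1 + \frac{D}{D + \mu_{\mathrm{cis}}}\right) = 0.
\tag{22A}
\]
This is Eq. 11 of the main text. This result is only consistent with the assumption that
$n_{\mathrm{out}}(k) = A_{\mathrm{out}} k^{-\gamma}$ when $\alpha = 0$.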
Solution to Model 3 – Preferential Attachment
The solution to Eq. 3A for the in-degree distribution, using the incoming edge event
probabilities from Table 2 (main text), gives an equilibrium degree distribution
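\[
n_{\mathrm{in}}(k) = A_{\mathrm{in}}\, \frac{\Gamma\!\left(k + \frac{\mu_{\mathrm{cis}} + \mu^{R}_{\mathrm{trans}}}{D + \mu^{P}_{\mathrm{trans}}}\right)}{\Gamma(1+k)} \left(\frac{D + \mu^{P}_{\mathrm{trans}}}{D + \mu_{\mathrm{cis}} + m\,\mu_{\mathrm{trans}}}\right)^{k}.
\tag{23A}
\]
Using Eq. 19A this is approximately
$n_{\mathrm{in}}(k) \approx k^{-\gamma} e^{-\alpha k}$, where
\[
\alpha = \ln\!\left(1 + \frac{\mu_{\mathrm{cis}} + m\,\mu_{\mathrm{trans}}}{D + \mu^{P}_{\mathrm{trans}}}\right)
\quad\text{and}\quad
\gamma = 1 - \frac{\mu_{\mathrm{cis}} + \mu^{R}_{\mathrm{trans}}}{D + \mu^{P}_{\mathrm{trans}}}.
\tag{24A}
\]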
This is Eq. 12 of the main text.
The solution to Eq. 16A for the out-degree distribution, using the outgoing edge event
probabilities from Table 2 (main text), gives an equilibrium degree distribution
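\[
n_{\mathrm{out}}(k) = A_{\mathrm{out}}\, \frac{\Gamma\!\left(k + \frac{\mu^{R}_{\mathrm{cis}} + \mu_{\mathrm{trans}}}{D + \mu^{P}_{\mathrm{cis}} - \mu_{\mathrm{trans}} K(\gamma, m)}\right)}{\Gamma(1+k)} \left(\frac{D + \mu^{P}_{\mathrm{cis}} - \mu_{\mathrm{trans}} K(\gamma, m)}{D + \mu_{\mathrm{cis}} + \mu_{\mathrm{trans}} K(\gamma, m)}\right)^{k}.
\tag{25A}
\]
Using Eq. 19A, this can be approximated to
$n_{\mathrm{out}}(k) \approx k^{-\gamma} e^{-\alpha k}$, where
\[
\alpha = \ln\!\left(\frac{\mu_{\mathrm{cis}} + D + \mu_{\mathrm{trans}} K(\gamma, m)}{D + \mu^{P}_{\mathrm{cis}} - \mu_{\mathrm{trans}} K(\gamma, m)}\right)
\quad\text{and}\quad
\gamma = 1 - \frac{\mu^{R}_{\mathrm{cis}} + \mu_{\mathrm{trans}}}{D + \mu^{P}_{\mathrm{cis}} - \mu_{\mathrm{trans}} K(\gamma, m)}.
\tag{26A}
\]
This is Eq. 13 of the main text. This result is only consistent with the assumption that
$n_{\mathrm{out}}(k) = A_{\mathrm{out}} k^{-\gamma}$ when $\alpha = 0$.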
Solution to Model 4 – Degree Dependence and Preferential Attachment
The solution to Eq. 3A for the in-degree distribution, using the incoming edge event
probabilities from Table 2 (main text), gives an equilibrium degree distribution
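\[
n_{\mathrm{in}}(k) = A_{\mathrm{in}}\, \frac{\Gamma\!\left(k + \frac{\mu_{\mathrm{cis}} + \mu^{R}_{\mathrm{trans}}}{D + \mu^{P}_{\mathrm{trans}}}\right)}{\Gamma(1+k)} \left(\frac{D + \mu^{P}_{\mathrm{trans}}}{D + \mu_{\mathrm{cis}} + m \left\langle \frac{1}{k} \right\rangle \mu_{\mathrm{trans}}}\right)^{k},
\tag{27A}
\]
where $\left\langle \frac{1}{k} \right\rangle = \sum_{j=1}^{N} \frac{n_{\mathrm{out}}(j)}{j}$ determines the mean rate of trans evolution across the network.
Using Eq. 19A this is approximately
$n_{\mathrm{in}}(k) \approx k^{-\gamma} e^{-\alpha k}$, where
\[
\alpha = \ln\!\left(1 + \frac{\mu_{\mathrm{cis}} + m \left\langle \frac{1}{k} \right\rangle \mu_{\mathrm{trans}}}{D + \mu^{P}_{\mathrm{trans}}}\right)
\quad\text{and}\quad
\gamma = 1 - \frac{\mu_{\mathrm{cis}} + \mu^{R}_{\mathrm{trans}}}{D + \mu^{P}_{\mathrm{trans}}}.
\tag{28A}
\]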
This is Eq. 14 of the main text.
The solution to Eq. 17A for the out-degree distribution, using the outgoing edge event
probabilities from Table 2 (main text), gives an equilibrium degree distribution
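\[
n_{\mathrm{out}}(k) = A_{\mathrm{out}}\, \frac{\Gamma\!\left(k + \frac{\mu^{R}_{\mathrm{cis}} + \mu_{\mathrm{trans}} - \mu_{\mathrm{trans}} K(\gamma, m)}{D + \mu^{P}_{\mathrm{cis}}}\right)}{\Gamma\!\left(1 + \frac{\mu_{\mathrm{trans}} K(\gamma, m)}{D + \mu_{\mathrm{cis}}} + k\right)} \left(\frac{D + \mu^{P}_{\mathrm{cis}}}{D + \mu_{\mathrm{cis}}}\right)^{k}.
\tag{29A}
\]
Using Eq. 19A, this can be approximated to
$n_{\mathrm{out}}(k) \approx k^{-\gamma} e^{-\alpha k}$, where
\[
\alpha = \ln\!\left(\frac{D + \mu_{\mathrm{cis}}}{D + \mu^{P}_{\mathrm{cis}}}\right),
\quad\text{and}\quad
\left(D + \mu^{P}_{\mathrm{cis}}\right)(\gamma - 1) + \mu^{R}_{\mathrm{cis}} + \mu_{\mathrm{trans}} - \mu_{\mathrm{trans}} K(\gamma, m) \left(1 + \frac{D + \mu^{P}_{\mathrm{cis}}}{D + \mu_{\mathrm{cis}}}\right) = 0.
\tag{30A}
\]
This is Eq. 15 of the main text. This result is only consistent with the assumption that
$n_{\mathrm{out}}(k) = A_{\mathrm{out}} k^{-\gamma}$ when $\alpha = 0$.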
Shrinking Networks Model
We now consider a model of a shrinking network, in which the rate of gene deletion is
greater than the rate of gene duplication. This model is appropriate as a model of
transcription network evolution immediately following a whole genome duplication,
such as that which occurred in yeast around 100 million years ago (Kellis et al. 2004).
We use a rate of gene duplication $D^{+}$, and gene deletion $D^{-}$, such that
\[
D^{-} = D^{+} + \Delta D,
\tag{31A}
\]
where $\Delta D > 0$. Firstly note that the rate at which genes gain new edges through
duplication of other genes is $kD^{+}$, and the rate at which they lose edges through
deletion of other genes is $kD^{-} = kD^{+} + k\Delta D$. The rate at which new TFs of
out-degree k are produced by this model is $D^{+} n_{\mathrm{out}}(k)$, and the rate at which they are lost
is $D^{-} n_{\mathrm{out}}(k)$. Therefore TFs with out-degree k are lost at a net rate $\Delta D\, n_{\mathrm{out}}(k)$. Similarly,
TGs with in-degree k are lost at a net rate $\Delta D\, n_{\mathrm{in}}(k)$.
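To see that this term is not sufficient to produce an out-degree distribution with
exponent $\gamma > 1$, we make the following approximation. Assuming an out-degree
distribution of the form $n_{\mathrm{out}}(k) = A_{\mathrm{out}} k^{-\gamma}$, we can write the net loss term, using Eq. 8A, as
\[
-\Delta D\, n_{\mathrm{out}}(k) = \frac{\Delta D}{2(\gamma-1)} \left[(k+1)\, n_{\mathrm{out}}(k+1) - (k-1)\, n_{\mathrm{out}}(k-1)\right] + O\!\left(k^{-(1+\gamma)}\right).
\tag{32A}
\]
Using this with Model 1, we now define
\[
K(\gamma, m) = \frac{1}{2(\gamma-1)} \left[1 + \frac{\Delta D}{\mu_{\mathrm{trans}}} - (1-m)^{\gamma-1}\right].
\tag{33A}
\]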
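Then the solution for the out-degree distribution of this model can be written as
\[
n_{\mathrm{out}}(k) = A_{\mathrm{out}}\, \frac{\Gamma\!\left(k + \frac{\mu_{\mathrm{cis}} + \mu_{\mathrm{trans}}}{D^{+} - \mu_{\mathrm{trans}} K(\gamma, m)}\right)}{\Gamma(1+k)} \left(\frac{D^{+} - \mu_{\mathrm{trans}} K(\gamma, m)}{D^{-} + \mu_{\mathrm{cis}} + \mu_{\mathrm{trans}} K(\gamma, m)}\right)^{k}.
\tag{34A}
\]
Using Eq. 19A, this can be approximated to
$n_{\mathrm{out}}(k) \approx k^{-\gamma} e^{-\alpha k}$, where
\[
\alpha = \ln\!\left(\frac{\mu_{\mathrm{cis}} + D^{-} + \mu_{\mathrm{trans}} K(\gamma, m)}{D^{+} - \mu_{\mathrm{trans}} K(\gamma, m)}\right)
\quad\text{and}\quad
\gamma = 1 - \frac{\mu_{\mathrm{cis}} + \mu_{\mathrm{trans}}}{D^{+} - \mu_{\mathrm{trans}} K(\gamma, m)}.
\tag{35A}
\]
Since $K(\gamma, m) > 0$ from Eq. 33A and $D^{+} < D^{-}$, there is no solution $\alpha = 0$ for Eq.
35A. Therefore this model cannot produce a power-law out-degree distribution.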
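The argument can be illustrated numerically. The sketch assumes the shrinking-network expressions read $K = [1 + \Delta D/\mu_{trans} - (1-m)^{\gamma-1}]/[2(\gamma-1)]$ and $\alpha = \ln[(\mu_{cis} + D^{-} + \mu_{trans}K)/(D^{+} - \mu_{trans}K)]$ with $D^{-} = D^{+} + \Delta D$. The function names and parameter values are hypothetical, and the function returns `None` where the logarithm's argument would not be positive.

```python
from math import log

def K_shrinking(gamma, m, delta_D, mu_trans):
    """K for the shrinking-network model (assumed form, valid for gamma != 1)."""
    return (1.0 + delta_D / mu_trans - (1.0 - m) ** (gamma - 1.0)) / (2.0 * (gamma - 1.0))

def alpha_shrinking(gamma, m, mu_cis, mu_trans, D_plus, delta_D):
    """Exponential cut-off alpha; None if D_plus - mu_trans * K is not positive."""
    k_term = mu_trans * K_shrinking(gamma, m, delta_D, mu_trans)
    D_minus = D_plus + delta_D
    denom = D_plus - k_term
    if denom <= 0.0:
        return None
    return log((mu_cis + D_minus + k_term) / denom)

# For exponents above 1 the numerator always exceeds the denominator, so alpha > 0.
alphas = [alpha_shrinking(g, 0.5, 0.1, 0.2, 1.0, 0.2) for g in (2.0, 2.5, 3.0, 4.0)]
```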
References
Chung, F., Lu, L. & Dewey, G. 2003 Duplication Models for Biological Networks.
Journal of Computational Biology 10, 677-687.
Kellis, M., Birren, B. & Lander, E. 2004 Proof and evolutionary analysis of ancient
genome duplication in the yeast Saccharomyces cerevisiae. Nature 428, 617-624.