Appendix - Proceedings of the Royal Society B

Degree Dependence in Rates of Transcription Factor Evolution Explains the Unusual Structure of Transcription Networks – Supporting Material

Derivation of Equilibrium Degree Distributions

Alexander J Stewart (1,2,3), Robert M Seymour (1,4) and Andrew Pomiankowski (1,2)

1 CoMPLEX, UCL, Physics Building, Gower Street, London, WC1E 6BT, U.K.
2 The Galton Laboratory, Research Department of Genetics, Evolution and Environment, UCL, 4 Stephenson Way, London NW1 2HE, U.K.
3 Department of Mathematics, UCL, Gower Street, London, WC1E 6BT, U.K.
4 Correspondence should be addressed to alex.stewart@ucl.ac.uk

Model

Networks in the model consist of transcription factors (TFs) and target genes (TGs). TFs regulate other genes and may be regulated by other TFs. TGs are only regulated, and therefore have only incoming edges. The model includes four distinct types of mutation: trans mutation, cis mutation, gene duplication and gene deletion. Each type of mutation is associated with a rate. The mutation rate is the rate at which each type of mutation becomes fixed in the network (not the rate at which new mutations actually occur, which may differ considerably from the rate at which they are fixed). The rates at which different mutations become fixed are referred to as the rates of evolution. Rates of evolution may be the same for all genes (either TFs or TGs), or else they may vary with the connectivity of a gene. Initially we consider constant rates of evolution.

The network is updated by time increments $\Delta t$. These increments are taken to be sufficiently small that at most one mutation is fixed in the network within each increment. We choose a time scale such that $\Delta t = 1$. The rates of evolution in the model are therefore the probabilities that a mutation will occur (and be fixed) in a time interval $\Delta t$. Mutations at trans and cis elements affect edges but do not affect nodes in the network. Gene duplication and deletion affect both nodes and edges.
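The update rule described above, in which at most one mutation is fixed per unit time increment, can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' simulation code: the function name `sample_fixed_mutation`, the dictionary keys and the numerical values in `EXAMPLE_RATES` are hypothetical, and the rates are assumed to be network-wide fixation probabilities per time step.

```python
import random

# Hypothetical fixation probabilities per unit time step (illustrative values only).
EXAMPLE_RATES = {
    "trans": 0.02,
    "cis": 0.02,
    "duplication": 0.005,
    "deletion": 0.005,
}


def sample_fixed_mutation(rates, rng):
    """Return the mutation type fixed in one time step, or None if none is fixed.

    `rates` maps each mutation type to its fixation probability per step.
    The probabilities must sum to at most 1, so at most one event is fixed.
    """
    total = sum(rates.values())
    if total > 1.0:
        raise ValueError("rates must sum to at most 1 per time step")
    draw = rng.random()  # uniform on [0, 1)
    cumulative = 0.0
    for name, rate in rates.items():
        cumulative += rate
        if draw < cumulative:
            return name
    return None  # no mutation fixed in this increment
```

Calling `sample_fixed_mutation(EXAMPLE_RATES, random.Random(1))` returns one of the four mutation names or `None`. With small rates most steps fix no mutation, which is the regime the time-increment assumption requires.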
When a gene is duplicated, the network increases in size by one. When a gene is deleted, the network decreases in size by one. Rates of gene duplication and deletion are taken to be equal. Therefore the expected size of the network remains constant, while the actual size undergoes random variation about some mean size. In our simulations we place upper and lower limits on the absolute size of the network. At the upper boundary, gene duplication events are forbidden. At the lower boundary, gene deletions are forbidden. The rates of duplication and deletion may differ between TFs and TGs.

We calculate the expected equilibrium degree distributions for the in- and out-degree of networks. The evolution of the in-degree distribution, $n_{in}(k,t)$, in the mean field approximation is given by Eq. 4 of the main text:

$$\Delta n_{in} = \left[\nu^{in}_{TG+}(k-1) + \nu^{in}_{TF+}(k-1)\right] n_{in}(k-1,t) + \left[\nu^{in}_{TG-}(k+1) + \nu^{in}_{TF-}(k+1)\right] n_{in}(k+1,t) - \left[\nu^{in}_{TG+}(k) + \nu^{in}_{TF+}(k) + \nu^{in}_{TG-}(k) + \nu^{in}_{TF-}(k)\right] n_{in}(k,t), \tag{1A}$$

where $\nu^{in}_{TG+}(k)$ and $\nu^{in}_{TG-}(k)$ are the probabilities of a gene with in-degree $k$ gaining or losing an edge through a mutation at the gene, and $\nu^{in}_{TF+}(k)$ and $\nu^{in}_{TF-}(k)$ are the probabilities of a gene with in-degree $k$ gaining or losing an edge through trans mutation, duplication or deletion at a TF in the time interval $\Delta t$. The time evolution of the out-degree distribution $n_{out}(k,t)$ is given by Eq.
5 of the main text:

$$\Delta n_{out} = \left[\nu^{out}_{TG+}(k-1) + \nu^{out}_{TF+}(k-1)\right] n_{out}(k-1,t) + \nu^{out}_{TG-}(k+1)\, n_{out}(k+1,t) - \left[\nu^{out}_{TG+}(k) + \nu^{out}_{TG-}(k) + \nu^{out}_{TF+}(k)\right] n_{out}(k,t) + \sum_{j=k}^{N} \nu^{out}_{TF-}(j,k)\, n_{out}(j,t) - \sum_{j=0}^{k} \nu^{out}_{TF-}(k,j)\, n_{out}(k,t), \tag{2A}$$

where $N$ is the total number of genes (TFs and TGs) in the network, $\nu^{out}_{TG+}(k)$ and $\nu^{out}_{TG-}(k)$ are the probabilities of a gene with out-degree $k$ gaining or losing an edge through mutation at one of its targets, $\nu^{out}_{TF+}(k)$ is the probability that a TF with out-degree $k$ gains a target through mutation at the TF, and $\nu^{out}_{TF-}(j,k)$ is the probability that a TF with out-degree $j \ge k$ loses interactions to become a TF with out-degree $k$ due to mutation at the TF.

Setting the left hand side of Eq. 1A to zero gives an equilibrium solution satisfying

$$\left[\nu^{in}_{TF+}(k-1) + \nu^{in}_{TG+}(k-1)\right] n_{in}(k-1) = \left[\nu^{in}_{TF-}(k) + \nu^{in}_{TG-}(k)\right] n_{in}(k). \tag{3A}$$

To solve Eq. 2A we make an approximation for the term $\sum_{j=k}^{N} \nu^{out}_{TF-}(j,k)\, n_{out}(j,t) - \sum_{j=0}^{k} \nu^{out}_{TF-}(k,j)\, n_{out}(k,t)$, which describes loss of interactions through trans evolution. For the model without degree dependence in the rate of trans evolution (Model 1 and Model 3 in the main text), $\nu^{out}_{TF-}(j,k)$ is given by

$$\nu^{out}_{TF-}(j,k) = \mu_{trans}\, \frac{j!}{k!\,(j-k)!}\, m^{j-k} (1-m)^{k}. \tag{4A}$$

By assuming a solution of the form $n_{out}(k) = A_{out} k^{-\gamma}$, where $A_{out}$ is a normalization constant for the out-degree distribution, we can use the approximation

$$\begin{aligned}
\sum_{j=k}^{N} \nu^{out}_{TF-}(j,k)\, n_{out}(j) - \sum_{j=0}^{k} \nu^{out}_{TF-}(k,j)\, n_{out}(k)
&= \mu_{trans}\, n_{out}(k)(1-m)^{\gamma-1} - \mu_{trans}\, n_{out}(k) + O\!\left(k^{-(1+\gamma)}\right) \\
&= \mu_{trans}\, \frac{1-(1-m)^{\gamma-1}}{2(\gamma-1)} \left[(k+1)\, n_{out}(k+1) - (k-1)\, n_{out}(k-1)\right] + O\!\left(k^{-(1+\gamma)}\right).
\end{aligned} \tag{5A}$$

Observe that, when $\gamma = 1$, the right hand side of Eq. 5A is zero. To derive Eq. 5A, first note from Eq. 4A that $\sum_{j=0}^{k} \nu^{out}_{TF-}(k,j)\, n_{out}(k) = \mu_{trans}\, n_{out}(k)$.
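The two sums entering Eq. 5A can be checked numerically. The sketch below is an illustration only, with hypothetical helper names (`loss_kernel`, `outflow_total`, `inflow_sum`, `inflow_prediction`) and the rate prefactor and normalization constant set to 1. It evaluates the binomial factor of Eq. 4A in log space, confirms that it sums to one over the final out-degree, and compares the sum over initial out-degrees against the large-$k$ form for a power-law distribution with a chosen exponent `gamma`.

```python
import math


def loss_kernel(j, k, m):
    """Binomial factor of Eq. 4A without the rate prefactor:
    probability that a TF with j targets keeps exactly k of them
    when each interaction is lost independently with probability m."""
    log_binom = math.lgamma(j + 1) - math.lgamma(k + 1) - math.lgamma(j - k + 1)
    return math.exp(log_binom + (j - k) * math.log(m) + k * math.log(1.0 - m))


def outflow_total(k, m):
    """Sum of the kernel over all final out-degrees j = 0..k (should equal 1)."""
    return sum(loss_kernel(k, j, m) for j in range(k + 1))


def inflow_sum(k, gamma, m, n_max):
    """Sum over initial out-degrees j = k..n_max of kernel(j, k) * j**(-gamma)."""
    return sum(loss_kernel(j, k, m) * j ** (-gamma) for j in range(k, n_max + 1))


def inflow_prediction(k, gamma, m):
    """Large-k form of the same sum: k**(-gamma) * (1 - m)**(gamma - 1)."""
    return k ** (-gamma) * (1.0 - m) ** (gamma - 1.0)
```

For example, `inflow_sum(50, 2.0, 0.1, 2000) / inflow_prediction(50, 2.0, 0.1)` is close to 1, and the agreement should improve as `k` grows, in line with the stated error term.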
We use Lemma 2 of Chung et al. (2003) to show that $\sum_{j=k}^{N} \nu^{out}_{TF-}(j,k)\, n_{out}(j) = \mu_{trans}\, n_{out}(k)(1-m)^{\gamma-1} + O\!\left(k^{-(1+\gamma)}\right)$. To see this, we have

$$\begin{aligned}
\sum_{j=k}^{N} \nu^{out}_{TF-}(j,k)\, n_{out}(j)
&= A_{out} \sum_{j=k}^{N} \mu_{trans} \binom{j}{j-k} m^{j-k} (1-m)^{k} j^{-\gamma} \\
&= A_{out}\, \mu_{trans}\, k^{-\gamma} (1-m)^{k} \sum_{j=k}^{N} \binom{j}{j-k} \left(\frac{k}{j}\right)^{\gamma} m^{j-k} \\
&= A_{out}\, \mu_{trans}\, k^{-\gamma} (1-m)^{k} \left(1 + O(k^{-1})\right) \sum_{j=k}^{N} \binom{j-\gamma}{j-k} m^{j-k} \\
&= A_{out}\, \mu_{trans}\, k^{-\gamma} (1-m)^{k} \left(1 + O(k^{-1})\right) \sum_{i=0}^{N-k} \binom{i+k-\gamma}{i} m^{i} \\
&= A_{out}\, \mu_{trans}\, k^{-\gamma} (1-m)^{k} \left(1 + O(k^{-1})\right) (1-m)^{-(k-\gamma+1)} \\
&= \mu_{trans}\, n_{out}(k) (1-m)^{\gamma-1} \left(1 + O(k^{-1})\right),
\end{aligned} \tag{6A}$$

valid for $N \gg k$ (in fact, exactly valid in the limit $N \to \infty$). We now have

$$\sum_{j=k}^{N} \nu^{out}_{TF-}(j,k)\, n_{out}(j) - \sum_{j=0}^{k} \nu^{out}_{TF-}(k,j)\, n_{out}(k) = \mu_{trans}\, n_{out}(k)(1-m)^{\gamma-1} - \mu_{trans}\, n_{out}(k) + O\!\left(k^{-(1+\gamma)}\right). \tag{7A}$$

Using our assumed solution form, we can also write

$$(k+1)\, n_{out}(k+1) - k\, n_{out}(k) + k\, n_{out}(k) - (k-1)\, n_{out}(k-1) = 2(1-\gamma)\, n_{out}(k) + O\!\left(k^{-(\gamma+1)}\right). \tag{8A}$$

Eq. 7A and Eq. 8A combine to give Eq. 5A. For large $k$ we can neglect terms $O\!\left(k^{-(1+\gamma)}\right)$ and define $K(\gamma,m) = \dfrac{1-(1-m)^{\gamma-1}}{2(\gamma-1)}$. This allows us to write

$$\sum_{j=k}^{N} \nu^{out}_{TF-}(j,k)\, n_{out}(j) - \sum_{j=0}^{k} \nu^{out}_{TF-}(k,j)\, n_{out}(k) = \mu_{trans}\, K(\gamma,m) \left[(k+1)\, n_{out}(k+1) - (k-1)\, n_{out}(k-1)\right], \tag{9A}$$

which is the form used in Eq. 7 of the main text.

For the model with degree dependence in the rate of trans evolution (Model 2 and Model 4 in the main text), $\nu^{out}_{TF-}(j,k)$ is given by

$$\nu^{out}_{TF-}(j,k) = \frac{\mu_{trans}}{j}\, \frac{j!}{k!\,(j-k)!}\, m^{j-k} (1-m)^{k}. \tag{10A}$$

Once again assuming a solution of the form $n_{out}(k) = A_{out} k^{-\gamma}$, we can use the approximation

$$\begin{aligned}
\sum_{j=k}^{N} \nu^{out}_{TF-}(j,k)\, n_{out}(j) - \sum_{j=0}^{k} \nu^{out}_{TF-}(k,j)\, n_{out}(k)
&= \mu_{trans}\, \frac{n_{out}(k)}{k} (1-m)^{\gamma} - \mu_{trans}\, \frac{n_{out}(k)}{k} + O\!\left(k^{-(2+\gamma)}\right) \\
&= \mu_{trans}\, \frac{1-(1-m)^{\gamma}}{2\gamma} \left[n_{out}(k+1) - n_{out}(k-1)\right] + O\!\left(k^{-(2+\gamma)}\right).
\end{aligned} \tag{11A}$$

To derive this, first note that in this case $\sum_{j=0}^{k} \nu^{out}_{TF-}(k,j)\, n_{out}(k) = \mu_{trans}\, \dfrac{n_{out}(k)}{k}$. Again we use Lemma 2 of Chung et al. (2003) to show that $\sum_{j=k}^{N} \nu^{out}_{TF-}(j,k)\, n_{out}(j) = \mu_{trans}\, \dfrac{n_{out}(k)}{k} (1-m)^{\gamma} \left(1 + O(k^{-1})\right)$. To see this, we have

$$\begin{aligned}
\sum_{j=k}^{N} \nu^{out}_{TF-}(j,k)\, n_{out}(j)
&= A_{out}\, \mu_{trans} \sum_{j=k}^{N} \binom{j}{j-k} m^{j-k} (1-m)^{k} j^{-(1+\gamma)} \\
&= A_{out}\, \mu_{trans}\, k^{-(1+\gamma)} (1-m)^{k} \sum_{j=k}^{N} \binom{j}{j-k} \left(\frac{k}{j}\right)^{1+\gamma} m^{j-k} \\
&= A_{out}\, \mu_{trans}\, k^{-(1+\gamma)} (1-m)^{k} \left(1 + O(k^{-1})\right) \sum_{j=k}^{N} \binom{j-1-\gamma}{j-k} m^{j-k} \\
&= A_{out}\, \mu_{trans}\, k^{-(1+\gamma)} (1-m)^{k} \left(1 + O(k^{-1})\right) \sum_{i=0}^{N-k} \binom{i+k-1-\gamma}{i} m^{i} \\
&= A_{out}\, \mu_{trans}\, k^{-(1+\gamma)} (1-m)^{k} \left(1 + O(k^{-1})\right) (1-m)^{-(k-\gamma)} \\
&= \mu_{trans}\, \frac{n_{out}(k)}{k} (1-m)^{\gamma} \left(1 + O(k^{-1})\right),
\end{aligned} \tag{12A}$$

valid for $N \gg k$ (in fact, exactly valid in the limit $N \to \infty$). We now have

$$\sum_{j=k}^{N} \nu^{out}_{TF-}(j,k)\, n_{out}(j) - \sum_{j=0}^{k} \nu^{out}_{TF-}(k,j)\, n_{out}(k) = \mu_{trans}\, \frac{n_{out}(k)}{k} (1-m)^{\gamma} - \mu_{trans}\, \frac{n_{out}(k)}{k} + O\!\left(k^{-(2+\gamma)}\right). \tag{13A}$$

Using our assumed solution form, we can also write

$$n_{out}(k+1) - n_{out}(k) + n_{out}(k) - n_{out}(k-1) = -2\gamma\, \frac{n_{out}(k)}{k} + O\!\left(k^{-(2+\gamma)}\right). \tag{14A}$$

Eq. 13A and Eq. 14A combine to give Eq. 11A. For large $k$ we can neglect terms $O\!\left(k^{-(2+\gamma)}\right)$ and higher, and define $K(\gamma,m) = \dfrac{1-(1-m)^{\gamma}}{2\gamma}$. This allows us to write

$$\sum_{j=k}^{N} \nu^{out}_{TF-}(j,k)\, n_{out}(j) - \sum_{j=0}^{k} \nu^{out}_{TF-}(k,j)\, n_{out}(k) = \mu_{trans}\, K(\gamma,m) \left[n_{out}(k+1) - n_{out}(k-1)\right], \tag{15A}$$

which is the form used in Eq. 7 of the main text.

Using Eq. 9A, the solution to Eq. 2A for the out-degree distribution is

$$\left[\nu^{out}_{TG-}(k+1) + (k+1)\, \mu_{trans} K(\gamma,m)\right] n_{out}(k+1) = \left[\nu^{out}_{TG+}(k) + \nu^{out}_{TF+}(k) - k\, \mu_{trans} K(\gamma,m)\right] n_{out}(k), \tag{16A}$$

for the model excluding degree dependence in the rate of trans evolution. And using Eq.
15A gives

$$\left[\nu^{out}_{TG-}(k+1) + \mu_{trans} K(\gamma,m)\right] n_{out}(k+1) = \left[\nu^{out}_{TG+}(k) + \nu^{out}_{TF+}(k) - \mu_{trans} K(\gamma,m)\right] n_{out}(k), \tag{17A}$$

for the model including degree dependence in the rate of trans evolution.

Solution to Model 1 – No Connectivity Dependence

The solution to Eq. 3A for the in-degree distribution, using the incoming edge event probabilities from Table 2 (main text), gives an equilibrium degree distribution

$$n_{in}(k) = A_{in}\, \frac{\Gamma\!\left(\dfrac{\mu_{cis}+\mu_{trans}}{\mu_D} + k\right)}{\Gamma(1+k) \left(\dfrac{\mu_D + \mu_{cis} + m\mu_{trans}}{\mu_D}\right)^{k}}, \tag{18A}$$

where $A_{in}$ is a normalization constant. Following Chung et al. (2003) we can write

$$\frac{\Gamma(x+c)}{\Gamma(x)} = \left(1 + O\!\left(\tfrac{1}{x}\right)\right) x^{c}. \tag{19A}$$

For large $x$, the terms $O\!\left(\tfrac{1}{x}\right)$ can be neglected, and hence Eq. 18A gives $n_{in}(k) \sim k^{-\gamma} e^{-\beta k}$, where

$$\beta = \ln\!\left(1 + \frac{\mu_{cis} + m\mu_{trans}}{\mu_D}\right) \quad \text{and} \quad \gamma = 1 - \frac{\mu_{cis}+\mu_{trans}}{\mu_D}. \tag{20A}$$

This is Eq. 8 of the main text.

The solution to Eq. 16A for the out-degree distribution, using the outgoing edge event probabilities from Table 2 (main text), gives an equilibrium degree distribution

$$n_{out}(k) = A_{out}\, \frac{\Gamma\!\left(\dfrac{\mu_{cis}+\mu_{trans}}{\mu_D - \mu_{trans} K(\gamma,m)} + k\right)}{\Gamma(1+k) \left(\dfrac{\mu_D + \mu_{cis} + \mu_{trans} K(\gamma,m)}{\mu_D - \mu_{trans} K(\gamma,m)}\right)^{k}}. \tag{21A}$$

Using Eq. 19A, this can be approximated by $n_{out}(k) \sim k^{-\gamma} e^{-\beta k}$, where

$$\beta = \ln\!\left(\frac{\mu_{cis} + \mu_D + \mu_{trans} K(\gamma,m)}{\mu_D - \mu_{trans} K(\gamma,m)}\right) \quad \text{and} \quad \gamma = 1 - \frac{\mu_{cis}+\mu_{trans}}{\mu_D - \mu_{trans} K(\gamma,m)}. \tag{22A}$$

This is Eq. 9 of the main text. This result is only consistent with the assumption that $n_{out}(k) = A_{out} k^{-\gamma}$ when $\beta = 0$.

Solution to Model 2 – Degree Dependence in the Rate of trans Evolution

The solution to Eq. 3A for the in-degree distribution, using the incoming edge event probabilities from Table 2 (main text), gives an equilibrium degree distribution

$$n_{in}(k) = A_{in}\, \frac{\Gamma\!\left(\dfrac{\mu_{cis}+\mu_{trans}}{\mu_D} + k\right)}{\Gamma(1+k) \left(\dfrac{\mu_D + \mu_{cis} + m \left\langle \tfrac{1}{k} \right\rangle \mu_{trans}}{\mu_D}\right)^{k}}, \tag{23A}$$

where $\left\langle \tfrac{1}{k} \right\rangle = \sum_{j=1}^{N} \dfrac{n_{out}(j)}{j}$ determines the mean rate of trans evolution across the network. Using Eq.
19A this is approximately $n_{in}(k) \sim k^{-\gamma} e^{-\beta k}$, where

$$\beta = \ln\!\left(1 + \frac{\mu_{cis} + m \left\langle \tfrac{1}{k} \right\rangle \mu_{trans}}{\mu_D}\right) \quad \text{and} \quad \gamma = 1 - \frac{\mu_{cis}+\mu_{trans}}{\mu_D}. \tag{24A}$$

This is Eq. 10 of the main text.

The solution to Eq. 17A for the out-degree distribution, using the outgoing edge event probabilities from Table 2 (main text), gives an equilibrium degree distribution

$$n_{out}(k) = A_{out}\, \frac{\Gamma\!\left(\dfrac{\mu_{cis} + \mu_{trans} - \mu_{trans} K(\gamma,m)}{\mu_D} + k\right)}{\Gamma\!\left(1 + \dfrac{\mu_{trans} K(\gamma,m)}{\mu_D + \mu_{cis}} + k\right) \left(\dfrac{\mu_D + \mu_{cis}}{\mu_D}\right)^{k}}. \tag{25A}$$

Using Eq. 19A, this can be approximated by $n_{out}(k) \sim k^{-\gamma} e^{-\beta k}$, where

$$\beta = \ln\!\left(1 + \frac{\mu_{cis}}{\mu_D}\right) \quad \text{and} \quad \mu_D(\gamma - 1) = -\mu_{cis} - \mu_{trans} + \mu_{trans} K(\gamma,m) \left(1 + \frac{\mu_D}{\mu_D + \mu_{cis}}\right). \tag{22A}$$

This is Eq. 11 of the main text. This result is only consistent with the assumption that $n_{out}(k) = A_{out} k^{-\gamma}$ when $\beta = 0$.

Solution to Model 3 – Preferential Attachment

The solution to Eq. 3A for the in-degree distribution, using the incoming edge event probabilities from Table 2 (main text), gives an equilibrium degree distribution

nIn (k)  Ain     R cis  trans k P k   ,[23A] D  trans     1  k   D  cis  mtrans P D trans

Using Eq. 19A this is approximately $n_{in}(k) \sim k^{-\gamma} e^{-\beta k}$, where

   ln 1    1   cis  m trans P D  trans  R cis  trans P D  trans and . [24A]

This is Eq. 12 of the main text.

The solution to Eq. 16A for the out-degree distribution, using the outgoing edge event probabilities from Table 2 (main text), gives an equilibrium degree distribution

nout (k)  Aout   R  cis  trans P  D cis  trans K ( ,m)  1  k    k  D   P    K( , m)  k ,[25A] cis trans  D       K( , m)  cis trans

Using Eq. 19A, this can be approximated by $n_{out}(k) \sim k^{-\gamma} e^{-\beta k}$, where

  ln   1    cis  D trans K ( ,m) P  D cis  trans K ( ,m) and R  cis  trans . [26A] P  D  cis  trans K( , m)

This is Eq. 13 of the main text. This result is only consistent with the assumption that $n_{out}(k) = A_{out} k^{-\gamma}$ when $\beta = 0$.
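The step from a ratio of Gamma functions to a power law with an exponential cutoff, via Eq. 19A, can be illustrated numerically. The sketch below is generic and does not use the rates of any particular model: the parameters `a`, `b` and `c` are hypothetical shorthand for the constant added to $k$ in the numerator Gamma function, the constant added to $1 + k$ in the denominator Gamma function, and the base of the geometric factor. Normalization constants are omitted.

```python
import math


def log_gamma_form(k, a, b, c):
    """Log of Gamma(k + a) / (Gamma(k + 1 + b) * c**k), normalization omitted."""
    return math.lgamma(k + a) - math.lgamma(k + 1 + b) - k * math.log(c)


def log_power_law_cutoff(k, a, b, c):
    """Log of k**(-gamma) * exp(-beta * k), with the exponents implied by Eq. 19A."""
    gamma_exp = 1.0 + b - a   # power-law exponent
    beta = math.log(c)        # exponential cutoff rate; zero when c == 1
    return -gamma_exp * math.log(k) - beta * k


def approximation_error(k, a, b, c):
    """Difference between the exact and approximate log-distributions at degree k."""
    return log_gamma_form(k, a, b, c) - log_power_law_cutoff(k, a, b, c)
```

For instance, `approximation_error(1000, 0.5, 0.2, 1.1)` is much smaller in magnitude than `approximation_error(10, 0.5, 0.2, 1.1)`, consistent with the neglected terms being of order $1/k$. Setting `c = 1` removes the cutoff, which is the condition under which a pure power law is recovered.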
Solution to Model 4 – Degree Dependence and Preferential Attachment

The solution to Eq. 3A for the in-degree distribution, using the incoming edge event probabilities from Table 2 (main text), gives an equilibrium degree distribution

nIn (k)  Ain    R cis  trans  k P k   ,[27A] D  trans     1  k   D  cis  m 1k trans  P D  trans

where $\left\langle \tfrac{1}{k} \right\rangle = \sum_{j=1}^{N} \dfrac{n_{out}(j)}{j}$ determines the mean rate of trans evolution across the network. Using Eq. 19A this is approximately $n_{in}(k) \sim k^{-\gamma} e^{-\beta k}$, where

   ln 1    1 where  cis m  1 trans k P D trans  R cis  trans P D  trans  , and . [28A]

This is Eq. 14 of the main text.

The solution to Eq. 17A for the out-degree distribution, using the outgoing edge event probabilities from Table 2 (main text), gives an equilibrium degree distribution

nout (k)  Aout   R   cis  trans  trans K ( , m)   k  D   P  k ,[29A] cis   D     trans K ( ,m)  1  D    k cis  P D  cis cis 

Using Eq. 19A, this can be approximated by $n_{out}(k) \sim k^{-\gamma} e^{-\beta k}$, where

  ln  , and  D cis P D cis P   D  cis R   . [30A] D(  1)  cis  trans  trans K( , m)  1    D  cis  

This is Eq. 15 of the main text. This result is only consistent with the assumption that $n_{out}(k) = A_{out} k^{-\gamma}$ when $\beta = 0$.

Shrinking Networks Model

We now consider a model of a shrinking network, in which the rate of gene deletion is greater than the rate of gene duplication. This model is appropriate as a model of transcription network evolution immediately following a whole genome duplication, such as that which occurred in yeast around 100 million years ago (Kellis et al. 2004). We use a rate of gene duplication $\mu_D^{+}$ and a rate of gene deletion $\mu_D^{-}$, such that

$$\mu_D^{-} = \mu_D^{+} + \Delta\mu_D, \tag{31A}$$

where $\Delta\mu_D > 0$. First note that the rate at which genes gain new edges through duplication of other genes is $k\mu_D^{+}$, and the rate at which they lose edges through deletion of other genes is $k\mu_D^{-} = k\mu_D^{+} + k\Delta\mu_D$.
The rate at which new TFs of out-degree $k$ are produced by this model is $\mu_D^{+} n_{out}(k)$, and the rate at which they are lost is $\mu_D^{-} n_{out}(k)$. Therefore TFs with out-degree $k$ are lost at a net rate $\Delta\mu_D\, n_{out}(k)$. Similarly, TGs with in-degree $k$ are lost at a net rate $\Delta\mu_D\, n_{in}(k)$. To see that this term is not sufficient to produce an out-degree distribution with exponent $\gamma = 1$, we make the following approximation. Assuming an out-degree distribution of the form $n_{out}(k) = A_{out} k^{-\gamma}$, we can write, using Eq. 8A,

$$-\Delta\mu_D\, n_{out}(k) = \frac{\Delta\mu_D}{2(\gamma-1)} \left[(k+1)\, n_{out}(k+1) - (k-1)\, n_{out}(k-1)\right] + O\!\left(k^{-(1+\gamma)}\right). \tag{32A}$$

Using this with Model 1, we now define

$$K(\gamma,m) = \frac{1}{2(\gamma-1)} \left[1 + \frac{\Delta\mu_D}{\mu_{trans}} - (1-m)^{\gamma-1}\right]. \tag{33A}$$

Then the solution for the out-degree distribution of this model can be written as

$$n_{out}(k) = A_{out}\, \frac{\Gamma\!\left(\dfrac{\mu_{cis}+\mu_{trans}}{\mu_D^{+} - \mu_{trans} K(\gamma,m)} + k\right)}{\Gamma(1+k) \left(\dfrac{\mu_D^{-} + \mu_{cis} + \mu_{trans} K(\gamma,m)}{\mu_D^{+} - \mu_{trans} K(\gamma,m)}\right)^{k}}. \tag{34A}$$

Using Eq. 19A, this can be approximated by $n_{out}(k) \sim k^{-\gamma} e^{-\beta k}$, where

$$\beta = \ln\!\left(\frac{\mu_{cis} + \mu_D^{-} + \mu_{trans} K(\gamma,m)}{\mu_D^{+} - \mu_{trans} K(\gamma,m)}\right) \quad \text{and} \quad \gamma = 1 - \frac{\mu_{cis}+\mu_{trans}}{\mu_D^{+} - \mu_{trans} K(\gamma,m)}. \tag{35A}$$

Since $K(\gamma,m) > 0$ from Eq. 33A and $\mu_D^{-} > \mu_D^{+}$, there is no solution $\beta = 0$ for Eq. 35A. Therefore this model cannot produce a power-law out-degree distribution.

References

Chung, F., Lu, L. & Dewey, G. 2003 Duplication Models for Biological Networks. Journal of Computational Biology 10, 677-687.

Kellis, M., Birren, B. & Lander, E. 2004 Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae. Nature 428, 617-624.
