6A Transcriptional bursting and queueing theory In Sec. 6.3.1 we considered a simple model of transcriptional bursting based on a two-state model of gene regulation. In this model, a gene switches between an onand off-state and mRNA is only produced during the on-phase. However, in eukaryotic cells transcription appears to follow an ordered, multistep and cyclic process, which involves transitions between distinct chromatin states [1, 6]. Schwabe et al. [19] refer to this complex transcription mechanism as a molecular ratchet model, in which the time a single gene spends in the on- and off-phases, and the time between consecutive transcription initiation events are random variables with corresponding waiting time densities f (t), g(t) and h(t), see Fig. 6A.1. The complexity of the underlying molecular mechanisms means that these waiting time densities are not necessarily exponential. 6A.1 Burst distribution Suppose that an on-state persists for some random time Ton = t and let B(t) be the number of transcription initiation events (number of mRNA) that occur over this time interval. (For simplicity, we assume that each mRNA generates the same number of proteins so that burst size is measured in terms of the number of mRNA produced during an on-phase.) Let F(b,t) = P[B(t) ≥ b|Ton = t]. If the waiting time f(t) OFF ON OFF ON OFF ON g(t) initiation of transcription h(t) Fig. 6A.1: Schematic illustration of molecular ratchet model of transcription at the single gene level, where the waiting time distribution of the on-state, off-state, and transcription initiation are denoted by f (t), g(t), h(t). 1 density between events is h(τ) and τ j , j = 1, . . . , b denotes the time of the j-th event, then F(b,t) = Z t 0 H(t − τb )H(τb − τb−1 ) · · · H(τ2 − τ1 )dτ1 · · · dτb ≡ Z t h(τ)(b) dτ, 0 where h(t)(b) denotes the b-th convolution of h(t). The probability that exactly b initiation events occur in the time interval t is Z th i P(b,t) = F(b,t) − F(b + 1,t) = h(τ)(b) − h(τ)(b+1) dτ. 0 The mean burst-size given t is then ∞ hb(t)i = ∑b b=0 Z th 0 i h(τ)(b) − h(τ)(b+1) dτ. Taking Laplace transforms of both sides and using the convolution theorem shows that ∞ h(s)) ∑ bb h(s)b hb b(s)i = s−1 (1 − b b=0 ∞ d b h(s)b = s−1b h(s)(1 − b h(s)) ∑ db h(s) b=0 d 1 = s−1b h(s)(1 − b h(s)) db h(s) 1 − b h(s) = s−1 Hence b h(s) . 1−b h(s) hb(t)i = L −1 A similar analysis establishes that hb(t) i = L 2 −1 " " # 1 b h(s) (t). s 1−b h(s) # 1b h(s)(1 + b h(s)) (t). s (1 − b h(s))2 (6A.1) (6A.2) One now obtains the moments of the burst-size distribution by integrating with respect to all time Ton = t: hbi = Z ∞ 0 hb2 i = hb(t)ig(t)dt, In the case of exponential waiting times 2 Z ∞ 0 b(t)2 g(t)dt. (6A.3) probability pb 0.04 non-exponential 0.02 exponential 0.00 20 0 40 burst size (b) 60 Fig. 6A.2: Schematic illustration of a typical burst-size distribution for exponential and nonexponential on-state time distributions. g(t) = kg e−kg t , h(t) = kh e−kh t , we have so that hb(t)i = L −1 kh (t) = kht, s2 It follows that hbi = b h(s) = kh , kh + s hb(t) i = L kh , kg 2 −1 kh 2kh + s (t) = kh2t 2 + kht. s2 s hb2 i = hbi + 2hbi2 , (6A.4) and thus the Fano factor is Qb ≡ Var[b] = 1 + hbi. hbi (6A.5) The corresponding distribution of burst sizes is a geometric distribution, pb ≡ Z ∞ 0 P(b,t)g(t)dt = p(1 − p)b , p= kg . hh + kg Schwabe et al. [19] explore the effects of non-exponential on-state time distributions, and show that they can lead to a burst size distribution that is peaked (rather than exponentially decreasing) and this reduces transcriptional noise, see Fig. 6A.2. Now suppose that one combines the stochastic process that generates a distribution of burst sizes with the waiting time density f (τ) of being in the off-phase. The resulting system can be mapped into one of the classical models of queueing theory 3 (a) (ii) queue (iii) exiting customers (i) arriving customers server (b) off-phase Y on-phase (iii) degradation (ii) accumulation of protein in cell (i) transcriptional bursts Fig. 6A.3: Diagram illustrating the mapping between queuing theory and trancriptional bursting. (a) Example of a single-server queue. (b) Transcriptional bursting. The stochastic switching of a gene between an on-phase and an off-phase generates a a sequence of transcriptional bursts that is analogous to the arrival of customers in the queueing model. This results in the accumulation of proteins within the cell, which is the analog of a queue. Degradation corresponds to exiting of customers after being serviced by an infinite number of servers. [3], see Fig. 6A.3. Queuing theory concerns the mathematical analysis of waiting lines formed by customers randomly arriving at some service station, and staying in the system until they receive service from a group of servers. Different types of queueing process are defined in terms of (i) the stochastic process underlying the arrival of customers, (ii) the distribution of the number of customers (batches) in each arrival, (iii) the stochastic process underlying the departure of customers (servicetime distribution), and (iv) the number of servers. The above model of transcriptional bursting can be mapped to a queueing process as follows: individual mRNAs are analogous to customers, transcriptional bursts correspond to customers arriving in batches, and the degradation of mRNAs is the analog of customers exiting the system after being serviced. Thus, the waiting-time density for mRNA degradation is the analog of the service-time distribution. Finally, since the mRNAs are degraded independently of each other, the effective number of servers in the corresponding queueing model is infinite, that is, the presence of other customers does not affect the service time of an individual customer. 4 The particular queuing model that maps to the model of transcriptional bursting is the GI X /M/∞ system, Here the symbol G denotes a general waiting time density for the arrival process, that is, the waiting time density f (τ) for the gene to switch from the off-phase to the on-phase, and I X refers to the fact that customers (mRNAs) arrive in batches of independently distributed random sizes B (mRNA burst sizes). The symbol M stands for a Markovian or exponential service-time density (mRNA degradation-time density), and ‘∞’ denotes infinite servers. Exploiting this mapping, exact results for the moments of the mRNA steady-state distribution can be obtained from the corresponding known expressions for the moments of the steady-state distribution of the number of current customers in the GI X /M/∞ model [3]. The latter were originally derived by Liu et al. [4], and we will follow their analysis in Sec. 6A.2. Here we simply quote the results for the mean and variance of the steady-state mRNA copy number N: λ (6A.6) hNi = hbi, µ and fb(µ) hb2 i − hbi + 2hbi2 1 − fb(µ) µ Var[N] = hNi − hNi + hNi2 λ 2 ! (6A.7) In the special case that the arrival times are also exponentially distributed, fb(s) = λ /(s + λ ), and the variance simplifies to Var[N] = λ 2 hb i. µ (6A.8) 6A.2 Moments of GI X /M/∞ queueing model [4] Let F(t) and H(t) be the probability distributions for the interarrival times and the service times, respectively. It is assumed that the mean arrival rate λ and the mean service rate µ are finite. Since the distribution of service times is taken to be exponential, we have S(t) = 1 − e−µt . The arrivals are taken to occur in batches of variable size X, where P[X = b] = pb , b = 1, 2, . . . , with finite mean and variance, and generating function ∞ A(z) = ∑ pb zb . r=1 It follows that for any positive integer k, the k-th factorial moment of X is given by 5 Ak = ∞ d k A(z) = ∑ b(b − 1) · · · (b − k + 1)pb . k dz z=1 r=k (6A.9) Let Tn be the time of the n-th group arrival, with Xn denoting the size of the batch and Sni the service time of the ith member of the batch. The number of busy servers at time is then (6A.10) N(t) = ∑ χ(t − Tn , Xn ), 0≤Tn ≤t where Xn χ(t − Tn , Xn ) = ∑ I(t − Tn , Sni ), i=1 I(t − Tn , Sni ) = 1 if t − Tn ≤ Sni 0 if t − Tn > Sni (6A.11) In other words, we are keeping track of all batches that arrived prior to time t and the number of customers in each batch that haven’t yet exited the system. Introduce the generating function ∞ G(z,t) = ∑ zk P[N(t) = k], (6A.12) k=0 and the binomial moments ∞ Br (t) = k! ∑ (k − r)!r! P[N(t) = k], r = 1, 2, . . . . (6A.13) k=r Suppose that the system is empty at time t = 0. We shal derive an integral equation for the generating function G(z,t). Conditioning on the first arrival time by setting T1 = y, it follows that N(t) = χ(t − y, X1 ) + N ∗ (t − y) if y ≤ t and zero otherwise, where χ(t − y, X1 ) and N ∗ (t − y) are independent of each other, and N ∗ (t) has the same distribution as N(t). Moreover, P[I(t − y, S1i ) = k] = P[t − y ≤ S1i ]δk,1 + P[t − y > S1i ]δk,0 = [1 − H(t − y)]δk,1 + H(t − y)δk,0 , so that ∞ ∑ zk P[I(t − y, S1i ) = k] = z + (1 − z)H(t − y). k=0 Since I(t − y, S1i ) for i = 1, 2, . . . are independent and identically distributed, the theorem of total expectation implies that 6 E[zχ(t−y,X1 ) ] = E[E[zχ(t−y,X1 ) |X1 ]] = b ∞ = ∑ pb ∏ E[z b=1 ∞ = I(t−y,S1i ) ∞ b ∑ pb E[z∑i=1 I(t−y,S1i ) ] b=1 b ∞ ]= i=1 ∞ ! k ∑ pb ∏ ∑ z P[I(t − y, S1i ) = k] i=1 b=1 k=0 ∑ pb [z + (1 − z)H(t − y)]b = A[z + (1 − z)H(t − y)]. b=1 Another application of the theorem of total expectation gives G(z,t) = E[zN(t) ] = E[E[zN(t) |y]] = Z ∞ t Z t E[z0 ]dF(y) + = 1 − F(t) + Z t 0 0 E[zχ(t−y,X1 ) ]E[zN ∗ (t−y) ]dF(y) G(z,t − y)A[z + (1 − z)H(t − y)]dF(y). (6A.14) One can now obtain an iterative equation for the Binomial moments by differentiating equation (6A.14 multiple times with respect to z and using 1 d r G(z,t) . Br (t) = r! dzr z=1 Noting that dr = [1 − H(t − y)]r Ar , A[z + (1 − z)H(t − y)] r dz z=1 r = 1, 2, . . . we obtain the integral equation Br (t) = Z t 0 r Ak k=1 k! Br (t − y)dF(y) + ∑ Z t 0 Br−k (t − y)[1 − H(t − y)]k dF(y). (6A.15) This can be written in the more compact form Br (t) = Z t 0 Br (t − y)dF(y) + where Hr (t) = Z t 0 Hr (t − y)dF(y), (6A.16) r Ak Br−k (t)[1 − H(t)]k . k! k=1 ∑ (6A.17) Equation (6A.16) is an example of a renewal-type equation, see equation (6A.30) of Box S6A, which has the unique solution Br (t) = Hr (t) + Z t 0 where 7 Hr (t − y)dm(y), (6A.18) ∞ m(y) = Fn (y) = P[Tn ≤ y] ∑ Fn (y), n=1 (6A.19) is the so-called renewal function. The steady-state Binomial moments can now be obtained by taking the limit t → ∞ in equation (6A.18) and using the Key Renewal Theorem of Box S6A: r Br = λ Ak ∑ k! k=1 Z ∞ 0 Br−k (t)[1 − H(t)]k dt, (6A.20) where λ −1 = E[T1 ]. In the case of an exponential service-time distribution, 1 − H(t) = e−µt so that r Ak Br = λ ∑ Bbr−k (kµ), (6A.21) k=1 k! where Bbr (s) is the Laplace transform of Br (t). An iterative equation for the Laplace transforms can be obtained by Laplace transforming equations (6A.16) and (6A.17), and using the convolution theorem: cr (s) fb(s), Bbr (s) = Bbr (s) fb(s) + H which, on rearranging, yields and Bbr (s) = cr (s) = H fb(s) c Hr (s), 1 − fb(s) (6A.22) r Ak b Br−k (s + kµ), k=1 k! ∑ (6A.23) Note that B0 (t) = 1 so Bb0 (s) = 1/s and, hence, Bb1 (s) = fb(s) A1 . 1 − fb(s) s + µ Equations (6A.21) and (6A.22) determine completely the steady-state Binomial moments. In particular, λ B1 ≡ hNi = A1 , (6A.24) µ and 8 1 A2 hN 2 i − hNi = λ A1 Bb1 (µ) + 2 4µ ! fb(µ) A2 λ A1 A1 + = b 2µ 1 − f (µ) 2A1 B2 ≡ (6A.25) Box S6A: The renewal equation [2]. A renewal process N(t) is defined according to N(t) = max{n : Tn ≤ t}, T0 = 0, Tn = X1 + . . . + Xn (6A.26) for n ≥ 1, where {X j } is a sequence of independent identically distributed non-negative random variables. It immediately follows that N(t) ≥ n if and only if Tn ≤ t. Typically, N(t) represents the number of occurrences of some event in the time interval [0,t]. In the case of queuing theory, this could be the number of customer batches that have arrived for service. Thus Tn is the n-th arrival time and Xn is the n-th inter-arrival time. In neurobiology, N(t) could represent the number of times a neuron has fired an action potential in a time interval [0,t], Tn is the n-th firing time, and Xn is the n-th inter-spike interval (see also Sec. 3.5). In the following we will assume that X j is strictly positive, which ensures that P[N(t) < ∞] = 1 for all t. Introduce the inter-arrival distribution F(t) = P[X1 ≤ t] and let Fk (t) = P[Tk ≤ t] be the distribution function of the k-th arrival time Tk . Clearly F1 = F. From the identity Tk+1 = Tk + Xk+1 , one has the iterative equation Fk+1 (t) = Z t 0 Fk (t − y)dF(y), k ≥ 1. (6A.27) Moreover, P[N(t) = k] = P[N(t) ≥ k] − P[N(t) ≥ k + 1] = P[Tk ≤ t] − P[Tk+1 ≤ t] = Fk (t) − Fk+1 (t). An important object in renewal theory is the renewal function m ∞ m(t) ≡ E[N(t)] = ∑ Fk (t). (6A.28) k=1 The last expression results by expressing N(t) as a sum of Heaviside functions 9 ∞ N(t) = ∑ H(t − Tk ), k=1 so that m(t) = E " ∞ # ∑ H(t − Tk ) k=1 ∞ ∞ ∑ E[H(t − Tk )] = ∑ Fk (t). = k=1 k=1 A fundamental result of renewal theory is that m satisfies the renewal equation Z t m(t) = F(t) + m(t − x)dF(x). (6A.29) 0 This can be established using the theorem of total expectation: m(t) = E[N(t)] = E[E[N(t) | X1 ]] = = = Z ∞ 0 E[N(t) | X1 = x]dF(x) 0 E[N(t) | X1 = x]dF(x) + 0 (1 + E[N(t − x)])dF(x) = Z t Z t Z ∞ t Z t 0 E[N(t) | X1 = x]dF(x) (1 + m(t − x))dF(x). We have used the fact that E[N(t) | X1 = x] = 0 if t < x. It can also be shown that m(t) = ∑∞ k=1 Fk (t) is the unique solution to the renewal equation (6A.29) that is bounded on finite intervals. An important generalization of (6A.29) is the renewal-type equation C(t) = H (t) + Z t 0 C(t − x)dF(x), (6A.30) where H is a uniformly bounded function. The solution of the latter equation is specified by the following theorem (see [2]): Theorem 6A.1. The function C(t) = H (t) + Z t 0 H (t − y)dm(y) (6A.31) is a solution of the renewal-type equation (6A.30). Moreover, if H is bounded on finite intervals then so is C, and the solution is unique. Proof. If h : [0, ∞) → R, then define the functions h ∗ m and h ∗ F by (h ∗ m)(t) = Z t 0 h(t − x)dm(x), (h ∗ F)(t) = 10 Z t 0 h(t − x)dF(x), assuming the integrals exist. From the definitions, we have m ∗ F = F ∗ m, (h ∗ m) ∗ F = h ∗ (m ∗ F). Moreover, the iterative equation (6A.27), the renewal equation (6A.29) and the solution (6A.31) can be written in the compact forms Fk+1 = Fk ∗ F = F ∗ Fk , C = H + H ∗ m. m = F + F ∗ m, Convolving C with respect to F gives C∗F = H ∗F +H ∗m∗F = H ∗ F + H ∗ (m − F) = H ∗m =C−H , which establishes that the function C is a solution of the renewal-type equation (6A.30). It remains to prove that if H is bounded on finite intervals then the solution is unique. If H is bounded on finite intervals [0, T ], then from equation (6A.31) Z t sup |C(t)| ≤ sup |H(t)| + sup H (t − y)dm(y) 0≤t≤T 0≤t≤T 0≤t≤T 0 ≤ (1 + m(T )) sup |H(t)| < ∞. 0≤t≤T Finiteness of the renewal function means that the solution C is also bounded on finite intervals. Now suppose that Cb is another bounded solution of (6A.30) b and set ∆ (t) = C(t) − C(t). Equations (6A.27) and (6A.30) imply that ∆ = ∆ ∗ F = ∆ ∗ F ∗ F = ∆ ∗ F2 = ∆ ∗ F ∗ F2 = ∆ ∗ F3 · · · , that is, ∆ = ∆ ∗ Fk , k ≥ 1. Therefore, |∆ (t)| ≤ Fk (t) sup |∆ (u)|, 0≤u≤t k ≥ 0. Finally, taking the limit k → ∞ and noting that Fk (t) = P[N(t) ≥ k] → 0 as k → ∞, we conclude that |∆ (t)| = 0 for all t. 11 We end our brief discussion of renewal theory by stating several limit theorems. Set µ = E[X1 ] < ∞. Theorem 6A.2 (Elementary Renewal Theorem). N(t) a.s. 1 → as t → ∞. t µ Theorem 6A.3 (Key Renewal Theorem). Define a random variable Y and its distribution FY to be arithmetic with span λ , λ > 0, if Y takes values in the set {mλ , m ∈ Z} with probability one, and λ is the maximal value with this property. Suppose that the first inter-arrival time X1 is notR arithmetic. If g : [0, ∞) → [0, ∞) is such that g is monotone decreasing and 0∞ g(t)dt < ∞, then Z Z t 1 ∞ g(x)dx as t → ∞. g(t − x)dm(x) → µ 0 0 Supplementary references 1. Berger, S. L.: The complex language of chromatin regulation during transcription. Nature 447, 407-412 (2007) 2. Grimmett, G. R., Stirzaker, D. R.: Probability and random processes 3rd ed. Oxford University Press, Oxford (2001). 3. Kumar, N., Singh, A., Kulkarni, R. V.: Transcriptional bursting in gene expression: analytical results for general stochastic models. PLoS Comp. Biol. 11 e1004292 (2015) 4. Liu, L., Kashyap, B. R. K., Templeton, J. G. C.: On the GI X /G/∞ system. J. Appl. Prob., 27, 671-683 (1990). 5. Schwabe, A., Rybakova, K. N., Bruggeman, F. J.: Transcription stochasticity of complex gene regulation models. Biophys J 103 1152-1161 (2011). 6. Weake, V. M., Worlkman, J. L.: Inducible gene expression: diverse regulatory mechanisms. Nat. Rev. Genet. 11 426-437 (2010) 12