6A Transcriptional bursting and queueing theory

advertisement
6A Transcriptional bursting and queueing theory
In Sec. 6.3.1 we considered a simple model of transcriptional bursting based on a
two-state model of gene regulation. In this model, a gene switches between an onand off-state and mRNA is only produced during the on-phase. However, in eukaryotic cells transcription appears to follow an ordered, multistep and cyclic process,
which involves transitions between distinct chromatin states [1, 6]. Schwabe et al.
[19] refer to this complex transcription mechanism as a molecular ratchet model, in
which the time a single gene spends in the on- and off-phases, and the time between
consecutive transcription initiation events are random variables with corresponding waiting time densities f (t), g(t) and h(t), see Fig. 6A.1. The complexity of the
underlying molecular mechanisms means that these waiting time densities are not
necessarily exponential.
6A.1 Burst distribution
Suppose that an on-state persists for some random time Ton = t and let B(t) be
the number of transcription initiation events (number of mRNA) that occur over
this time interval. (For simplicity, we assume that each mRNA generates the same
number of proteins so that burst size is measured in terms of the number of mRNA
produced during an on-phase.) Let F(b,t) = P[B(t) ≥ b|Ton = t]. If the waiting time
f(t)
OFF
ON
OFF
ON
OFF
ON
g(t)
initiation of
transcription
h(t)
Fig. 6A.1: Schematic illustration of molecular ratchet model of transcription at the single gene
level, where the waiting time distribution of the on-state, off-state, and transcription initiation are
denoted by f (t), g(t), h(t).
1
density between events is h(τ) and τ j , j = 1, . . . , b denotes the time of the j-th event,
then
F(b,t) =
Z t
0
H(t − τb )H(τb − τb−1 ) · · · H(τ2 − τ1 )dτ1 · · · dτb ≡
Z t
h(τ)(b) dτ,
0
where h(t)(b) denotes the b-th convolution of h(t). The probability that exactly b
initiation events occur in the time interval t is
Z th
i
P(b,t) = F(b,t) − F(b + 1,t) =
h(τ)(b) − h(τ)(b+1) dτ.
0
The mean burst-size given t is then
∞
hb(t)i =
∑b
b=0
Z th
0
i
h(τ)(b) − h(τ)(b+1) dτ.
Taking Laplace transforms of both sides and using the convolution theorem shows
that
∞
h(s)) ∑ bb
h(s)b
hb
b(s)i = s−1 (1 − b
b=0
∞
d
b
h(s)b
= s−1b
h(s)(1 − b
h(s))
∑
db
h(s) b=0
d
1
= s−1b
h(s)(1 − b
h(s))
db
h(s) 1 − b
h(s)
= s−1
Hence
b
h(s)
.
1−b
h(s)
hb(t)i = L
−1
A similar analysis establishes that
hb(t) i = L
2
−1
"
"
#
1 b
h(s)
(t).
s 1−b
h(s)
#
1b
h(s)(1 + b
h(s))
(t).
s (1 − b
h(s))2
(6A.1)
(6A.2)
One now obtains the moments of the burst-size distribution by integrating with respect to all time Ton = t:
hbi =
Z ∞
0
hb2 i =
hb(t)ig(t)dt,
In the case of exponential waiting times
2
Z ∞
0
b(t)2 g(t)dt.
(6A.3)
probability pb
0.04
non-exponential
0.02
exponential
0.00
20
0
40
burst size (b)
60
Fig. 6A.2: Schematic illustration of a typical burst-size distribution for exponential and nonexponential on-state time distributions.
g(t) = kg e−kg t ,
h(t) = kh e−kh t ,
we have
so that
hb(t)i = L
−1
kh
(t) = kht,
s2
It follows that
hbi =
b
h(s) =
kh
,
kh + s
hb(t) i = L
kh
,
kg
2
−1
kh 2kh + s
(t) = kh2t 2 + kht.
s2
s
hb2 i = hbi + 2hbi2 ,
(6A.4)
and thus the Fano factor is
Qb ≡
Var[b]
= 1 + hbi.
hbi
(6A.5)
The corresponding distribution of burst sizes is a geometric distribution,
pb ≡
Z ∞
0
P(b,t)g(t)dt = p(1 − p)b ,
p=
kg
.
hh + kg
Schwabe et al. [19] explore the effects of non-exponential on-state time distributions, and show that they can lead to a burst size distribution that is peaked (rather
than exponentially decreasing) and this reduces transcriptional noise, see Fig. 6A.2.
Now suppose that one combines the stochastic process that generates a distribution of burst sizes with the waiting time density f (τ) of being in the off-phase. The
resulting system can be mapped into one of the classical models of queueing theory
3
(a)
(ii) queue
(iii) exiting
customers
(i) arriving
customers
server
(b)
off-phase
Y
on-phase
(iii) degradation
(ii) accumulation of
protein in cell
(i) transcriptional bursts
Fig. 6A.3: Diagram illustrating the mapping between queuing theory and trancriptional bursting.
(a) Example of a single-server queue. (b) Transcriptional bursting. The stochastic switching of a
gene between an on-phase and an off-phase generates a a sequence of transcriptional bursts that
is analogous to the arrival of customers in the queueing model. This results in the accumulation
of proteins within the cell, which is the analog of a queue. Degradation corresponds to exiting of
customers after being serviced by an infinite number of servers.
[3], see Fig. 6A.3. Queuing theory concerns the mathematical analysis of waiting
lines formed by customers randomly arriving at some service station, and staying
in the system until they receive service from a group of servers. Different types of
queueing process are defined in terms of (i) the stochastic process underlying the arrival of customers, (ii) the distribution of the number of customers (batches) in each
arrival, (iii) the stochastic process underlying the departure of customers (servicetime distribution), and (iv) the number of servers. The above model of transcriptional bursting can be mapped to a queueing process as follows: individual mRNAs
are analogous to customers, transcriptional bursts correspond to customers arriving
in batches, and the degradation of mRNAs is the analog of customers exiting the system after being serviced. Thus, the waiting-time density for mRNA degradation is
the analog of the service-time distribution. Finally, since the mRNAs are degraded
independently of each other, the effective number of servers in the corresponding
queueing model is infinite, that is, the presence of other customers does not affect
the service time of an individual customer.
4
The particular queuing model that maps to the model of transcriptional bursting
is the GI X /M/∞ system, Here the symbol G denotes a general waiting time density
for the arrival process, that is, the waiting time density f (τ) for the gene to switch
from the off-phase to the on-phase, and I X refers to the fact that customers (mRNAs)
arrive in batches of independently distributed random sizes B (mRNA burst sizes).
The symbol M stands for a Markovian or exponential service-time density (mRNA
degradation-time density), and ‘∞’ denotes infinite servers. Exploiting this mapping,
exact results for the moments of the mRNA steady-state distribution can be obtained
from the corresponding known expressions for the moments of the steady-state distribution of the number of current customers in the GI X /M/∞ model [3]. The latter
were originally derived by Liu et al. [4], and we will follow their analysis in Sec.
6A.2. Here we simply quote the results for the mean and variance of the steady-state
mRNA copy number N:
λ
(6A.6)
hNi = hbi,
µ
and
fb(µ)
hb2 i − hbi
+
2hbi2
1 − fb(µ)
µ
Var[N] = hNi − hNi + hNi2
λ
2
!
(6A.7)
In the special case that the arrival times are also exponentially distributed, fb(s) =
λ /(s + λ ), and the variance simplifies to
Var[N] =
λ 2
hb i.
µ
(6A.8)
6A.2 Moments of GI X /M/∞ queueing model [4]
Let F(t) and H(t) be the probability distributions for the interarrival times and the
service times, respectively. It is assumed that the mean arrival rate λ and the mean
service rate µ are finite. Since the distribution of service times is taken to be exponential, we have S(t) = 1 − e−µt . The arrivals are taken to occur in batches of
variable size X, where
P[X = b] = pb ,
b = 1, 2, . . . ,
with finite mean and variance, and generating function
∞
A(z) =
∑ pb zb .
r=1
It follows that for any positive integer k, the k-th factorial moment of X is given by
5
Ak =
∞
d k A(z) = ∑ b(b − 1) · · · (b − k + 1)pb .
k
dz
z=1
r=k
(6A.9)
Let Tn be the time of the n-th group arrival, with Xn denoting the size of the batch
and Sni the service time of the ith member of the batch. The number of busy servers
at time is then
(6A.10)
N(t) = ∑ χ(t − Tn , Xn ),
0≤Tn ≤t
where
Xn
χ(t − Tn , Xn ) = ∑ I(t − Tn , Sni ),
i=1
I(t − Tn , Sni ) =
1 if t − Tn ≤ Sni
0 if t − Tn > Sni
(6A.11)
In other words, we are keeping track of all batches that arrived prior to time t and
the number of customers in each batch that haven’t yet exited the system. Introduce
the generating function
∞
G(z,t) =
∑ zk P[N(t) = k],
(6A.12)
k=0
and the binomial moments
∞
Br (t) =
k!
∑ (k − r)!r! P[N(t) = k],
r = 1, 2, . . . .
(6A.13)
k=r
Suppose that the system is empty at time t = 0. We shal derive an integral equation for the generating function G(z,t). Conditioning on the first arrival time by
setting T1 = y, it follows that N(t) = χ(t − y, X1 ) + N ∗ (t − y) if y ≤ t and zero otherwise, where χ(t − y, X1 ) and N ∗ (t − y) are independent of each other, and N ∗ (t) has
the same distribution as N(t). Moreover,
P[I(t − y, S1i ) = k] = P[t − y ≤ S1i ]δk,1 + P[t − y > S1i ]δk,0
= [1 − H(t − y)]δk,1 + H(t − y)δk,0 ,
so that
∞
∑ zk P[I(t − y, S1i ) = k] = z + (1 − z)H(t − y).
k=0
Since I(t − y, S1i ) for i = 1, 2, . . . are independent and identically distributed, the
theorem of total expectation implies that
6
E[zχ(t−y,X1 ) ] = E[E[zχ(t−y,X1 ) |X1 ]] =
b
∞
=
∑ pb ∏ E[z
b=1
∞
=
I(t−y,S1i )
∞
b
∑ pb E[z∑i=1 I(t−y,S1i ) ]
b=1
b
∞
]=
i=1
∞
!
k
∑ pb ∏ ∑ z P[I(t − y, S1i ) = k]
i=1
b=1
k=0
∑ pb [z + (1 − z)H(t − y)]b = A[z + (1 − z)H(t − y)].
b=1
Another application of the theorem of total expectation gives
G(z,t) = E[zN(t) ] = E[E[zN(t) |y]]
=
Z ∞
t
Z t
E[z0 ]dF(y) +
= 1 − F(t) +
Z t
0
0
E[zχ(t−y,X1 ) ]E[zN
∗ (t−y)
]dF(y)
G(z,t − y)A[z + (1 − z)H(t − y)]dF(y).
(6A.14)
One can now obtain an iterative equation for the Binomial moments by differentiating equation (6A.14 multiple times with respect to z and using
1 d r G(z,t) .
Br (t) =
r!
dzr z=1
Noting that
dr
= [1 − H(t − y)]r Ar ,
A[z + (1 − z)H(t − y)]
r
dz
z=1
r = 1, 2, . . .
we obtain the integral equation
Br (t) =
Z t
0
r
Ak
k=1 k!
Br (t − y)dF(y) + ∑
Z t
0
Br−k (t − y)[1 − H(t − y)]k dF(y). (6A.15)
This can be written in the more compact form
Br (t) =
Z t
0
Br (t − y)dF(y) +
where
Hr (t) =
Z t
0
Hr (t − y)dF(y),
(6A.16)
r
Ak
Br−k (t)[1 − H(t)]k .
k!
k=1
∑
(6A.17)
Equation (6A.16) is an example of a renewal-type equation, see equation (6A.30) of
Box S6A, which has the unique solution
Br (t) = Hr (t) +
Z t
0
where
7
Hr (t − y)dm(y),
(6A.18)
∞
m(y) =
Fn (y) = P[Tn ≤ y]
∑ Fn (y),
n=1
(6A.19)
is the so-called renewal function.
The steady-state Binomial moments can now be obtained by taking the limit
t → ∞ in equation (6A.18) and using the Key Renewal Theorem of Box S6A:
r
Br = λ
Ak
∑ k!
k=1
Z ∞
0
Br−k (t)[1 − H(t)]k dt,
(6A.20)
where λ −1 = E[T1 ]. In the case of an exponential service-time distribution, 1 −
H(t) = e−µt so that
r
Ak
Br = λ ∑ Bbr−k (kµ),
(6A.21)
k=1 k!
where Bbr (s) is the Laplace transform of Br (t). An iterative equation for the Laplace
transforms can be obtained by Laplace transforming equations (6A.16) and (6A.17),
and using the convolution theorem:
cr (s) fb(s),
Bbr (s) = Bbr (s) fb(s) + H
which, on rearranging, yields
and
Bbr (s) =
cr (s) =
H
fb(s) c
Hr (s),
1 − fb(s)
(6A.22)
r
Ak b
Br−k (s + kµ),
k=1 k!
∑
(6A.23)
Note that B0 (t) = 1 so Bb0 (s) = 1/s and, hence,
Bb1 (s) =
fb(s)
A1
.
1 − fb(s) s + µ
Equations (6A.21) and (6A.22) determine completely the steady-state Binomial
moments. In particular,
λ
B1 ≡ hNi = A1 ,
(6A.24)
µ
and
8
1
A2
hN 2 i − hNi = λ A1 Bb1 (µ) +
2
4µ
!
fb(µ)
A2
λ A1
A1 +
=
b
2µ 1 − f (µ)
2A1
B2 ≡
(6A.25)
Box S6A: The renewal equation [2].
A renewal process N(t) is defined according to
N(t) = max{n : Tn ≤ t}, T0 = 0,
Tn = X1 + . . . + Xn
(6A.26)
for n ≥ 1, where {X j } is a sequence of independent identically distributed
non-negative random variables. It immediately follows that
N(t) ≥ n if and only if Tn ≤ t.
Typically, N(t) represents the number of occurrences of some event in the
time interval [0,t]. In the case of queuing theory, this could be the number of
customer batches that have arrived for service. Thus Tn is the n-th arrival time
and Xn is the n-th inter-arrival time. In neurobiology, N(t) could represent the
number of times a neuron has fired an action potential in a time interval [0,t],
Tn is the n-th firing time, and Xn is the n-th inter-spike interval (see also Sec.
3.5). In the following we will assume that X j is strictly positive, which ensures
that P[N(t) < ∞] = 1 for all t.
Introduce the inter-arrival distribution F(t) = P[X1 ≤ t] and let Fk (t) =
P[Tk ≤ t] be the distribution function of the k-th arrival time Tk . Clearly F1 = F.
From the identity Tk+1 = Tk + Xk+1 , one has the iterative equation
Fk+1 (t) =
Z t
0
Fk (t − y)dF(y),
k ≥ 1.
(6A.27)
Moreover,
P[N(t) = k] = P[N(t) ≥ k] − P[N(t) ≥ k + 1] = P[Tk ≤ t] − P[Tk+1 ≤ t]
= Fk (t) − Fk+1 (t).
An important object in renewal theory is the renewal function m
∞
m(t) ≡ E[N(t)] =
∑ Fk (t).
(6A.28)
k=1
The last expression results by expressing N(t) as a sum of Heaviside functions
9
∞
N(t) =
∑ H(t − Tk ),
k=1
so that
m(t) = E
"
∞
#
∑ H(t − Tk )
k=1
∞
∞
∑ E[H(t − Tk )] = ∑ Fk (t).
=
k=1
k=1
A fundamental result of renewal theory is that m satisfies the renewal equation
Z t
m(t) = F(t) + m(t − x)dF(x).
(6A.29)
0
This can be established using the theorem of total expectation:
m(t) = E[N(t)] = E[E[N(t) | X1 ]]
=
=
=
Z ∞
0
E[N(t) | X1 = x]dF(x)
0
E[N(t) | X1 = x]dF(x) +
0
(1 + E[N(t − x)])dF(x) =
Z t
Z t
Z ∞
t
Z t
0
E[N(t) | X1 = x]dF(x)
(1 + m(t − x))dF(x).
We have used the fact that E[N(t) | X1 = x] = 0 if t < x. It can also be shown
that m(t) = ∑∞
k=1 Fk (t) is the unique solution to the renewal equation (6A.29)
that is bounded on finite intervals. An important generalization of (6A.29) is
the renewal-type equation
C(t) = H (t) +
Z t
0
C(t − x)dF(x),
(6A.30)
where H is a uniformly bounded function. The solution of the latter equation
is specified by the following theorem (see [2]):
Theorem 6A.1. The function
C(t) = H (t) +
Z t
0
H (t − y)dm(y)
(6A.31)
is a solution of the renewal-type equation (6A.30). Moreover, if H is bounded
on finite intervals then so is C, and the solution is unique.
Proof. If h : [0, ∞) → R, then define the functions h ∗ m and h ∗ F by
(h ∗ m)(t) =
Z t
0
h(t − x)dm(x),
(h ∗ F)(t) =
10
Z t
0
h(t − x)dF(x),
assuming the integrals exist. From the definitions, we have
m ∗ F = F ∗ m,
(h ∗ m) ∗ F = h ∗ (m ∗ F).
Moreover, the iterative equation (6A.27), the renewal equation (6A.29) and
the solution (6A.31) can be written in the compact forms
Fk+1 = Fk ∗ F = F ∗ Fk ,
C = H + H ∗ m.
m = F + F ∗ m,
Convolving C with respect to F gives
C∗F = H ∗F +H ∗m∗F
= H ∗ F + H ∗ (m − F)
= H ∗m =C−H ,
which establishes that the function C is a solution of the renewal-type equation
(6A.30). It remains to prove that if H is bounded on finite intervals then the
solution is unique.
If H is bounded on finite intervals [0, T ], then from equation (6A.31)
Z t
sup |C(t)| ≤ sup |H(t)| + sup H (t − y)dm(y)
0≤t≤T
0≤t≤T
0≤t≤T
0
≤ (1 + m(T )) sup |H(t)| < ∞.
0≤t≤T
Finiteness of the renewal function means that the solution C is also bounded
on finite intervals. Now suppose that Cb is another bounded solution of (6A.30)
b
and set ∆ (t) = C(t) − C(t).
Equations (6A.27) and (6A.30) imply that
∆ = ∆ ∗ F = ∆ ∗ F ∗ F = ∆ ∗ F2 = ∆ ∗ F ∗ F2 = ∆ ∗ F3 · · · ,
that is,
∆ = ∆ ∗ Fk ,
k ≥ 1.
Therefore,
|∆ (t)| ≤ Fk (t) sup |∆ (u)|,
0≤u≤t
k ≥ 0.
Finally, taking the limit k → ∞ and noting that
Fk (t) = P[N(t) ≥ k] → 0 as k → ∞,
we conclude that |∆ (t)| = 0 for all t.
11
We end our brief discussion of renewal theory by stating several limit theorems. Set µ = E[X1 ] < ∞.
Theorem 6A.2 (Elementary Renewal Theorem).
N(t) a.s. 1
→ as t → ∞.
t
µ
Theorem 6A.3 (Key Renewal Theorem). Define a random variable Y and
its distribution FY to be arithmetic with span λ , λ > 0, if Y takes values in
the set {mλ , m ∈ Z} with probability one, and λ is the maximal value with
this property. Suppose that the first inter-arrival time X1 is notR arithmetic. If
g : [0, ∞) → [0, ∞) is such that g is monotone decreasing and 0∞ g(t)dt < ∞,
then
Z
Z t
1 ∞
g(x)dx as t → ∞.
g(t − x)dm(x) →
µ 0
0
Supplementary references
1. Berger, S. L.: The complex language of chromatin regulation during transcription. Nature 447,
407-412 (2007)
2. Grimmett, G. R., Stirzaker, D. R.: Probability and random processes 3rd ed. Oxford University
Press, Oxford (2001).
3. Kumar, N., Singh, A., Kulkarni, R. V.: Transcriptional bursting in gene expression: analytical
results for general stochastic models. PLoS Comp. Biol. 11 e1004292 (2015)
4. Liu, L., Kashyap, B. R. K., Templeton, J. G. C.: On the GI X /G/∞ system. J. Appl. Prob., 27,
671-683 (1990).
5. Schwabe, A., Rybakova, K. N., Bruggeman, F. J.: Transcription stochasticity of complex gene
regulation models. Biophys J 103 1152-1161 (2011).
6. Weake, V. M., Worlkman, J. L.: Inducible gene expression: diverse regulatory mechanisms. Nat.
Rev. Genet. 11 426-437 (2010)
12
Download