Using the FSP to Compute Switch Rates

advertisement
Stochastic Gene Expression: Modeling, Analysis,
and Identification
Mustafa Khammash
io
at
n
C en
te
r
r
fo
y
mic
na
m
nd
sa
CC DC
Control, D
Comp
ut
University of California, Santa Barbara
al Sys
te
Munsky; q-bio
Stochastic Influences on Phenotype
Fingerprints of identical twins
Cc, the first cloned cat and her genetic mother
J. Raser and E. O’Shea, Science, 1995.
J. Raser and E. O’Shea, Science, 1995.
gen
variability in gene expression
Piliated
Elowitz et al, Science 2002
gen
gen
..
Unpiliated
gen
Modeling Gene Expression
γp
φ
protein
kp
γr
mRNA
kr
DNA
φ
Modeling Gene Expression
Deterministic model
γp
φ
protein
kp
γr
mRNA
kr
DNA
φ
Modeling Gene Expression
γp
φ
protein
kp
γr
mRNA
kr
DNA
φ
Modeling Gene Expression
Stochastic model
γp
φ
protein
• Probability a single mRNA is degraded in
time dt is (#mRN A) · γr dt
kp
γr
mRNA
kr
DNA
• Probability a single mRNA is transcribed in
time dt is kr dt.
φ
Modeling Gene Expression
Stochastic model
γp
φ
protein
• Probability a single mRNA is transcribed in
time dt is kr dt.
• Probability a single mRNA is degraded in
time dt is (#mRN A) · γr dt
kp
γr
mRNA
φ
kr
DNA
...
γp
φ
protein
Protein Molecules
Fluctuations at Small Copy Numbers
800
600
400
200
0
0
kp
DNA
φ
mRNA Molecules
kr
2000
3000
4000
5000
Time (s)
γr
mRNA
1000
25
20
15
10
5
0
0
1000
2000
3000
Time (s)
4000
5000
Fluctuations at Small Copy Numbers
γp
φ
(protein)
protein
kp
γr
mRNA
kr
DNA
(mRNA)
φ
Cv = coefficient of variation =
standard deviation
mean
Mass-Action Models Are Inadequate
400
Number of Molecules of P
350
stochastic
300
250
200
deterministic
150
100
50
0
10
20
30
40
50
60
70
80
90
Time
ks reduced by 50%
• Stochastic mean value different from deterministic steady state
• Noise enhances signal!
Johan Paulsson , Otto G. Berg , and Måns Ehrenberg, PNAS 2000
100
Stochastic Modeling: A Simple Example
γ
mRNA
φ
Transcription: Probability a single mRNA
is transcribed in time dt is kr dt
k
Degradation: Probability a single mRNA
is degraded in time dt is nγdt
DNA
k
N
mRNA copy number N (t) is a random variable
0
k
1
γ
k
.....
2
2γ
k
3γ
k
n
n−1
(n − 1)γ
k
nγ
n+1
(n + 1)γ
.....
k
k
k
k
k
k
Key Question:
0
1
γ
.
2
2γ
3γ
n
n−1
(n − 1)γ
nγ
n+1
.
(n + 1)γ
Find p(n, t), the probability that N (t) = n.
P (n, t + dt) = P (n − 1, t) · kdt
+ P (n + 1, t) · (n + 1)γdt
Prob.{N (t) = n − 1 and mRNA created in [t,t+dt)}
Prob.{N (t) = n + 1 and mRNA degraded in [t,t+dt)}
+ P (n, t) · (1 − kdt)(1 − nγdt) Prob.{N (t) = n and
mRNA not created nor degraded in [t,t+dt)}
P (n, t + dt) − P (n, t) = P (n − 1, t)kdt + P (n + 1, t)(n + 1)γdt − P (n, t)(k + nγ)dt
+O(dt2)
Dividing by dt and taking the limit as dt → 0
The Chemical Master Equation
d
P (n, t) = kP (n − 1, t) + (n + 1)γP (n + 1, t) − (k + nγ)P (n, t)
dt
mRNA Stationary Distribution
We look for the stationary distribution P (n, t) = p(n) ∀t
d P (n, t) = 0
The stationary solution satisfies: dt
From the Master Equation ...
(k + nγ)p(n) = kp(n − 1) + (n + 1)γp(n + 1)
n=0
kp(0) = γp(1)
n=1
kp(1) = 2γp(2)
n=2
kp(2) = 3γp(3)
...
kp(n − 1) = nγ p(n)
Poisson, a = 3
!#&$"
probability
!#&"
!#%$"
!#%"
!#!$"
!"
0
1
2
3
4
5
6
7
8
9
mRNA count
(n)
Stationary distribution:
n
a
P (n) = e−a
n!
k
a=
γ
Poisson Distribution
Formulation of Stochastic Chemical Kinetics
Reaction volume=Ω
Formulation of Stochastic Chemical Kinetics
Reaction volume=Ω
Key Assumptions
(Well-Mixed) The probability of finding any molecule in a region dΩ is
given by dΩ
Ω.
Formulation of Stochastic Chemical Kinetics
Reaction volume=Ω
Key Assumptions
(Well-Mixed) The probability of finding any molecule in a region dΩ is
given by dΩ
Ω.
(Thermal Equilibrium) The molecules move due to the thermal energy.
The reaction volume is at a constant temperature T . The velocity of a
molecule is determined according to a Boltzman distribution:
fvx (v) = fvy (v) = fvz (v) =
!
m
− 2km T v 2
B
e
2πkB T
Population: X(t) = [X1(t), . . . , XN (t)]T (integer r.v.)
• (M -reactions) The system’s state
can change through any one of
M reaction: Rµ : µ ∈ {1, 2, . . . , M }..
population of S2
Example:
R1
φ → S1
R2
S1 + S 2 → S1
S1 → φ
R3
1
2
3
6
4
5
7
8
• (State transition) An Rµ reaction causes a state transition
from x to x + sµ.
s1 =
!
"
1
; s2 =
0
!
0
−1
"
; s3 =
!
Stoichiometry matrix:
S=
!
s1 s2 . . . s M
population of S1
• (Transition Probability) Probability that Rµ reaction will occur
in the next dt time units is: wµ(x)dt
Example: w1(x) = c1; w2(x) = c2 · x1x2; w3(x) = c3x1;
"
−1
0
"
Characterizing X(t)
X(t) is Continuous-time discrete-state Markov Chain
Sample Path Representation:
X(t) = X(0) +
M
!
k=1
sk Yk
"# t
0
wk (X(s))ds
$
Yk [·] are independent unit Poisson
The Chemical Master Equation (Forward Kolmogorov Equation)
!
!
dp(x, t)
wk (x) +
p(x − sk , t)wk (x)
= −p(x, t)
dt
k
k
p(x, t) := prob(X(t) = x)
From Stochastic to Deterministic
Define X Ω(t) = X(t)
Ω .
Question: How does X Ω(t) relate to Φ(t)?
Fact: Let Φ(t) be the deterministic solution to the reaction rate equations
dΦ
= Sf (Φ), Φ(0) = Φ0.
dt
Let X Ω(t) be the stochastic representation of the same chemical systems with X Ω(0) = Φ0. Then for every t ≥ 0:
!
!
! Ω
!
lim sup !X (s) − Φ(s)! = 0 a.s.
Ω→∞ s≤t
Simulation and Analysis Tools
•
•
•
•
Sample Paths Computations
Moment Computation
SDE Approximation
Density Computations
1. Sample Paths Computation
Gillespie’s Stochastic Simulation Algorithm:
To each of the reactions {R1 , . . . , RM } we associate a RV τi :
τi is the time to the next firing of reaction Ri
Fact 0: τi is exponentially distributed with parameter wi
We define two new RVs:
τ = min{τi }
i
µ = arg min{τi}
i
(Time to the next reaction)
(Index of the next reaction)
Fact 1: τ is exponentially distributed with parameter
wk
Fact 2: P (µ = k) = !
w
i i
!
i
wi
Stochastic Simulation Algorithm
• Step 0 Initialize time t and state population x
• Step 1 Draw a sample τ from the distribution of τ
1
Cumulative distribution of τ : F (t) = 1 − exp(−
r1 ∈ U ([0, 1])
!
k
wk t)
1
τ = ! 1w log 1−r
1
k k
0
time (s)
• Step 2 Draw a sample µ from the distribution of µ
Cumulative distribution of µ
1
!
(w1 + w2 + w3 + w4)/ k wk
!
(w1 + w2 + w3)/ k wk
r2 ∈ U ([0, 1])
µ
(w1 + w2)/
w1 /
0
1
2
3
4
reaction index
5
• Step 3 Update time: t ← t + τ . Update state: x ← x + sµ .
!
k wk
!
k wk
Stochastic Simulation Algorithm: Matlab code
clear all
t=0;tstop = 2000;
x = [0; 0];
S = [1 -1 0 0; 0 0 1 -1];
w = inline('[10, 1*x(1), 10*x(1), 1*x(2)]','x');
while t<tstop
a = w(x);
w0 = sum(a);
%
t = t+1/w0*log(1/rand);
if t<=tstop
r2w0=rand*w0;
% generate second
i=1;
while sum(a(1:i))<r2w0
% increment
i=i+1;
end
x = x+S(:,i);
end
end
%%specify initial and final times
%% Specify initial conditions
%% Specify stoichiometry
%% Specify Propensity functions
% compute the prop. functions
compute the sum of the prop. functions
% update time of next reaction
random number and multiply by prop. sum
% initialize reaction counter
counter until sum(a(1:i)) exceeds r2w0
% update the configuration
2. Moment Computations
Let w(x) = [w1 (x), . . . , wM (x)]T be the vector of propensity functions
Moment Dynamics
dE[X]
= S E[w(X)]
dt
dE[XX T ]
= SE[w(X)X T ] + E[XwT (X)]S T + S diag(E[w(X)]) S T
dt
• Affine propensity. Closed moment equations.
• Quadratic propensity. Not generally closed.
– Mass Fluctuation Kinetics (Gomez-Uribe, Verghese)
– Derivative Matching (Singh, Hespanha)
Affine Propensity
Suppose the propensity function is affine:
w(x) = W x + w0,
(W is N × N , w0 is N × 1)
Then E[w(X)] = W E[X]+w0, and E[w(X)X T ] = W E[XX T ]+w0E[X T ].
This gives us the moment equations:
d
E[X] = SW E[X] + Sw0
dt
First Moment
d
E[XX T ] = SW E[XX T ] + E[XX T ]W T S T + S diag(W E[X] + w0)S T
dt
+ Sw0E[X T ] + E[X]w0T S T
Second Moment
These are linear ordinary differential equations and can be easily solved!
Application to Gene Expression
Reactants
X1(t) is # of mRNA; X2(t) is # of protein
Reactions
γp
φ
kr
R1 : φ −→ mRN A
γr
R2 : mRN A −→ φ
protein
kp
R3 : mRN A −→ protein + mRN A
γp
kp
γr
mRNA
kr
DNA
R4 : protein −→ φ
φ
Stoichiometry and Propensity
"
!
1 −1 0 0
S=
0 0 1 −1






k
0 0
kr
 r 


'

(
γ X 
γ
0

 r 1
 r 0  X1
 
w(X) = 
+ 
=

kpX1
kp 0  X2
0



 

γpX2
0 γp
0
W
w0
Steady-State Moments
!
"
−γr 0
A = SW =
,
kp −γp
!
kr
Sw0 =
0
"


kr
 γr 


X̄ = −A−1Sw0 = 

k k 
p r
γp γr
Steady-State Covariance


2kr
0
2kp kr 
0
γr
BB T = S diag(W X̄ + w0)S T = 
The steady-state covariances equation
AΣ̄ + Σ̄AT + BB T = 0
can be solved algebraically for Σ̄.



Σ̄ = 


kr
γr
kp kr
γr (γr +γp )
kp kr
γr (γr +γp )
kp kr
kp
(1
+
γp γr
γr +γp )






Lyapunov Equation
3. SDE Approximation
Let X (t) :=
Ω
X(t)
Ω
Write X Ω = Φ0(t) + √1 V Ω where Φ0(t) solves the deterministic RRE
Ω
dΦ
= Sf (Φ)
dt
Linear Noise Approximation
V Ω(t) → V (t) as Ω → ∞,
d[Sf (Φ)]
A(t) =
(Φ0(t)),
dΦ
where dV (t) = A(t)V (t)dt + B(t)dWt
!
B(t) := S diag[f (Φ0(t))]
Linear Noise Approximation: X Ω(t) ≈ Φ(t) + √1 V (t)
Ω
Linear Noise Approximation: Stationary Case
Multiplying X Ω(t) ≈ Φ̄ + √1 V (t) by Ω, we get
Ω
X(t) ≈ ΩΦ̄ +
deterministic
√
ΩV (t)
zero mean
stochastic
E[X(t)] = ΩΦ̄
Let Σ̄ be the steady-state covariance matrix of
AΣ̄ + Σ̄AT + ΩBB T = 0
√
Ω · V (t). Then
(white gaussian noise)
ω
Ẏ = AY +
√
ΩB ω
Y (t) =
√
ΩV (t)
X(t)
Ωφ̄ (mean)
+
4. Density Computation
Form the probability density state vector
The Chemical Master Equation (CME):
can now be written in matrix form:
:
The Finite State Projection Approach
The Finite State Projection Approach
•
A finite subset is appropriately
chosen
The Finite State Projection Approach
•
A finite subset is appropriately
chosen
•
The remaining (infinite) states are
projected onto a single state (red)
The Finite State Projection Approach
•
A finite subset is appropriately
chosen
•
The remaining (infinite) states are
projected onto a single state (red)
•
Only transitions into removed
states are retained
The projected system can be solved exactly!
Finite Projection Bounds
Theorem [Projection Error Bounds] Consider any Markov
process described by the Forward Kolmogorov Equation:
Ṗ(XJ ; t) = A · P(XJ ; t).
If for an indexing vector J: 1T exp(AJ T )P(XJ ; 0) ≥ 1 − !, then
!"
# "
#!
!
! P(X ; t)
exp(A
t)
P
(X
;
0)
!
!
J
J
J
−
!
! <!
!
! P(XJ # ; t)
0
1
Munsky B. and Khammash M., Journal of Chemical Physics, 2006
t ∈ [0, T ]
Applications of FSP
•
•
•
•
Feedback Analysis
Synthetic Switch Analysis
Epigenetic Switch Analysis
System Identification
Application: Noise Attenuation through Feedback
0.045
γp
γp
φ
protein
φ
0.04
protein
kp
p
γr
mRNA
feedback
0.03
γr
φ
mRNA
k1 = 0.2
more feedback
no feedback
k
0.035
φ
0.025
0.02
kr
k0 − k1 · (# protein)
DNA
0.015
DNA
k1 = 0.1
k1 = 0.05
k1 = 0
k1 = −0.05
0.01
0.005
µ∗p
=
Mean
Variance
!
"
b
+ 1 µ∗p
1+η
=
µ∗p
Variance
!
"
1−φ
b
·
+ 1 µ∗p
1 + bφ 1 + η
k1
kp
γp
where φ =
, b= , η=
γp
γr
γr
0
0
50
100
γp = γr = 1
<1
Thattai, van Oudenaarden
Protein variance is always smaller with negative feedback!
150
kp = 10;
200
Analysis of Stochastic Switchs
s2 v
Two repressors, u and v.
s2 Gene
Gardner, et al., Nature 403, 339-342 (2000)
s2 Promoter
s1 Promoter
s1
u
s1
Gene
Analysis of Stochastic Switchs
s2 v
Two repressors, u and v.
s2 Gene
Gardner, et al., Nature 403, 339-342 (2000)
s2 Promoter
s1 Promoter
s1
u
v inhibits the production !of u:"
1
0
α1
a1 (u, v) =
ν1 =
β
1+v
u inhibits the production of v:
α2
a3 (u, v) =
1 + uγ
ν3 =
!
0
1
"
u and v degrade exponentially:
a2 (u, v) = u ν2 =
!
a4 (u, v) = v ν4 =
!
−1
0
"
0
−1
"
s1
Gene
Analysis of Stochastic Switchs
s2 v
Two repressors, u and v.
s2 Gene
Gardner, et al., Nature 403, 339-342 (2000)
s2 Promoter
s1 Promoter
s1
Gene
s1
u
v inhibits the production !of u:"
1
0
α1
a1 (u, v) =
ν1 =
β
1+v
u inhibits the production of v:
α2
a3 (u, v) =
1 + uγ
ν3 =
!
0
1
"
u and v degrade exponentially:
a2 (u, v) = u ν2 =
!
a4 (u, v) = v ν4 =
!
−1
0
"
0
−1
"
α1 = 50
β = 2.5
α2 = 16
γ=1
u(0) = v(0) = 0
Modeling of a DAM Epigenetic Switch using FSP
Presented section contains unpublished data and is not included
in the online version
Using Noise to Identify Model Parameters
Presented section contains unpublished data and is not included in
the online version
Conclusions
Conclusions
• Fluctuations may be very important
Conclusions
• Fluctuations may be very important
• Cell variability
Conclusions
• Fluctuations may be very important
• Cell variability
• Cell fate decisions
Conclusions
• Fluctuations may be very important
• Cell variability
• Cell fate decisions
• Some tools are available
Conclusions
• Fluctuations may be very important
• Cell variability
• Cell fate decisions
• Some tools are available
• Monte Carlo simulations (SSA and variants)
Conclusions
• Fluctuations may be very important
• Cell variability
• Cell fate decisions
• Some tools are available
• Monte Carlo simulations (SSA and variants)
• Moment approximation methods
Conclusions
• Fluctuations may be very important
• Cell variability
• Cell fate decisions
• Some tools are available
• Monte Carlo simulations (SSA and variants)
• Moment approximation methods
• Linear noise approximation (Van Kampen)
Conclusions
• Fluctuations may be very important
• Cell variability
• Cell fate decisions
• Some tools are available
• Monte Carlo simulations (SSA and variants)
• Moment approximation methods
• Linear noise approximation (Van Kampen)
• Finite State Projection
Conclusions
• Fluctuations may be very important
• Cell variability
• Cell fate decisions
• Some tools are available
• Monte Carlo simulations (SSA and variants)
• Moment approximation methods
• Linear noise approximation (Van Kampen)
• Finite State Projection
• Cellular noise reveals network parameters and enables model identification
Conclusions
• Fluctuations may be very important
• Cell variability
• Cell fate decisions
• Some tools are available
• Monte Carlo simulations (SSA and variants)
• Moment approximation methods
• Linear noise approximation (Van Kampen)
• Finite State Projection
• Cellular noise reveals network parameters and enables model identification
• Stationary moments are not sufficient for full identifiability
Conclusions
• Fluctuations may be very important
• Cell variability
• Cell fate decisions
• Some tools are available
• Monte Carlo simulations (SSA and variants)
• Moment approximation methods
• Linear noise approximation (Van Kampen)
• Finite State Projection
• Cellular noise reveals network parameters and enables model identification
• Stationary moments are not sufficient for full identifiability
• Small number of transient measurements of noise is sufficient for identifiability
Conclusions
• Fluctuations may be very important
• Cell variability
• Cell fate decisions
• Some tools are available
• Monte Carlo simulations (SSA and variants)
• Moment approximation methods
• Linear noise approximation (Van Kampen)
• Finite State Projection
• Cellular noise reveals network parameters and enables model identification
• Stationary moments are not sufficient for full identifiability
• Small number of transient measurements of noise is sufficient for identifiability
• Finite State Projection allows the use of master equation solution for
identification
Conclusions
• Fluctuations may be very important
• Cell variability
• Cell fate decisions
• Some tools are available
• Monte Carlo simulations (SSA and variants)
• Moment approximation methods
• Linear noise approximation (Van Kampen)
• Finite State Projection
• Cellular noise reveals network parameters and enables model identification
• Stationary moments are not sufficient for full identifiability
• Small number of transient measurements of noise is sufficient for identifiability
• Finite State Projection allows the use of master equation solution for
identification
• Cellular noise (process noise) vs. measurement noise (output noise)
Acknowledgement • Brian Munsky, UCSB (now at LANL)
FSP, Pap Switch, and ID with Noise
• David Low, UCSB
Pap Switch
• Brooke Trinh, UCSB
Pap Switch, ID with Noise
Download