STOCHASTIC CHEMICAL KINETICS

Dan Gillespie
Dan T Gillespie Consulting
GillespieDT@mailaps.org

Current Support: Caltech (NIGMS); Caltech (Beckman Institute / BNMC); University of California at Santa Barbara (DOE)
Past Support: Molecular Sciences Institute (Sandia / DOE); Caltech (DARPA / AFOSR); ONR

A Chemically Reacting System consists of …
• Molecules of $N$ chemical species $S_1, \ldots, S_N$.
  - Inside a volume $\Omega$, at some temperature $T$.
• $M$ “elemental” reaction channels $R_1, \ldots, R_M$.
  - We assume $R_j$ to be a single instantaneous physical event that changes the population of at least one species.
  - In practice, “elemental” means that $R_j$ must be one of two types:
    Unimolecular: $S_i \to$ something else, or
    Bimolecular: $S_i + S_{i'} \to$ something else.
  - All other types of reaction (trimolecular, reversible, etc.) are made up of a series of two or more elemental reactions. E.g.: $S_1 + S_2 + S_3 \to S_4 + S_5$ is typically $S_1 + S_2 \rightleftharpoons S_{12}$ together with $S_{12} + S_3 \to S_4 + S_5$.

Question: How does a spatially homogeneous (or well-stirred) chemically reacting system evolve in time?

The Trad Answer: According to the reaction rate equation (RRE).
• A set of coupled, first-order ODEs.
• Derived using ad hoc, phenomenological reasoning.
  o Is more than the “mass action equations” of thermodynamics, which apply only to systems in equilibrium.
• Implies the system evolves continuously and deterministically, even though molecules come in integer numbers and react stochastically.
• Is empirically accurate for large (test-tube-size) systems, but is sometimes not adequate for very small (cell-size) systems.
*** This is a “hand-wavy” answer to an important question.

Doing it “right”: Molecular Dynamics
• The most exact way of describing the system’s evolution.
• The “motion picture” approach: Tracks the position and velocity of every molecule in the system.
• Simulates every collision, non-reactive as well as reactive.
• Shows changes in species populations and their spatial distributions.
• But . . .
it’s unfeasibly slow for nearly all realistic systems.

A great simplification occurs if successive reactive collisions tend to be separated in time by very many non-reactive collisions.
• The overall effect of the non-reactive collisions is to randomize
  - the velocities of the molecules (Maxwell-Boltzmann distribution),
  - the positions of the molecules (spatially uniform or well-stirred).
• Then, instead of having to describe the system’s state as the position, velocity and species of each molecule, we need only give
  $\mathbf{X}(t) \equiv (X_1(t), \ldots, X_N(t))$, where $X_i(t) \equiv$ the number of $S_i$ molecules at time $t$.

But this well-stirred simplification, which . . .
• ignores the non-reactive collisions,
• drastically truncates the definition of the system’s state,
. . . comes at a price: $\mathbf{X}(t)$ must now be viewed as a stochastic process.

But in fact, the system was never deterministic to begin with. Even if molecules moved according to classical mechanics . . .
- Unimolecular reactions always involve randomness (QM).
- Bimolecular reactions usually do too.
- A system of many colliding molecules is so sensitive to initial conditions that, for all practical purposes, it evolves “randomly”.
- The system is not isolated. It’s in a heat bath, which keeps it “at temperature $T$” via essentially random interactions.

For well-stirred systems, each $R_j$ is completely characterized by …
• a propensity function $a_j(\mathbf{x})$: Given the system is in state $\mathbf{x}$,
  $a_j(\mathbf{x})\,dt \equiv$ probability that one $R_j$ event will occur in the next $dt$.
  - The existence and form of $a_j(\mathbf{x})$ follow from molecular physics.
• a state change vector $\boldsymbol{\nu}_j \equiv (\nu_{1j}, \ldots, \nu_{Nj})$:
  $\nu_{ij} \equiv$ the change in $X_i$ caused by one $R_j$ event.
  - $R_j$ induces $\mathbf{x} \to \mathbf{x} + \boldsymbol{\nu}_j$.
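  - $\{\nu_{ij}\}$ ≡ the “stoichiometric matrix.”

Examples:
• $S_1 \xrightarrow{c_j} S_2$: $\boldsymbol{\nu}_j = (-1, 1, 0, \ldots, 0)$; $a_j(\mathbf{x})\,dt = (c_j\,dt) \times x_1 \Rightarrow a_j(\mathbf{x}) = c_j x_1$.
• $S_1 + S_2 \xrightarrow{c_j} 2S_2$: same $\boldsymbol{\nu}_j$; $a_j(\mathbf{x})\,dt = (c_j\,dt) \times x_1 x_2 \Rightarrow a_j(\mathbf{x}) = c_j x_1 x_2$.

For the bimolecular case:
$$\text{Prob}\{v_{12}\text{-collision in } dt\} = \frac{(\pi r_{12}^2)(v_{12}\,dt)}{\Omega}, \qquad \text{Prob}\{R_j \mid v_{12}\text{-collision}\} \equiv p_j(v_{12}).$$
$$a_j(\mathbf{x})\,dt = \left\langle \frac{(\pi r_{12}^2)(v_{12}\,dt)}{\Omega} \times p_j(v_{12}) \right\rangle_{v_{12}} \times x_1 x_2 = \frac{\pi r_{12}^2 \,\langle v_{12}\, p_j(v_{12}) \rangle_{v_{12}}}{\Omega}\, x_1 x_2\, dt = c_j\, x_1 x_2\, dt$$
Here the averaged factor is the probability that a randomly chosen $S_1$-$S_2$ pair does an $R_j$ in the next $dt$, which is $c_j\,dt$ with $c_j = \pi r_{12}^2 \langle v_{12}\, p_j(v_{12}) \rangle_{v_{12}} / \Omega$.

$R_j$ iff “collisional K.E.” $> E_j$ ⇒
$$\langle v_{12}\, p_j(v_{12}) \rangle_{v_{12}} = \sqrt{\frac{8 k_B T}{\pi m_{12}}}\; \exp\!\left(-\frac{E_j}{k_B T}\right)$$
(the first factor is $\langle v_{12} \rangle$; the second is the Arrhenius factor).

Two exact, rigorously derivable consequences . . .

1. The chemical master equation (CME):
$$\frac{\partial P(\mathbf{x}, t \mid \mathbf{x}_0, t_0)}{\partial t} = \sum_{j=1}^{M} \left[ a_j(\mathbf{x} - \boldsymbol{\nu}_j)\, P(\mathbf{x} - \boldsymbol{\nu}_j, t \mid \mathbf{x}_0, t_0) - a_j(\mathbf{x})\, P(\mathbf{x}, t \mid \mathbf{x}_0, t_0) \right].$$
• $P(\mathbf{x}, t \mid \mathbf{x}_0, t_0) \equiv \text{Prob}\{\mathbf{X}(t) = \mathbf{x}, \text{ given } \mathbf{X}(t_0) = \mathbf{x}_0\}$ for $t \ge t_0$.
• Follows from the probability statement
$$P(\mathbf{x}, t + dt \mid \mathbf{x}_0, t_0) = P(\mathbf{x}, t \mid \mathbf{x}_0, t_0) \times \left(1 - \sum_{j=1}^{M} a_j(\mathbf{x})\,dt\right) + \sum_{j=1}^{M} P(\mathbf{x} - \boldsymbol{\nu}_j, t \mid \mathbf{x}_0, t_0) \times \left(a_j(\mathbf{x} - \boldsymbol{\nu}_j)\,dt\right).$$
• But the CME is usually too hard to solve.
• Averages: $\langle f(\mathbf{X}(t)) \rangle \equiv \sum_{\mathbf{x}} f(\mathbf{x})\, P(\mathbf{x}, t \mid \mathbf{x}_0, t_0)$.
  If we multiply the CME through by $\mathbf{x}$ and then sum over $\mathbf{x}$, we find
$$\frac{d \langle \mathbf{X}(t) \rangle}{dt} = \sum_{j=1}^{M} \boldsymbol{\nu}_j \,\langle a_j(\mathbf{X}(t)) \rangle.$$
• If there were no fluctuations, $\langle a_j(\mathbf{X}(t)) \rangle = a_j(\langle \mathbf{X}(t) \rangle) = a_j(\mathbf{X}(t))$, and the above would reduce to:
$$\frac{d \mathbf{X}(t)}{dt} = \sum_{j=1}^{M} \boldsymbol{\nu}_j \, a_j(\mathbf{X}(t)).$$
  - This is the reaction-rate equation (RRE).
  - It’s usually written in terms of the concentration $\mathbf{Z}(t) \equiv \mathbf{X}(t) / \Omega$.
But as yet, we have no justification for ignoring fluctuations.

2. The stochastic simulation algorithm (SSA):
A procedure for constructing sample paths or realizations of $\mathbf{X}(t)$.
Idea: Generate properly distributed random numbers for
- the time $\tau$ to the next reaction,
- the index $j$ of that reaction.
• $p(\tau, j \mid \mathbf{x}, t)\,d\tau \equiv$ probability, given $\mathbf{X}(t) = \mathbf{x}$, that the next reaction will occur in $[t + \tau, t + \tau + d\tau)$, and will be $R_j$.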
  $= P_0(\tau) \times a_j(\mathbf{x})\,d\tau$, where $P_0(\tau) \equiv$ Pr(no reactions in time $\tau$).
  $P_0(\tau + d\tau) = P_0(\tau) \times (1 - a_0(\mathbf{x})\,d\tau)$, where $a_0(\mathbf{x}) \equiv \sum_{j'=1}^{M} a_{j'}(\mathbf{x})$.
  Implies $\dfrac{dP_0(\tau)}{d\tau} = -a_0(\mathbf{x})\, P_0(\tau)$. Solution: $P_0(\tau) = e^{-a_0(\mathbf{x})\tau}$.
$$\therefore\; p(\tau, j \mid \mathbf{x}, t) = e^{-a_0(\mathbf{x})\tau}\, a_j(\mathbf{x}) = a_0(\mathbf{x})\, e^{-a_0(\mathbf{x})\tau} \times \frac{a_j(\mathbf{x})}{a_0(\mathbf{x})}.$$
Thus,
- $\tau$ is an exponential random variable with mean $1 / a_0(\mathbf{x})$,
- $j$ is an integer random variable with probabilities $a_j(\mathbf{x}) / a_0(\mathbf{x})$.

The “Direct” Version of the SSA
1. In state $\mathbf{x}$ at time $t$, evaluate $a_1(\mathbf{x}), \ldots, a_M(\mathbf{x})$, and $a_0(\mathbf{x}) \equiv \sum_{j'=1}^{M} a_{j'}(\mathbf{x})$.
2. Draw two unit-interval uniform random numbers $r_1$ and $r_2$, and compute $\tau$ and $j$ according to
   • $\tau = \dfrac{1}{a_0(\mathbf{x})} \ln\!\left(\dfrac{1}{r_1}\right)$,
   • $j$ = the smallest integer satisfying $\sum_{k=1}^{j} a_k(\mathbf{x}) > r_2\, a_0(\mathbf{x})$.
3. Replace $t \leftarrow t + \tau$ and $\mathbf{x} \leftarrow \mathbf{x} + \boldsymbol{\nu}_j$.
4. Record $(\mathbf{x}, t)$. Return to Step 1, or else end the simulation.

A Simple Example: $S_1 \xrightarrow{c_1} 0$.
$a_1(x_1) = c_1 x_1$, $\nu_1 = -1$. Take $X_1(0) = x_1^0$.

RRE: $\dfrac{dX_1(t)}{dt} = -c_1 X_1(t)$. Solution is $X_1(t) = x_1^0\, e^{-c_1 t}$.

CME: $\dfrac{\partial P(x_1, t \mid x_1^0, 0)}{\partial t} = c_1 \left[ (x_1 + 1)\, P(x_1 + 1, t \mid x_1^0, 0) - x_1\, P(x_1, t \mid x_1^0, 0) \right]$.
Solution:
$$P(x_1, t \mid x_1^0, 0) = \frac{x_1^0!}{x_1!\,(x_1^0 - x_1)!}\; e^{-c_1 x_1 t} \left(1 - e^{-c_1 t}\right)^{x_1^0 - x_1} \qquad (x_1 = 0, 1, \ldots, x_1^0)$$
which implies $\langle X_1(t) \rangle = x_1^0\, e^{-c_1 t}$, $\text{sdev}\{X_1(t)\} = \sqrt{x_1^0\, e^{-c_1 t} \left(1 - e^{-c_1 t}\right)}$.

SSA: Given $X_1(t) = x_1$, generate $\tau = \dfrac{1}{c_1 x_1} \ln\!\left(\dfrac{1}{r}\right)$, then update: $t \leftarrow t + \tau$, $x_1 \leftarrow x_1 - 1$.

[Figure: plot for $S_1 \to 0$ with $c_1 = 1$, $X_1(0) = 100$.]

[Diagram: $a_j(\mathbf{x})\,dt \equiv$ Prob that $R_j$ will fire in next $dt$, leading to the CME and the SSA.]

The SSA . . .
• Is exact. It does not entail approximating “$dt$” by “$\Delta t$”.
• Is procedurally simple, even when the CME is intractable.
• Comes in a variety of implementations …
  - Direct Method (Gillespie, 1976)
  - First Reaction Method (Gillespie, 1976)
  - Next Reaction Method (Gibson & Bruck, 2000)
  - First Family Method (Lok, 2003)
  - Modified Direct Method (Cao, Li & Petzold, 2004)
  - Sorting Direct Method (McCollum, et al.
2006)
• Remains too slow for most practical problems: Simulating every reaction event one at a time just takes too much time if any reactants are present in very large numbers.

We would be willing to sacrifice a little exactness . . .
. . . if that would buy us a faster simulation.

Tau-Leaping
• Approximately advances the process by a pre-selected time $\tau$, which may encompass more than one reaction event.
• Key: The definition of “the Poisson random variable with mean $a\tau$”:
  $\mathcal{P}(a\tau) \equiv$ the number of events that will occur in a time $\tau$, given that the probability of an event in any $dt$ is $a\,dt$, where $a$ can be any positive constant.
• With $\mathbf{X}(t) = \mathbf{x}$, let us choose $\tau$ small enough to satisfy the
  Leap Condition: Each $a_j(\mathbf{x}) \approx$ constant in $[t, t + \tau]$.
• Then: The number of $R_j$ firings in $[t, t + \tau] \approx \mathcal{P}(a_j(\mathbf{x})\tau)$.
$$\mathbf{X}(t + \tau) \approx \mathbf{x} + \sum_{j=1}^{M} \mathcal{P}_j\!\left(a_j(\mathbf{x})\tau\right) \boldsymbol{\nu}_j$$

Practical Considerations for Implementing Tau-Leaping
• Finding a $\tau$ that satisfies the Leap Condition.
  - Accomplished via a control parameter $\varepsilon$ (e.g., $\varepsilon = 0.04$).
  - We compute the largest $\tau$ for which the mean and standard deviation of $\Delta_\tau a_j / a_j$ are both $\le \varepsilon$ for all $j = 1, \ldots, M$.
• Avoiding negative populations (which might arise when several reactions independently consume the same reactant).
  - Accomplished via a second control parameter $n_c$ (e.g., $n_c = 10$).
  - We call any reaction that is within $n_c$ firings of exhausting any reactant a critical reaction. Then we take care not to leap over more than one firing of all the critical reactions.
• The resulting procedure segues efficiently to the SSA when all reactions become critical (or when $n_c \to \infty$).

[Figure: monomer X1, unstable dimer X2, stable dimer X3 versus time. Reactions: S1 --> 0, S1 + S1 --> S2, S2 --> S1 + S1, S2 --> S3, with c1 = 1, c2 = 0.002, c3 = 0.5, c4 = 0.04.
- Exact SSA Run.
- Initially: X1 = 100,000; X2 = X3 = 0.
- 500 reactions per plotted dot.
- 517,067 reactions total.]
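The exact SSA run whose legend appears just above can be imitated in outline. The following is a minimal Python sketch of the Direct Method applied to the same four-reaction set with the rate constants quoted in that legend. The function names `propensities` and `ssa_direct`, the propensity form $c_2 x_1 (x_1 - 1)/2$ for the $S_1 + S_1$ channel, and the choice to stop at a fixed end time are assumptions made for illustration; they are not taken from the slides.

```python
import math
import random

# State-change vectors nu_j for (X1, X2, X3), one per reaction channel.
NU = [(-1, 0, 0),   # R1: S1 -> 0
      (-2, 1, 0),   # R2: S1 + S1 -> S2
      (2, -1, 0),   # R3: S2 -> S1 + S1
      (0, -1, 1)]   # R4: S2 -> S3
C = (1.0, 0.002, 0.5, 0.04)  # c1..c4 as quoted in the figure legend


def propensities(x):
    """Propensity functions a_1..a_4 evaluated in state x."""
    x1, x2, _ = x
    return [C[0] * x1,
            C[1] * x1 * (x1 - 1) / 2.0,  # assumed form: counts distinct S1 pairs
            C[2] * x2,
            C[3] * x2]


def ssa_direct(x0, t_end, rng):
    """Direct Method SSA from state x0 at t = 0 until t_end.

    Returns (final time, final state, number of reaction events).
    """
    t, x, n_events = 0.0, list(x0), 0
    while True:
        a = propensities(x)          # Step 1: evaluate a_j(x) and a_0(x)
        a0 = sum(a)
        if a0 <= 0.0:                # nothing can fire any more
            break
        # Step 2: tau = (1/a0) ln(1/r1); 1 - random() lies in (0, 1].
        tau = math.log(1.0 / (1.0 - rng.random())) / a0
        if t + tau > t_end:          # next event would fall after t_end
            t = t_end
            break
        # Smallest j whose cumulative propensity exceeds r2 * a0.
        target = rng.random() * a0
        j, acc = 0, a[0]
        while acc <= target and j < len(a) - 1:
            j += 1
            acc += a[j]
        # Step 3: advance time and apply the state-change vector.
        t += tau
        x = [xi + d for xi, d in zip(x, NU[j])]
        n_events += 1
    return t, x, n_events
```

A call such as `ssa_direct((1000, 0, 0), 5.0, random.Random(0))` uses a much smaller initial population than the figure so that it finishes quickly; the run time grows with the number of reaction events, which is the cost that motivates tau-leaping.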
[Figure: monomer X1, unstable dimer X2, stable dimer X3 versus time. Reactions: S1 --> 0, S1 + S1 --> S2, S2 --> S1 + S1, S2 --> S3, with c1 = 1, c2 = 0.002, c3 = 0.5, c4 = 0.04.
- Explicit Tau-Leaping Run.
- $\varepsilon = 0.04$; $n_c = 10$.
- 1 leap per plotted dot.
- 905 leaps total.
- Run time speedup over SSA > 10X.]

[Diagram: $a_j(\mathbf{x})\,dt \equiv$ Prob that $R_j$ will fire in next $dt$, leading to the CME and the SSA; under the condition $\{a_j(\mathbf{x}) \approx \text{const over } \tau,\ \forall j\}$ the SSA leads to Tau-Leaping. All are Discrete & Stochastic.]

Speeding up Tau-Leaping: The Langevin Equation
• Two math facts:
  - If $m \gg 1$, then $\mathcal{P}(m) \approx \mathcal{N}(m, m)$.
  - $\mathcal{N}(m, \sigma^2) = m + \sigma\, \mathcal{N}(0, 1)$.
• So, with $\mathbf{X}(t) = \mathbf{x}$, suppose we can choose $\tau$ small enough to satisfy the Leap Condition, yet also large enough that $a_j(\mathbf{x})\tau \gg 1,\ \forall j$. Then . . .
$$\mathbf{X}(t + \tau) \approx \mathbf{x} + \sum_{j=1}^{M} \mathcal{P}_j\!\left(a_j(\mathbf{x})\tau\right) \boldsymbol{\nu}_j \approx \mathbf{x} + \sum_{j=1}^{M} \mathcal{N}_j\!\left(a_j(\mathbf{x})\tau,\, a_j(\mathbf{x})\tau\right) \boldsymbol{\nu}_j = \mathbf{x} + \sum_{j=1}^{M} \left[ a_j(\mathbf{x})\tau + \sqrt{a_j(\mathbf{x})\tau}\; \mathcal{N}_j(0, 1) \right] \boldsymbol{\nu}_j$$
$$\mathbf{X}(t + \tau) \approx \mathbf{x} + \sum_{j=1}^{M} \boldsymbol{\nu}_j\, a_j(\mathbf{x})\, \tau + \sum_{j=1}^{M} \boldsymbol{\nu}_j \sqrt{a_j(\mathbf{x})}\; \mathcal{N}_j(0, 1)\, \sqrt{\tau}.$$
• This is the Langevin leaping formula.
• It’s faster than the ordinary tau-leaping formula, because
  - $a_j(\mathbf{x})\tau \gg 1$ means lots of reaction events get leapt over in $\tau$;
  - normal random numbers can be generated faster than Poissons.
• It directly implies, and is entirely equivalent to, an SDE called the chemical Langevin equation (CLE):
$$\frac{d\mathbf{X}(t)}{dt} \approx \sum_{j=1}^{M} \boldsymbol{\nu}_j\, a_j(\mathbf{X}(t)) + \sum_{j=1}^{M} \boldsymbol{\nu}_j \sqrt{a_j(\mathbf{X}(t))}\; \Gamma_j(t).$$
  - Gaussian white noise: $\Gamma(t) \equiv \lim_{dt \to 0^+} \dfrac{\mathcal{N}(0, 1)}{\sqrt{dt}} \equiv \lim_{dt \to 0^+} \mathcal{N}\!\left(0, \dfrac{1}{dt}\right)$.
  - Satisfies $\langle \Gamma_j(t)\, \Gamma_{j'}(t') \rangle = \delta_{jj'}\, \delta(t - t')$.
• Our discrete stochastic process $\mathbf{X}(t)$ has now been approximated as a continuous stochastic process.

[Diagram: as before, with Tau-Leaping leading, under the condition $\{a_j(\mathbf{x})\tau \gg 1,\ \forall j\}$, to the CLE and CFPE (both marked *), which are Continuous & Stochastic.]
* References:
- J. Chem. Phys.
113:297 (2000)
- Am. J. Phys. 64:1246 (1996)
- J. Phys. Chem. A 106:5063 (2002)

The Thermodynamic Limit
Def: All $X_i \to \infty$, and $\Omega \to \infty$, with $X_i / \Omega$ constants.
• $a_j = c_j x_1 \sim x_1$ and $a_j = c_j x_1 x_2 \sim \Omega^{-1} x_1 x_2 \sim x_2$ ⇒ In the thermodynamic limit, all $a_j$'s grow like (system size).
• So in the thermodynamic limit, we see that in the CLE
$$\frac{d\mathbf{X}(t)}{dt} \approx \sum_{j=1}^{M} \boldsymbol{\nu}_j\, a_j(\mathbf{X}(t)) + \sum_{j=1}^{M} \boldsymbol{\nu}_j \sqrt{a_j(\mathbf{X}(t))}\; \Gamma_j(t),$$
  - the deterministic term grows like (system size),
  - the stochastic term grows like (system size)$^{1/2}$.
• ⇒ Rule of Thumb: Relative fluctuations die off as (system size)$^{-1/2}$.
• At the thermodynamic limit the stochastic term disappears, leaving
$$\frac{d\mathbf{X}(t)}{dt} = \sum_{j=1}^{M} \boldsymbol{\nu}_j\, a_j(\mathbf{X}(t))$$
  … the RRE … derived! $\mathbf{X}(t)$ has now become a continuous deterministic process.

[Diagram: as before, with the CLE and CFPE leading, under the condition $\{X_i \to \infty,\ \Omega \to \infty,\ X_i / \Omega = \text{const}_i,\ \forall i\}$, to the RRE, which is Continuous & Deterministic.]

A Spectrum of Descriptive Mathematical Modes (for well-stirred systems)

CME/SSA: exact; discrete, stochastic
  → [$a_j(\mathbf{x}) \approx$ const in $[t, t + \tau]$, $\forall j$] →
Tau-Leaping: approximate; discrete, stochastic
  → [$a_j(\mathbf{x})\tau \gg 1$, $\forall j$] →
CLE / CFPE: approximate; continuous, stochastic
  → [$X_i \to \infty$, $\Omega \to \infty$, $X_i / \Omega \to \text{const}_i$] →
RRE: approximate; continuous, deterministic
(Left to right: increasing numbers of molecules.)

Another Multi-Scale Problem
• Some reactions/species may be very fast, others very slow.
• “Fast” and “slow” are interconnected, and not easy to separate.
• Often manifests as dynamical stiffness, a known ODE problem.
• SSA still works, and is exact. But it’s agonizingly slow.
• Tau-leaping remains accurate, but the Leap Condition restricts $\tau$ to the shortest (fastest) time scale of the system. So even it’s too slow.
• One approach: Implicit Tau-Leaping
  - A stochastic adaptation of the implicit Euler method for ODEs.
• Another approach: The Slow-Scale Stochastic Simulation Algorithm
  - Skips over the fast reactions and simulates only the slow reactions, using specially modified propensity functions.
  - An adaptation of the “rapid equilibrium” / “quasi steady-state” methods for RREs.

Collaborators and Associates
Linda Petzold (UCSB)
John Doyle (Caltech)
Michael Hucka (Caltech & Beckman BNMC)
Muruhan Rathinam (UMBC)
Yang Cao (Virginia Tech)
Sotiria Lampoudi (UCSB)
Hong Li (UCSB)
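
As a closing illustration, the tau-leaping and Langevin leaping update formulas from the earlier slides can be compared on the simple example $S_1 \to 0$, whose exact mean $x_1^0 e^{-c_1 t}$ is known. The sketch below is a simplified Python version under stated assumptions: it uses a fixed $\tau$ rather than the $\varepsilon$-based selection, replaces the critical-reaction logic with a crude clamp that stops the population going negative, and draws Poisson numbers with Knuth's multiplication method, which is only suitable for modest means. The function names `poisson`, `tau_leap_decay` and `langevin_leap_decay` are hypothetical.

```python
import math
import random


def poisson(mean, rng):
    """Poisson sample via Knuth's multiplication method (modest means only)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def tau_leap_decay(x0, c1, tau, n_steps, rng):
    """Explicit tau-leaping for S1 -> 0 with a fixed leap size tau."""
    x = x0
    for _ in range(n_steps):
        k = poisson(c1 * x * tau, rng)  # number of R1 firings in this leap
        x -= min(k, x)                  # crude guard against negative populations
    return x


def langevin_leap_decay(x0, c1, tau, n_steps, rng):
    """Langevin leaping for S1 -> 0; meaningful only while c1 * x * tau >> 1."""
    x = float(x0)
    for _ in range(n_steps):
        a = c1 * x
        # nu = -1, so x <- x - a*tau - sqrt(a*tau) * N(0, 1)
        x += -a * tau - math.sqrt(a * tau) * rng.gauss(0.0, 1.0)
        x = max(x, 0.0)                 # keep the square root well defined
    return x
```

With, say, `x0 = 10000`, `c1 = 1.0`, `tau = 0.01` and `n_steps = 100`, each leap covers on the order of tens to a hundred reaction events, and both results should land near $10000\,e^{-1}$. The tau-leaping state stays an integer, while the Langevin state is a real number, which mirrors the discrete-to-continuous step in the spectrum of descriptions.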