The Annals of Applied Probability 2014, Vol. 24, No. 2, 721–759
DOI: 10.1214/13-AAP934
© Institute of Mathematical Statistics, 2014

CENTRAL LIMIT THEOREMS AND DIFFUSION APPROXIMATIONS FOR MULTISCALE MARKOV CHAIN MODELS^1

By Hye-Won Kang, Thomas G. Kurtz^2 and Lea Popovic^3
Ohio State University, University of Wisconsin and Concordia University

Ordinary differential equations obtained as limits of Markov processes appear in many settings. They may arise by scaling large systems, or by averaging rapidly fluctuating systems, or in systems involving multiple timescales, by a combination of the two. Motivated by models with multiple timescales arising in systems biology, we present a general approach to proving a central limit theorem capturing the fluctuations of the original model around the deterministic limit. The central limit theorem provides a method for deriving an appropriate diffusion (Langevin) approximation.

Received August 2012; revised March 2013.
^1 Supported in part by NSF FRG Grant DMS 05-53687.
^2 Supported in part by NSF Grant DMS 11-06424.
^3 Supported by NSERC Discovery grant.
MSC2010 subject classifications. 60F05, 92C45, 92C37, 80A30, 60F17, 60J27, 60J28, 60J60.
Key words and phrases. Reaction networks, central limit theorem, martingale methods, Markov chains, scaling limits.

1. Introduction. There are two classical kinds of Gaussian limit theorems associated with continuous time Markov chains as well as more general Markov processes. The first of these considers a sequence $\{X^N\}$ of Markov chains that converges to a deterministic function $X$ and gives a limit for the rescaled deviations $U^N = r_N(X^N - X)$; see, for example, Kurtz (1971), Kurtz (1977/78), van Kampen (1961). The second considers an ergodic Markov process $Y$ with stationary distribution $\pi$ and gives a limit for

\[ Z^N(t) = \frac{1}{\sqrt{N}} \int_0^{Nt} h(Y(s))\,ds = \sqrt{N} \int_0^t h(Y(Ns))\,ds \]

for $h$ satisfying $\int h\,d\pi = 0$; see, for example, Bhattacharya (1982) for a general result of this type.

There are many proofs for theorems like these. In particular, results of both types can be proved using the martingale central limit theorem (Theorem A.1). For example, in the first case, there is typically a sequence of functions $F^N$ such that

\[ M^N(t) = X^N(t) - X^N(0) - \int_0^t F^N(X^N(s))\,ds \]

is a martingale, $F^N \to F$, $\dot{X} = F(X)$ and $F^N(x^N) - F(x) \approx \nabla F(x)(x^N - x)$, for $x^N$ converging to $x$. If the martingale central limit theorem gives $r_N M^N \Rightarrow M$ and $U^N(0) \Rightarrow U(0)$, then (ignoring technicalities) $U^N$ should converge to the solution of

\[ U(t) = U(0) + M(t) + \int_0^t \nabla F(X(s))\,U(s)\,ds. \tag{1.1} \]

In the second case, the assumption that $\int h\,d\pi = 0$ suggests that there should be a solution of the Poisson equation $Af = -h$, where $A$ is the generator for $Y$, and then

\[ Z^N(t) = \frac{1}{\sqrt{N}} \Big( f(Y(Nt)) - f(Y(0)) - \int_0^{Nt} Af(Y(s))\,ds \Big) - \frac{1}{\sqrt{N}} \big( f(Y(Nt)) - f(Y(0)) \big). \]

The first term on the right is a martingale and the second should go to zero, so if the martingale central limit theorem applies to the first, then $Z^N$ should converge.

This paper addresses situations of the first type [$V_0^N \Rightarrow V_0$ for a deterministic $V_0$, and we want to verify convergence of $U^N = r_N(V_0^N - V_0)$] in which both approaches are required. Specifically, the function $F^N$ giving the martingale, $M^N$, depends not only on $V_0^N$ but also on another process $V_1^N$ [think $V_1^N(t) = V_1(Nt)$], so

\[ M^{N,1}(t) = V_0^N(t) - V_0^N(0) - \int_0^t F^N(V_0^N(s), V_1^N(s))\,ds \]

is a martingale, $F^N$ "averages" to $F$ in the sense that

\[ \int_0^t \big( F^N(V_0^N(s), V_1^N(s)) - F(V_0^N(s)) \big)\,ds \to 0, \]

$(V_0^N, V_1^N)$ is Markov with generator $A_N$ and there exist $H_N$ such that $A_N H_N \approx F^N - F$. [Note that $H_N$ will be a vector of functions in the domain of $A_N$, $\mathcal{D}(A_N)$.] Assuming that

\[ M^{N,2}(t) = H_N(V_0^N(t), V_1^N(t)) - H_N(V_0^N(0), V_1^N(0)) - \int_0^t A_N H_N(V_0^N(s), V_1^N(s))\,ds \]

is a martingale, and again ignoring all the technicalities, we have

\[
\begin{aligned}
r_N\big(V_0^N(t) - V_0(t)\big) ={}& r_N\big(V_0^N(0) - V_0(0)\big) + r_N M^{N,1}(t) - r_N M^{N,2}(t) \\
&+ \int_0^t r_N\big( F(V_0^N(s)) - F(V_0(s)) \big)\,ds \\
&+ r_N\big( H_N(V_0^N(t), V_1^N(t)) - H_N(V_0^N(0), V_1^N(0)) \big) \\
&+ r_N \int_0^t \big( F^N(V_0^N(s), V_1^N(s)) - F(V_0^N(s)) - A_N H_N(V_0^N(s), V_1^N(s)) \big)\,ds.
\end{aligned}
\tag{1.2}
\]

If the last two terms on the right go to zero, the martingale terms converge, $r_N M^{N,1} - r_N M^{N,2} \Rightarrow M$, and $F$ is smooth, then we again should have $U^N \Rightarrow U$ satisfying (1.1).

The work to be done to obtain theorems of this type is now clear. We need to identify $F^N$ and $F$, find an approximate solution to the Poisson equation $A_N H_N \approx F^N - F$, verify that the martingales satisfy the conditions of the martingale central limit theorem, and verify that the error terms [the last two terms in (1.2)] converge to zero.

We will make this analysis more specific in stages. We are essentially considering situations in which the process $V_1^N$ is evolving on a faster time scale than $V_0^N$ and "averages out" to give the convergence of $V_0^N$ to $V_0$. But $V_1^N$ itself may evolve on more than one time scale. In the first stage of our development, we will replace $V_1^N$ by $(V_1^N, V_2^N)$ with $V_1^N$ and $V_2^N$ evolving on different (fast) time scales. Once the analysis for two fast time scales is carried out, the extension of the general results to more than two fast time scales should be clear. In the second stage, we consider multiply scaled, continuous-time Markov chains of a type that arises naturally in models of chemical reaction networks. For these models, many of the conditions simplify, but the notation becomes more complex.

Outline: In Section 2 we state and prove the functional central limit theorem, Theorem 2.11, and specify a sequence of Conditions 2.1–2.10 that need to be verified for it to apply.
In Section 3 we additionally give a diffusion approximation implied by Theorem 2.11. Our aim is to apply these results to Markov chain models for chemical reactions. In Section 4 we identify specific aspects of the multi-scale behavior of a reaction network that one needs in order to apply Theorem 2.11 to the chemical species with a deterministic limit on the slowest time scale. Section 5 provides several examples of chemical networks (the first two evolving on two time scales, the last one on three), and shows how to verify the conditions and obtain a diffusion approximation.

2. A central limit theorem for a system with deterministic limit and three time scales. We identify a set of conditions on a three time-scale process $V^N = (V_0^N, V_1^N, V_2^N)$ that guarantee $U^N = r_N(V_0^N - V_0)$ converges to a diffusion. As suggested earlier, we write $U^N$ in the form

\[
\begin{aligned}
U^N(t) ={}& U^N(0) + r_N\big( M^{N,1}(t) - M^{N,2}(t) \big) \\
&+ r_N \int_0^t \big( \bar{F}(V_0^N(s)) - \bar{F}(V_0(s)) \big)\,ds \\
&+ r_N \int_0^t \big( F^N(V^N(s)) - F(V^N(s)) \big)\,ds \\
&+ r_N \int_0^t \big( F(V^N(s)) - \bar{F}(V_0^N(s)) - A_N H_N(V^N(s)) \big)\,ds \\
&+ r_N \big( H_N(V^N(t)) - H_N(V^N(0)) \big),
\end{aligned}
\tag{2.1}
\]

where $M^{N,1}$, $M^{N,2}$ are martingales, $V_0$ is the deterministic limit of the process $V_0^N$, $\bar{F}$ is its infinitesimal drift and $H_N$ is an approximate solution to a Poisson equation. Our conditions ensure that each individual term has a well-behaved limit.

We assume that $V_i^N$ takes values in $E_i^N \subset \mathbb{R}^{d_i}$, $i = 0, 1, 2$, and that $E_i^N$ converges in the sense that there exists $E_i \subset \mathbb{R}^{d_i}$ such that $E_i^N \subset E_i$ and for each compact $K \subset \mathbb{R}^{d_i}$,

\[ \lim_{N \to \infty} \sup_{x \in E_i \cap K} \inf_{y \in E_i^N} |x - y| = 0. \]

We will refer to $A_N$ as the "generator" for the process $V^N = (V_0^N, V_1^N, V_2^N)$, but all we require is that $A_N$ is a linear operator on some space $\mathcal{D}(A_N)$ of measurable functions on $E^N \equiv E_0^N \times E_1^N \times E_2^N$ and that for $h \in \mathcal{D}(A_N)$,

\[ h(V^N(t)) - h(V^N(0)) - \int_0^t A_N h(V^N(s))\,ds \]

is a local martingale.
We first identify the time scales of the process $V^N$ with two sequences of positive numbers $\{r_{1,N}\}$, $\{r_{2,N}\}$, and introduce a sequence of scaling parameters $\{r_N\}$ for $U^N$ with the following properties.

CONDITION 2.1 (Scaling parameters). The scaling parameters $r_N \to \infty$ and $\{r_{1,N}\}$, $\{r_{2,N}\}$ are sequences of positive numbers satisfying

\[ \lim_{N \to \infty} \frac{r_N}{r_{1,N}} = 0, \qquad \lim_{N \to \infty} \frac{r_{1,N}}{r_{2,N}} = 0. \tag{2.2} \]

We next identify the "generators" for the effective dynamics of $V_0^N$, $V_1^N$ and $V_2^N$ on time scales $t$, $t r_{1,N}$ and $t r_{2,N}$, respectively. $L_0$, $L_1$, $L_2$ will be linear operators defined on sufficiently large domains, $\mathcal{D}(L_0) \subset M(E_0)$, $\mathcal{D}(L_1) \subset M(E_0 \times E_1)$ and $\mathcal{D}(L_2) \subset M(E_0 \times E_1 \times E_2)$, and taking values in $M(E_0 \times E_1 \times E_2)$. The requirements that determine what is meant by "sufficiently large" will become clear, but we will assume that the domains contain all $C^\infty$ functions having compact support in the appropriate space. We will use the notation $E = E_0 \times E_1 \times E_2$.

CONDITION 2.2 (Multiscale convergence). For each compact $K \subset \mathbb{R}^{d_0 + d_1 + d_2}$,

\[ \lim_{N \to \infty} \sup_{v \in K \cap E^N} \big| A_N h(v) - L_0 h(v) \big| = 0, \qquad h \in \mathcal{D}(L_0), \]

\[ \lim_{N \to \infty} \sup_{v \in K \cap E^N} \Big| \frac{1}{r_{1,N}} A_N h(v) - L_1 h(v) \Big| = 0, \qquad h \in \mathcal{D}(L_1), \]

and

\[ \lim_{N \to \infty} \sup_{v \in K \cap E^N} \Big| \frac{1}{r_{2,N}} A_N h(v) - L_2 h(v) \Big| = 0, \qquad h \in \mathcal{D}(L_2). \]

REMARK 2.3. Similar conditions are considered in Ethier and Nagylaki (1980). See also Ethier and Kurtz (1986), Section 1.7. There may be only two time-scales, in which case $d_2 = 0$, $L_2 h = 0$ and $E = E_0 \times E_1$ (equivalently, $E_2$ consists of a single point) in what follows.

The next condition ensures the uniqueness of the conditional equilibrium distributions of the fast components $V_2^N$ and $V_1^N$, whose "generators" are $L_2$ and $L_1$.

CONDITION 2.4 (Averaging condition). For each $(v_0, v_1) \in E_0 \times E_1$, there exists a unique $\mu_{v_0,v_1} \in \mathcal{P}(E_2)$ such that $\int L_2 h(v_0, v_1, v_2)\,\mu_{v_0,v_1}(dv_2) = 0$ for every $h \in \mathcal{D}(L_2) \cap B(E)$.
For each $v_0 \in E_0$, there exists a unique $\mu_{v_0} \in \mathcal{P}(E_1)$ such that $\int\!\!\int L_1 h(v_0, v_1, v_2)\,\mu_{v_0,v_1}(dv_2)\,\mu_{v_0}(dv_1) = 0$ for every $h \in \mathcal{D}(L_1) \cap B(E_0 \times E_1)$.

With this condition in mind, we define

\[ \bar{L}_1 h(v_0, v_1) = \int L_1 h(v_0, v_1, v_2)\,\mu_{v_0,v_1}(dv_2). \]

Our first convergence condition ensures that the slow component $V_0^N$ has a deterministic limit. Essentially it implies that its "generator" is $L_0 h = F \cdot \nabla h$, for $h \in C_c^\infty(E_0)$. It also identifies the intrinsic fluctuations of the slow component via a martingale $M^{N,1}$. For an $\mathbb{R}^{d_0}$-valued process $Y$, we use $[Y]_t$ to denote the matrix of covariations $[Y_i, Y_j]_t$.

CONDITION 2.5 (First convergence condition). There exist $F^N \in M(E^N, \mathbb{R}^{d_0})$ and $F, G_0 \in C(E, \mathbb{R}^{d_0})$ such that

\[ M^{N,1}(t) = V_0^N(t) - V_0^N(0) - \int_0^t F^N(V^N(s))\,ds \tag{2.3} \]

is a local martingale, $[V_0^N]_t \Rightarrow 0$, and for each compact $K \subset E$,

\[ \lim_{N \to \infty} \sup_{v \in K \cap E^N} \big| r_N\big( F^N(v) - F(v) \big) - G_0(v) \big| = 0. \tag{2.4} \]

We next turn to the relevant Poisson equations based on the conditional equilibrium distributions of the fast components and the limiting drift of the slow component. Suppose that there exist $h_1 \in \mathcal{D}(L_1)^{d_0}$ and $h_2, h_3 \in \mathcal{D}(L_2)^{d_0}$ such that

\[
\begin{aligned}
\bar{L}_1 h_1(v_0, v_1) &= \int F(v_0, v_1, v_2)\,\mu_{v_0,v_1}(dv_2) - \int\!\!\int F(v_0, v_1, v_2)\,\mu_{v_0,v_1}(dv_2)\,\mu_{v_0}(dv_1), \\
L_2 h_2(v_0, v_1, v_2) &= F(v_0, v_1, v_2) - \int F(v_0, v_1, v_2)\,\mu_{v_0,v_1}(dv_2), \\
L_2 h_3(v_0, v_1, v_2) &= \bar{L}_1 h_1(v_0, v_1) - L_1 h_1(v_0, v_1, v_2).
\end{aligned}
\tag{2.5}
\]

Define

\[ F_1(v_0, v_1) = \int F(v_0, v_1, v_2)\,\mu_{v_0,v_1}(dv_2), \qquad \bar{F}(v_0) = \int\!\!\int F(v_0, v_1, v_2)\,\mu_{v_0,v_1}(dv_2)\,\mu_{v_0}(dv_1) \]

and

\[ H_N = \frac{1}{r_{1,N}} h_1 + \frac{1}{r_{2,N}} (h_2 + h_3). \tag{2.6} \]

Note that for $H_N$ of this form, $A_N H_N \approx L_1 h_1 + L_2(h_2 + h_3) = F - \bar{F}$. In what follows, $H_N$ does not have to be given by (2.6). That form simply suggests the possibility of finding $H_N$ with the desired properties. Specifically, we assume the existence of $H_N \in \mathcal{D}(A_N)$ satisfying the following convergence condition.

CONDITION 2.6 (Second convergence condition). Assume that there exists $G_1 \in C(E, \mathbb{R}^{d_0})$ such that for each compact $K \subset E$,

\[ \lim_{N \to \infty} \sup_{v \in K \cap E^N} \big| r_N\big( F(v) - \bar{F}(v_0) - A_N H_N(v) \big) - G_1(v) \big| = 0. \tag{2.7} \]

REMARK 2.7. The critical requirements for $H_N$ are (2.7), (2.10) and (2.11). In fact, because of the possibility of large fluctuations by $V_1^N$ and $V_2^N$, even if $h_1$, $h_2$ and $h_3$ satisfying (2.5) can be found, it may be necessary to define $H_N$ using a sequence of truncations of $h_1$, $h_2$ and $h_3$.

This now identifies the fluctuations of the slow component due to convergence of the fast components to their conditional equilibrium distributions via a martingale $M^{N,2}$. For $V_0(0) \in \mathbb{R}^{d_0}$, let $V_0$ satisfy

\[ V_0(t) = V_0(0) + \int_0^t \bar{F}(V_0(s))\,ds, \tag{2.8} \]

and define

\[ M^{N,2}(t) = H_N(V^N(t)) - H_N(V^N(0)) - \int_0^t A_N H_N(V^N(s))\,ds. \]

The following condition is then needed for application of the martingale central limit theorem, Theorem A.1, to the process $M^{N,1} - M^{N,2}$, composed of $M^{N,2}$ above and $M^{N,1}$ from (2.3). Essentially it says that the jumps of both the slow component and solutions to the Poisson equations are appropriately small, and that the quadratic variation of $M^{N,1} - M^{N,2}$ converges.

CONDITION 2.8 (Convergence of covariation). There exists $G \in C(E, \mathbb{M}^{d_0 \times d_0})$ such that for each $t > 0$,

\[ \lim_{N \to \infty} E\Big[ \sup_{s \le t} r_N \big| V_0^N(s) - V_0^N(s-) \big| \Big] = 0, \tag{2.9} \]

\[ \sup_{s \le t} r_N \big| H_N(V^N(s)) \big| \Rightarrow 0 \tag{2.10} \]

and

\[ (r_N)^2 \big[ V_0^N - H_N \circ V^N \big]_t - \int_0^t G(V^N(s))\,ds \Rightarrow 0. \tag{2.11} \]

We can now account for all the terms in the expansion (2.1) of $U^N = r_N(V_0^N - V_0)$,

\[
\begin{aligned}
U^N(t) ={}& U^N(0) + r_N\big( M^{N,1}(t) - M^{N,2}(t) \big) \\
&+ r_N \int_0^t \big( \bar{F}(V_0^N(s)) - \bar{F}(V_0(s)) \big)\,ds \\
&+ r_N \int_0^t \big( F^N(V^N(s)) - F(V^N(s)) \big)\,ds \\
&+ r_N \int_0^t \big( F(V^N(s)) - \bar{F}(V_0^N(s)) - A_N H_N(V^N(s)) \big)\,ds \\
&+ r_N \big( H_N(V^N(t)) - H_N(V^N(0)) \big).
\end{aligned}
\]

Conditions 2.1–2.8 ensure that all the terms will have a limit as $N \to \infty$. The limit of the second term on the right is guaranteed by (2.9), (2.10) and (2.11) in Condition 2.8. Assuming that $\bar{F}$ is smooth, the third term on the right is asymptotic to $\int_0^t \nabla \bar{F}(V_0(s)) \cdot U^N(s)\,ds$.
The fourth term is controlled by (2.4) of Condition 2.5, the fifth by (2.7) of Condition 2.6. The order of time scale parameters (2.2) from Condition 2.1 and the form of $H_N$ in (2.6) suggest that the sixth term goes to zero, but we will explicitly assume that in the statement of the theorem.

Finally, we now only need a condition to ensure relative compactness of the sequence. If $E$ is unbounded, let $\psi : E \to [1, \infty)$ be locally bounded and satisfy $\lim_{v \to \infty} \psi(v) = \infty$, or let $\psi : E \to [1, \infty)$ be such that for all $M < \infty$, $\{v \in E : \psi(v) \le M\}$ is relatively compact in $E$, and let $D_\psi$ denote the collection of continuous functions $f$ satisfying

\[ \sup_{v \in E} \frac{|f(v)|}{\psi(v)} < \infty, \qquad \lim_{k \to \infty} \sup_{v \in E, |v| > k} \frac{|f(v)|}{\psi(v)} = 0. \]

For sequences of space–time random measures, the notion of convergence that we will use is that discussed in Kurtz (1992).

LEMMA 2.9. Let $V^N$ be a sequence of $E$-valued processes, and define the occupation measure

\[ \Gamma^N(D \times [0, t]) = \int_0^t 1_D(V^N(s))\,ds. \tag{2.12} \]

Suppose that for each $t > 0$

\[ \sup_N E\Big[ \int_0^t \psi(V^N(s))\,ds \Big] < \infty. \tag{2.13} \]

Then $\{\Gamma^N\}$ is relatively compact, and if $\Gamma^N \Rightarrow \Gamma$, then for $f_1, \ldots, f_m \in D_\psi$,

\[ \Big( \int_0^\cdot f_1(V^N(s))\,ds, \ldots, \int_0^\cdot f_m(V^N(s))\,ds \Big) \Rightarrow \Big( \int_E f_1(v)\,\Gamma(dv \times [0, \cdot]), \ldots, \int_E f_m(v)\,\Gamma(dv \times [0, \cdot]) \Big) \]

in $C_{\mathbb{R}^m}[0, \infty)$.

PROOF. Relative compactness of $\{\Gamma^N\}$ follows from Lemma 1.3 of Kurtz (1992). Relative compactness in $C_{\mathbb{R}^m}[0, \infty)$ follows from relative compactness of each component. To see that for $f \in D_\psi$, the sequence $X^N = \int_0^\cdot f(V^N(s))\,ds$ is relatively compact, it is enough to approximate the sequence by sequences known to be relatively compact. For $\varepsilon > 0$, there exists a compact $K_\varepsilon \subset E$ and $C > 0$, such that $|f| \le (C 1_{K_\varepsilon} + \varepsilon)\psi$. Define $X_\varepsilon^N = \int_0^\cdot 1_{K_\varepsilon}(V^N(s)) f(V^N(s))\,ds$. Note that $X_\varepsilon^N$ is Lipschitz with Lipschitz constant $\sup_{v \in K_\varepsilon} |f(v)|$, so $\{X_\varepsilon^N\}$ is relatively compact. For $\delta > 0$,

\[ \sup_N P\Big\{ \sup_{s \le t} \big| X^N(s) - X_\varepsilon^N(s) \big| \ge \delta \Big\} \le \frac{\varepsilon}{\delta} \sup_N E\Big[ \int_0^t \psi(V^N(s))\,ds \Big], \]

and relative compactness of $\{X^N\}$ follows; see Problem 3.11.18 of Ethier and Kurtz (1986).
Assuming that $\Gamma^N \Rightarrow \Gamma$, the convergence of $\int_0^\cdot f(V^N(s))\,ds$ to $\int_{E \times [0, \cdot]} f(v)\,\Gamma(dv \times ds)$ follows by the same type of approximation.

The final condition ensures relative compactness of $V^N$.

CONDITION 2.10 (Tightness). If $E$ is unbounded, there exists a locally bounded $\psi : E \to [1, \infty)$ satisfying $\lim_{v \to \infty} \psi(v) = \infty$ such that for each $t > 0$,

\[ \sup_N E\Big[ \int_0^t \psi(V^N(s))\,ds \Big] < \infty \tag{2.14} \]

and all of the following functions are in $D_\psi$: $\sup_N |F^N|$, $\sup_N |r_N(F^N - F)|$, $\sup_N |r_N(F - \bar{F} - A_N H_N)|$, $|G|$, $\sup_N |A_N h|$ for $h \in \mathcal{D}(L_0) \cap B(E_0)$, $\sup_N |\frac{1}{r_{1,N}} A_N h|$ for $h \in \mathcal{D}(L_1) \cap B(E_0 \times E_1)$, and $\sup_N |\frac{1}{r_{2,N}} A_N h|$ for $h \in \mathcal{D}(L_2) \cap B(E)$.

Assuming the above conditions and defining

\[ \bar{G}(v_0) = \int\!\!\int G(v_0, v_1, v_2)\,\mu_{v_0,v_1}(dv_2)\,\mu_{v_0}(dv_1), \tag{2.15} \]

and similarly for $\bar{G}_0$ and $\bar{G}_1$, we have the following functional central limit theorem.

THEOREM 2.11. Under the above conditions, suppose that $\lim_{N \to \infty} U^N(0) = U(0)$, that $\bar{F}$ is continuously differentiable and that the solution (necessarily unique) of (2.8) exists for all time. Then for each $t > 0$, $\sup_{s \le t} |V_0^N(s) - V_0(s)| \Rightarrow 0$, $r_N(M^{N,1} - M^{N,2}) \Rightarrow M$, where $M$ has Gaussian, mean-zero, independent increments with

\[ E\big[ M(t) M^T(t) \big] = \int_0^t \bar{G}(V_0(s))\,ds, \tag{2.16} \]

and $U^N \Rightarrow U$ satisfying

\[ U(t) = U(0) + M(t) + \int_0^t \big( \nabla \bar{F}(V_0(s))\,U(s) + \bar{G}_0(V_0(s)) + \bar{G}_1(V_0(s)) \big)\,ds. \]

Assuming $\bar{G} = \sigma \sigma^T$, we can write

\[ U(t) = U(0) + \int_0^t \sigma(V_0(s))\,dW(s) + \int_0^t \big( \nabla \bar{F}(V_0(s))\,U(s) + \bar{G}_0(V_0(s)) + \bar{G}_1(V_0(s)) \big)\,ds. \tag{2.17} \]

REMARK 2.12. As noted above, the corresponding theorem for systems with two time-scales is obtained by assuming $E_2$ consists of a single point so $L_2 f \equiv 0$.

PROOF OF THEOREM 2.11. Let $\Gamma^N$ be the occupation measure defined as in (2.12). Then by Lemma 2.9, $\{\Gamma^N\}$ is relatively compact. Assume, for simplicity, that $\Gamma^N \Rightarrow \Gamma$. We will show that $\Gamma$ is uniquely determined.
Condition 2.5, equation (2.9) and the martingale central limit theorem, Theorem A.1, imply $M^{N,1} \Rightarrow 0$, and Lemma 2.9 then implies $V_0^N \Rightarrow V_0^\infty$, where

\[ V_0^\infty(t) = V_0(0) + \int_{E \times [0, t]} F(v)\,\Gamma(dv \times ds). \tag{2.18} \]

Condition 2.10, the definition of $L_2$, and Lemma 2.9 imply

\[ \frac{1}{r_{2,N}} \Big( h(V^N(t)) - h(V^N(0)) - \int_0^t A_N h(V^N(s))\,ds \Big) \Rightarrow -\int_{E \times [0, t]} L_2 h(v)\,\Gamma(dv \times ds) \]

for every $h \in C_c^\infty(E)$. The uniform integrability implied by (2.14) implies that the limit is a continuous martingale with sample paths of finite variation and hence is identically zero. Condition 2.4 then implies [see Example 2.3 of Kurtz (1992)] that $\Gamma$ can be written

\[ \Gamma(dv \times ds) = \mu_{v_0,v_1}(dv_2)\,\Gamma_{0,1}(dv_0 \times dv_1 \times ds). \]

A similar argument gives

\[ 0 = \int_{E \times [0, t]} L_1 h(v)\,\Gamma(dv \times ds) = \int_{E_0 \times E_1 \times [0, t]} \bar{L}_1 h(v_0, v_1)\,\Gamma_{0,1}(dv_0 \times dv_1 \times ds), \]

which implies

\[ \Gamma_{0,1}(dv_0 \times dv_1 \times ds) = \mu_{v_0}(dv_1)\,\Gamma_0(dv_0 \times ds). \]

But the convergence of $V_0^N$ to $V_0^\infty$ implies $\Gamma_0(dv_0 \times ds) = \delta_{V_0^\infty(s)}(dv_0)\,ds$. Now (2.18) can be rewritten

\[ V_0^\infty(t) = V_0(0) + \int_0^t \bar{F}(V_0^\infty(s))\,ds, \tag{2.19} \]

and it follows that $V_0^\infty = V_0$. Similarly, (2.11) now becomes

\[ (r_N)^2 \big[ V_0^N - H_N \circ V^N \big]_t \Rightarrow \int_0^t \bar{G}(V_0(s))\,ds, \]

and it follows that $r_N(M^{N,1} - M^{N,2}) \Rightarrow M$ as desired.

Finally, the uniform integrability implied by (2.14) and Condition 2.10 allows interchange of limits and integrals in the expansion of $U^N$ given in (2.1), and the convergence of $U^N$ to $U$ follows.

3. Diffusion approximation. The functional central limit theorem, Theorem 2.11, suggests approximating $V_0^N$ by $V_0 + \frac{1}{r_N} U$. In turn, that observation and (2.17) suggest approximating $V_0^N$ by a diffusion process given by the Itô equation

\[ D^N(t) = V_0^N(0) + \frac{1}{r_N} \int_0^t \sigma(D^N(s))\,dW(s) + \int_0^t \Big( \bar{F}(D^N(s)) + \frac{1}{r_N} \bar{G}_0(D^N(s)) + \frac{1}{r_N} \bar{G}_1(D^N(s)) \Big)\,ds. \tag{3.1} \]

The approximation

\[ V_0^N \approx \tilde{D}^N \equiv V_0 + \frac{1}{r_N} U \]

is, of course, justified by Theorem 2.11. Justification for the approximation $V_0^N \approx D^N$ is less clear, since $D^N$ is not produced as a limit.
Noting, however, that

\[ \tilde{D}^N(t) = V_0^N(0) + \frac{1}{r_N} \int_0^t \sigma(V_0(s))\,dW(s) + \int_0^t \Big( \bar{F}(V_0(s)) + \frac{1}{r_N} \nabla \bar{F}(V_0(s))\,U(s) + \frac{1}{r_N} \bar{G}_0(V_0(s)) + \frac{1}{r_N} \bar{G}_1(V_0(s)) \Big)\,ds, \]

assuming smoothness of $\bar{F}$, $\bar{G}_0$ and $\bar{G}_1$, we see that $r_N^2 (D^N - \tilde{D}^N)$ converges to $\tilde{U}$ satisfying

\[ \tilde{U}(t) = \int_0^t \nabla \sigma(V_0(s))\,U(s)\,dW(s) + \int_0^t \Big( \nabla \bar{F}(V_0(s))\,\tilde{U}(s) + \frac{1}{2} U^T(s)\,\partial^2 \bar{F}(V_0(s))\,U(s) + \big( \nabla \bar{G}_0(V_0(s)) + \nabla \bar{G}_1(V_0(s)) \big) U(s) \Big)\,ds, \]

and since the central limit theorem demonstrates that the fluctuations of $V^N$ are of order $O(r_N^{-1})$, we see that the difference between the two approximations $D^N$ and $\tilde{D}^N$ is negligible compared to these fluctuations.

4. Markov chain models for chemical reactions. A reaction network is a chemical system involving multiple reactions and chemical species. The kind of stochastic model for a network that we will consider treats the system as a continuous time Markov chain whose state $X$ is a vector giving the number of molecules $X_i$ of each species of type $i \in \mathcal{I}$ present. Each reaction is modeled as a possible transition for the state. The model for the $k$th reaction, for each $k \in \mathcal{K}$, is determined by a vector of inputs $\nu_k$ specifying the numbers of molecules of each chemical species that are consumed in the reaction, a vector of outputs $\nu_k'$ specifying the numbers of molecules of each species that are produced in the reaction, and a function of the state $\lambda_k'(x)$ that gives the rate at which the reaction occurs as a function of the state. Specifically, if the $k$th reaction occurs at time $t$, the change in $X$ is a vector of integer values $\zeta_k = \nu_k' - \nu_k$.

Let $R_k(t)$ denote the number of times that the $k$th reaction occurs by time $t$. Then $R_k$ is a counting process with intensity $\lambda_k'(X(t))$ (called the propensity in the chemical literature) and can be written as

\[ R_k(t) = Y_k\Big( \int_0^t \lambda_k'(X(s))\,ds \Big), \]

where the $Y_k$ are independent unit Poisson processes. The state of the system at time $t$ can be written as

\[ X(t) = X(0) + \sum_k \zeta_k R_k(t) = X(0) + \sum_k \zeta_k Y_k\Big( \int_0^t \lambda_k'(X(s))\,ds \Big). \]
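A process with the law described by this time-change representation can be simulated with the standard stochastic simulation (Gillespie) algorithm. The following is a minimal sketch under stated assumptions: the two-reaction network $A \to B$, $B \to A$, its rate constants, and the function name `simulate_ssa` are hypothetical choices made for illustration and do not come from the paper.

```python
import random

def simulate_ssa(x0, zetas, rates, t_end, seed=0):
    """Simulate X(t) = X(0) + sum_k zeta_k R_k(t) up to time t_end.

    x0    : list of initial molecule counts
    zetas : list of reaction vectors zeta_k (net change per firing)
    rates : list of functions giving the intensity of each reaction at state x
    Returns (state at t_end, list of firing counts R_k(t_end)).
    """
    rng = random.Random(seed)
    x = list(x0)
    counts = [0] * len(zetas)
    t = 0.0
    while True:
        lam = [max(rate(x), 0.0) for rate in rates]
        total = sum(lam)
        if total == 0.0:
            break  # no reaction can fire any more
        t += rng.expovariate(total)  # exponential waiting time
        if t > t_end:
            break
        # pick reaction k with probability lam[k] / total
        u = rng.random() * total
        k, acc = 0, lam[0]
        while u >= acc and k < len(lam) - 1:
            k += 1
            acc += lam[k]
        counts[k] += 1
        x = [xi + zi for xi, zi in zip(x, zetas[k])]
    return x, counts

# Hypothetical network (illustration only): A -> B and B -> A.
ZETAS = [[-1, 1], [1, -1]]
RATES = [lambda x: 1.0 * x[0], lambda x: 0.5 * x[1]]
x_final, r_final = simulate_ssa([50, 0], ZETAS, RATES, t_end=5.0)
```

In this toy network the total $X_A + X_B$ is conserved, which gives a quick sanity check that the state equals $X(0) + \sum_k \zeta_k R_k(t)$.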
In the stochastic version of the law of mass action, the rate function is proportional to the number of ways of selecting the molecules that are consumed in the reaction, that is,

\[ \lambda_k'(x) = \kappa_k' \prod_i \nu_{ik}! \binom{x_i}{\nu_{ik}} = \kappa_k' \prod_i x_i (x_i - 1) \cdots (x_i - \nu_{ik} + 1). \]

Of course, physically, $|\nu_k| = \sum_i \nu_{ik}$ is usually assumed to be less than or equal to two, but that does not play a significant role in the analysis that follows.

A reaction network may exhibit behavior on multiple scales due to the fact that some species may be present in much greater abundance than others, and the rate functions may vary over several orders of magnitude. Following Kang and Kurtz (2013), we embed the model of interest in a sequence of models indexed by a scaling parameter $N$. The model of interest corresponds to a particular value of the scaling parameter $N_0$. For each species $i \in \mathcal{I} = \{1, \ldots, s\}$, we specify a parameter $\alpha_i \ge 0$ and normalize the number of molecules by $N_0^{\alpha_i}$, defining $Z_i^{N_0}(t) = N_0^{-\alpha_i} X_i(t)$ so that it is of $O(1)$. For each reaction $k \in \mathcal{K}$, we specify another parameter $\beta_k$ and normalize the reaction rate constant as $\kappa_k' = \kappa_k N_0^{\beta_k}$ so that $\kappa_k$ is of $O(1)$. One can observe this model on different time scales as well, by replacing $t$ by $t N_0^\gamma$, for some $\gamma \in \mathbb{R}$. The model then becomes a Markov chain on $E^{N_0} = N_0^{-\alpha_1} \mathbb{Z}_+ \times \cdots \times N_0^{-\alpha_s} \mathbb{Z}_+$ which, when $N = N_0$, evolves according to

\[ Z_i^N(t) = Z_i^N(0) + \sum_k N^{-\alpha_i} \zeta_{ik} Y_k\Big( \int_0^t N^{\nu_k \cdot \alpha + \beta_k + \gamma} \lambda_k^N(Z^N(s))\,ds \Big) \]

with

\[ \lambda_k^N(z) = \kappa_k \prod_i z_i \big( z_i - N^{-\alpha_i} \big) \cdots \big( z_i - (\nu_{ik} - 1) N^{-\alpha_i} \big). \]

If for some $i$, $\alpha_i > 0$ and $\nu_{ik} > 1$, then $\lambda_k^N$ varies with $N$ but converges as $N \to \infty$. To simplify notation, we will write $\lambda_k(z)$ rather than $\lambda_k^N$, but one should check that the $N$-dependence is indeed negligible in the analysis that we do. Defining $\Lambda_N = \mathrm{diag}(N^{-\alpha_1}, \ldots, N^{-\alpha_s})$, so $Z^N = \Lambda_N X$, let

\[ A_N f(z) = \sum_k N^{\rho_k} \lambda_k(z) \big( f(z + \Lambda_N \zeta_k) - f(z) \big), \]

where $\rho_k = \nu_k \cdot \alpha + \beta_k + \gamma$.
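The relation between the unscaled mass-action rate and its scaled counterpart can be checked numerically. The sketch below is illustrative only: the function names and the numerical values of $N$, $\alpha$, $\beta$, $\kappa$ and the state are assumptions, not values from the paper. It verifies that the unscaled rate with constant $\kappa N^{\beta_k}$, evaluated at $x$, equals $N^{\rho_k} \lambda_k^N(z)$ with $z_i = N^{-\alpha_i} x_i$ and $\gamma = 0$.

```python
def propensity(x, nu, kappa):
    """Unscaled rate: kappa * prod_i x_i (x_i - 1) ... (x_i - nu_i + 1)."""
    rate = kappa
    for xi, ni in zip(x, nu):
        for j in range(ni):
            rate *= (xi - j)
    return rate

def rho(nu, alpha, beta, gamma=0.0):
    """Scaling exponent rho_k = nu_k . alpha + beta_k + gamma."""
    return sum(n * a for n, a in zip(nu, alpha)) + beta + gamma

def scaled_propensity(z, nu, kappa, alpha, N):
    """Scaled rate: kappa * prod_i z_i (z_i - N^-a_i) ... (z_i - (nu_i - 1) N^-a_i)."""
    rate = kappa
    for zi, ni, ai in zip(z, nu, alpha):
        for j in range(ni):
            rate *= (zi - j * N ** (-ai))
    return rate

# Hypothetical numbers: reaction 2 S1 + S2 -> ..., with S1 abundant (alpha = 1).
N, ALPHA, NU, BETA, KAPPA = 100, [1, 0], [2, 1], -1, 2.0
x = [300, 4]
z = [xi * N ** (-a) for xi, a in zip(x, ALPHA)]

unscaled = propensity(x, NU, KAPPA * N ** BETA)
rescaled = N ** rho(NU, ALPHA, BETA) * scaled_propensity(z, NU, KAPPA, ALPHA, N)
```

With these values `unscaled` and `rescaled` agree up to floating-point rounding.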
Since the change of time variable from $t$ to $t N^\gamma$ is equivalent to scaling the generator by a factor of $N^\gamma$, we initially take $\gamma$ to be zero. We subsequently consider the behaviour of $Z^N$ on different time-scales $Z^N(\cdot N^\gamma)$.

To be precise regarding the domain of $A_N$, note that because the jumps of $Z^N$ are uniformly bounded, if we define $\tau_r^N = \inf\{t : |Z^N(t)| \ge r\}$, then for every continuous function $f$,

\[ f\big( Z^N(t \wedge \tau_r^N) \big) - f\big( Z^N(0) \big) - \int_0^{t \wedge \tau_r^N} A_N f(Z^N(s))\,ds \]

is a martingale.

For notational simplicity, assume that the $\alpha_i$ satisfy $0 \le \alpha_1 \le \cdots \le \alpha_s$, and let $d_\circ \ge 0$ satisfy $\alpha_i = 0$, $i \le d_\circ$ and $\alpha_i > 0$, $i > d_\circ$. To apply the results of Section 2, we identify $r_N$, $r_{1,N}$, $r_{2,N}$ from the reaction network and the parameters $\{\alpha_i\}$, $\{\beta_k\}$ as follows.

Let $m_2 = \max\{\rho_k - \alpha_i : \zeta_{ik} \ne 0\}$, and define $r_{2,N} = N^{m_2}$. Then there exists a linear operator $L_2$ such that for each compact $K \subset \mathbb{R}^s$,

\[ \lim_{N \to \infty} \sup_{z \in K \cap E^N} \Big| \frac{1}{r_{2,N}} A_N h(z) - L_2 h(z) \Big| = 0, \qquad h \in \mathcal{D}(L_2) = C^1(\mathbb{R}^s). \]

Depending on the relationship between $\rho_k$ and $\alpha_i$ for $\zeta_{ik} \ne 0$ and the time-scale parameter $\gamma$, the limiting operator $L_2$ is either the generator for a Markov chain, a differential operator, or a combination of the two, which would be the generator for a piecewise deterministic Markov process (PDMP). We classify the reactions by defining

\[ \mathcal{K}_{2,\circ} = \{k \in \mathcal{K} : \rho_k = m_2\} \]

and

\[ \mathcal{K}_{2,\bullet} = \{k \in \mathcal{K} : \rho_k - \alpha_i = m_2 \text{ for some } i \text{ with } \alpha_i > 0, \zeta_{ik} \ne 0\}. \]

For each $k \in \mathcal{K}_{2,\circ} \cup \mathcal{K}_{2,\bullet}$ define

\[ \zeta_{2,k} = \lim_{N \to \infty} N^{\rho_k - m_2} \Lambda_N \zeta_k \in \mathbb{Z}^s. \]

Note that throughout the paper $\zeta_{2,k}$ will denote the limiting reaction vector, not to be confused with the single matrix entry $\zeta_{ik}$. Then, for $h \in C^1(\mathbb{R}^s)$,

\[ L_2 h(z) = \sum_{k \in \mathcal{K}_{2,\circ}} \lambda_k(z) \big( h(z + \zeta_{2,k}) - h(z) \big) + \sum_{k \in \mathcal{K}_{2,\bullet}} \lambda_k(z)\,\nabla h(z) \cdot \zeta_{2,k}. \tag{4.1} \]

Note that, although $\lambda_k(z)$ depends on all species types, the dynamics defined by $L_2$ makes changes only due to reactions $\mathcal{K}_{2,\circ} \cup \mathcal{K}_{2,\bullet}$. In other words, only the subnetwork defined by reactions $\mathcal{K}_{2,\circ} \cup \mathcal{K}_{2,\bullet}$ is relevant on the time-scale corresponding to $\gamma = -m_2$.
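The classification of reactions on the fastest time scale is purely combinatorial, so it can be sketched in a few lines. The code below is a minimal illustration, assuming hypothetical exponents and reaction vectors; the function name `fastest_scale` and the three-reaction example are not from the paper. It computes $m_2$, the sets $\mathcal{K}_{2,\circ}$ and $\mathcal{K}_{2,\bullet}$, and the limiting vectors $\zeta_{2,k}$, using the fact that entry $i$ of $N^{\rho_k - m_2} \Lambda_N \zeta_k$ survives in the limit exactly when $\rho_k - m_2 - \alpha_i = 0$.

```python
def fastest_scale(zeta, alpha, rho, tol=1e-12):
    """Return (m2, K2_circ, K2_bullet, zeta2) for the fastest time scale.

    zeta[k][i] : net change of species i in reaction k
    alpha[i]   : species scaling exponents
    rho[k]     : reaction scaling exponents
    """
    n_reac, n_spec = len(zeta), len(alpha)
    # m2 = max{rho_k - alpha_i : zeta_ik != 0}
    m2 = max(rho[k] - alpha[i]
             for k in range(n_reac) for i in range(n_spec)
             if zeta[k][i] != 0)
    k_circ = [k for k in range(n_reac) if abs(rho[k] - m2) < tol]
    k_bullet = [k for k in range(n_reac)
                if any(alpha[i] > 0 and zeta[k][i] != 0
                       and abs(rho[k] - alpha[i] - m2) < tol
                       for i in range(n_spec))]
    zeta2 = {}
    for k in k_circ + k_bullet:
        # keep entry i only when the exponent rho_k - m2 - alpha_i is zero
        zeta2[k] = [zeta[k][i] if abs(rho[k] - m2 - alpha[i]) < tol else 0
                    for i in range(n_spec)]
    return m2, k_circ, k_bullet, zeta2

# Hypothetical example: species 0 is discrete (alpha = 0), species 1 is abundant.
ALPHA = [0, 1]
ZETA = [[1, 0], [0, -1], [-1, 1]]
RHO = [2, 3, 1]
m2, k_circ, k_bullet, zeta2 = fastest_scale(ZETA, ALPHA, RHO)
```

In this example reaction 0 contributes a jump term to $L_2$, reaction 1 contributes a drift term, and reaction 2 is too slow to appear on the fastest time scale.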
If $\mathcal{K}_{2,\bullet}$ is empty, the process corresponding to $L_2$ is a Markov chain, and if $\mathcal{K}_{2,\circ}$ is empty, the process is just the solution of an ordinary differential equation. If both are nonempty, the process is piecewise deterministic in the sense of Davis (1993). The process corresponding to $L_2$ can be obtained as the solution of

\[ V_2(t) = V_2(0) + \sum_{k \in \mathcal{K}_{2,\circ}} \zeta_{2,k} Y_k\Big( \int_0^t \lambda_k(V_2(s))\,ds \Big) + \sum_{k \in \mathcal{K}_{2,\bullet}} \zeta_{2,k} \int_0^t \lambda_k(V_2(s))\,ds, \]

and assuming that $V_2$ does not hit infinity in finite time, $Z^N(\cdot N^{-m_2}) \Rightarrow V_2$.

The central limit theorem in Section 2 assumes that the state space is a product space and that the fast process "averages out" one component. The state space on which functions in the domain of $L_1$ in Condition 2.2 are defined is such that every function on it is contained in the kernel of $L_2$. In order to separate the state space in this way, we need to identify the combinations of species variables whose change on the fastest time-scale $\gamma = -m_2$ is less than $O(1)$. This can be done with a change of basis of the original state space as follows.

Let $S_K$ be a matrix whose columns are $\zeta_k$, $k \in K$, for some subset $K \subset \mathcal{K}$. Then $S_K$ is the stoichiometric matrix associated with the reaction subnetwork $K$. For the species types whose behavior is discrete, $S_K$ gives the possible jumps, while for the species whose behavior evolves continuously, $S_K$ determines the possible paths. We will let $\mathcal{R}(S_K) = \mathrm{span}\{\zeta_k, k \in K\} \subset \mathbb{R}^s$ denote the range of $S_K$, called the stoichiometric subspace of the chemical reaction subnetwork $K$, and we will let

\[ \mathcal{N}(S_K^T) = \Big\{ \theta \in \mathbb{R}^s : \sum_{i \in \mathcal{I}} \theta_i \zeta_{ik} = 0 \ \forall k \in K \Big\} \]

denote the null space of $S_K^T$, which is the orthogonal complement of $\mathcal{R}(S_K)$. For each initial value $z_0$ of the reaction system, $z_0 + \mathcal{R}(S_K)$ defines the stoichiometric compatibility class of the system. Then both stochastically and deterministically evolving components of the system must remain in the stoichiometric compatibility class for all time $t > 0$. The linear combinations of the species $\theta \cdot X$ for $\theta \in \mathcal{N}(S_K^T)$ are conserved quantities; that is, they are constant along the trajectories of the evolution of the reaction subnetwork $K$.

On the time scale $\gamma = -m_2$, the fast subnetwork determined by $L_2$ has the stoichiometric matrix $S_2$ whose columns are $\{\zeta_{2,k}, k \in \mathcal{K}_{2,\circ} \cup \mathcal{K}_{2,\bullet}\}$. Define $\mathcal{N}(S_2^T)$ as above, and note that $\theta \cdot V_2$, $\theta \in \mathcal{N}(S_2^T)$, are conserved quantities for the fast subnetwork, that is, $\theta \cdot V_2(t)$ does not depend on $t$. Let $s_2$ denote the dimension of $\mathcal{R}(S_2)$, and $s_1 = s - s_2$ be the dimension of $\mathcal{N}(S_2^T)$.

We now replace the natural state space of the process by $\mathcal{N}(S_2^T) \times \mathcal{R}(S_2)$, mapping the original processes onto this product space by the orthogonal projection $\Pi_{\mathcal{N}(S_2^T)} \times \Pi_{\mathcal{R}(S_2)}$, that is,

\[ \big( V_1^N(t), V_2^N(t) \big) = \big( \Pi_{\mathcal{N}(S_2^T)} Z^N(t), \Pi_{\mathcal{R}(S_2)} Z^N(t) \big). \]

Note that the original coordinates have different underlying state spaces $N^{-\alpha_i} \mathbb{Z}$; however, the change of basis will combine only those coordinates with the same scaling parameter $\alpha_i$. To see that this is the case, note that by the definition of $\zeta_{2,k}$, $\zeta_{2,ik} \ne 0$ and $\zeta_{2,jk} \ne 0$ implies $\alpha_i = \alpha_j$. It follows that there is a basis $\theta_1, \ldots, \theta_{s_1}$ for $\mathcal{N}(S_2^T)$ such that $\theta_{il} \ne 0$ and $\theta_{jl} \ne 0$ implies $\alpha_i = \alpha_j$, and we can take this basis to be orthonormal. We denote the common scaling parameter by $\alpha_{\theta_l}$. Let $\Theta_1$ be the matrix with rows $\theta_1^T, \ldots, \theta_{s_1}^T$ so that $\Theta_1 z = (\theta_1 \cdot z, \ldots, \theta_{s_1} \cdot z)^T$ and the orthogonal projection is given by

\[ \Pi_{\mathcal{N}(S_2^T)} = \Theta_1^T \Theta_1 = \sum_{l=1}^{s_1} \theta_l \theta_l^T. \]

On the next time scale we only need to consider the dynamics of the projection of the original process that is unaffected by the fast subnetwork, $V_1^N = \Pi_{\mathcal{N}(S_2^T)} \Lambda_N X$. Since $\Pi_{\mathcal{N}(S_2^T)} \Lambda_N = \Lambda_N \Pi_{\mathcal{N}(S_2^T)}$, we have

\[ V_1^N(t) = \Pi_{\mathcal{N}(S_2^T)} Z^N(0) + \Lambda_N \sum_k \Pi_{\mathcal{N}(S_2^T)} \zeta_k Y_k\Big( N^{\rho_k} \int_0^t \lambda_k^N(Z^N(s))\,ds \Big). \]

Note that $\Pi_{\mathcal{R}(S_2)} \zeta_k$ is not necessarily equal to $\zeta_{2,k}$, nor is the other projection $\Pi_{\mathcal{N}(S_2^T)} \zeta_k = \zeta_k - \Pi_{\mathcal{R}(S_2)} \zeta_k$ necessarily equal to $\zeta_k - \zeta_{2,k}$.
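To identify the next time scale let

\[ m_1 = \max\{\rho_k - \alpha_{\theta_l} : \theta_l \cdot \zeta_k \ne 0\} = \max\big\{ \rho_k - \alpha_i : (\Pi_{\mathcal{N}(S_2^T)} \zeta_k)_i \ne 0 \big\}, \]

and define $r_{1,N} = N^{m_1}$. Note that $m_1 < m_2$. Then there exists a linear operator $L_1$ such that for each compact $K \subset \mathbb{R}^{s_1}$,

\[ \lim_{N \to \infty} \sup_{z \in K \cap E^N} \Big| \frac{1}{r_{1,N}} A_N h(z) - L_1 h(z) \Big| = 0, \tag{4.2} \]

where $h \in \mathcal{D}(L_1)$ satisfies $h(z) = f(\theta_1 \cdot z, \ldots, \theta_{s_1} \cdot z)$ for $f \in C^1(\mathbb{R}^{s_1})$. Define

\[ \mathcal{K}_{1,\circ} = \Big\{ k \in \mathcal{K} : \rho_k = m_1, \max_l |\theta_l \cdot \zeta_k| > 0 \Big\} \]

and

\[ \mathcal{K}_{1,\bullet} = \{k \in \mathcal{K} : \rho_k - \alpha_{\theta_l} = m_1 \text{ for some } l \text{ with } \alpha_{\theta_l} > 0, \theta_l \cdot \zeta_k \ne 0\}. \]

Let $\Lambda_N^{1} = \mathrm{diag}(N^{-\alpha_{\theta_1}}, \ldots, N^{-\alpha_{\theta_{s_1}}})$, and for each $k \in \mathcal{K}_{1,\circ} \cup \mathcal{K}_{1,\bullet}$ define

\[ \zeta_{1,k}^\theta = \lim_{N \to \infty} N^{\rho_k - m_1} \Lambda_N^{1} \Theta_1 \zeta_k = \lim_{N \to \infty} N^{\rho_k - m_1} \big( N^{-\alpha_{\theta_1}} \theta_1 \cdot \zeta_k, \ldots, N^{-\alpha_{\theta_{s_1}}} \theta_{s_1} \cdot \zeta_k \big)^T. \]

Then for $h(z) = f(\Theta_1 z)$ with $f \in C^1(\mathbb{R}^{s_1})$,

\[ L_1 h(z) = \sum_{k \in \mathcal{K}_{1,\circ}} \lambda_k(z) \big( f(\Theta_1 z + \zeta_{1,k}^\theta) - f(\Theta_1 z) \big) + \sum_{k \in \mathcal{K}_{1,\bullet}} \lambda_k(z)\,\nabla f(\Theta_1 z) \cdot \zeta_{1,k}^\theta. \]

If $V_1$ denotes the process corresponding to $L_1$, then assuming that $V_1$ does not hit infinity in finite time, $V_1^N(\cdot N^{-m_1}) = \Pi_{\mathcal{N}(S_2^T)} Z^N(\cdot N^{-m_1}) \Rightarrow V_1$.

To separate the state space in terms of the next time scale (if there is one), define

\[ \zeta_{1,k} = \lim_{N \to \infty} N^{\rho_k - m_1} \Lambda_N \Pi_{\mathcal{N}(S_2^T)} \zeta_k. \]

In other words, $\zeta_{1,k} = \Theta_1^T \zeta_{1,k}^\theta$ is $\zeta_{1,k}^\theta$ embedded in the original space, and $\zeta_{1,k}^\theta = \Theta_1 \zeta_{1,k}$. On the time scale $\gamma = -m_1$, the subnetwork determined by $L_1$ has the stoichiometric matrix $S_1$ with columns $\{\zeta_{1,k}, k \in \mathcal{K}_{1,\circ} \cup \mathcal{K}_{1,\bullet}\}$. Define the subspace $\mathcal{N}(S_1^T)$ as before, and let $s_1'$ denote the dimension of $\mathcal{R}(S_1)$ and $s_0 = s_1 - s_1'$ be the dimension of $\mathcal{N}(S_1^T)$. As before we need to map the processes $V_1^N$ onto this product space by the orthogonal projection $\Pi_{\mathcal{N}(S_1^T)} \times \Pi_{\mathcal{R}(S_1)}$. Since $\zeta_{1,k} \in \mathcal{N}(S_2^T) = \mathrm{span}(\theta_1, \ldots, \theta_{s_1})$, we can assume that the $\theta_l$ are selected so that $\mathcal{R}(S_1) = \mathrm{span}(\theta_{s_0+1}, \ldots, \theta_{s_1}) = \mathrm{span}\{\zeta_{1,k} : k \in \mathcal{K}_{1,\circ} \cup \mathcal{K}_{1,\bullet}\}$. Define

\[ \Pi_0 = \sum_{l=1}^{s_0} \theta_l \theta_l^T = \Pi_{\mathcal{N}(S_1^T)}, \qquad \Pi_1 = \sum_{l=s_0+1}^{s_1} \theta_l \theta_l^T = \Pi_{\mathcal{R}(S_1)} \]

and $\Pi_2 = \Pi_{\mathcal{R}(S_2)}$. On the next time scale we need only consider the projection $\Pi_0 Z^N$ of the original process which is unaffected by either of the faster subnetworks.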
To identify the next time scale, let m0 = max{ρk − αθl : θl · ζk = 0, 1 ≤ l ≤ s0 } = max ρk − αi : (0 ζk )i = 0 , and define r0,N = N m0 . Note that if 1 ≤ s2 , 1 ≤ s1 , 1 ≤ s0 (s0 + s1 + s2 = s), m0 < m1 < m2 . Without loss of generality, we can assume that time is scaled so that 737 CLT FOR MULTI-SCALE MARKOV CHAINS m0 = 0. Then, there exists a linear operator L0 such that for each compact K ⊂ R s0 , sup AN h(z) − L0 h(z) = 0, lim N→∞ z∈K∩EN where h ∈ D(L0 ) satisfies h(z) = f (θ1 · z, . . . , θs0 · z) for f ∈ C 1 (Rs0 ). Define K0,◦ = k ∈ K : ρk = 0, max |θl · ζk | > 0 l and K0,• = {k ∈ K : ρk − αθl = 0 for some l with αθl > 0, θl · ζk = 0, 1 ≤ l ≤ s0 }. As before, let 0 be the matrix with rows θ1T , . . . , θsT0 , and let N 0 = diag(N −αθ1 , −α . . . , N θs0 ), so that 0 = N (S T ) = T0 0 and for each k ∈ K0,◦ ∪ K0,• define 1 ζkθ,0 = lim N ρk N 0 0 ζk = lim N ρk N −αθ1 θ1 · ζk , . . . , N N→∞ N→∞ For h(z) = f (0 z) with f ∈ C 1 (Rs0 ) L0 h(z) = k∈K0,◦ λk (z) f 0 z + ζkθ,0 − f (0 z) + k∈K0,• −αθs 0 θ s0 · ζ k T . λk (z)∇f (0 z) · ζkθ,0 . To relate the above calculations to the results of Section 2, we assume that K0,◦ = ∅ so that L0 h(z) = k∈K0,• λk (z)∇f (0 z) · ζkθ,0 . Let V N = T Z N ≡ (0 Z N , 1 Z N , 2 Z N ), so V N = (V0N , V1N , V2N ) ∈ N (S1T ) × R(S1 ) × R(S2 ), and note that T is invertible so that the intensities can be written as functions of v ∈ N (S1T ) × R(S1 ) × R(S2 ), that is, λk (T −1 v). Since s s 0 1 0 z = l=1 (θl · z)θl and 1 z = l=s (θ · z)θl , the process V0N = 0 Z N is 0 +1 l N the embedding of 0 Z N , and similarly (V0N , V1N ) = (0 Z N , N 1 Z ) is just the N embedding of 1 Z . Let E0 , E1 , and E2 denote the limit of the state spaces for V0N , V1N and V2N . The function F N in (2.3) is given by (4.3) F N (v) = N ρk N 0 λk T −1 v 0 ζk k and F (v) = lim F N (v) = N→∞ k∈K0,• λk T −1 v ζkθ,0 . 738 H.-W. KANG, T. G. KURTZ AND L. 
To satisfy Condition 2.4 we will assume that $L_2$ is such that for each $(v_0,v_1)\in E_0\times E_1$ there exists a unique conditional equilibrium distribution $\mu_{v_0,v_1}(dv_2)\in\mathcal P(E_2)$ for $L_2$. Then $\bar L_1h(v_0,v_1) = \int L_1h(v_0,v_1,u_2)\,\mu_{v_0,v_1}(du_2)$ is
\[
\bar L_1h(v_0,v_1) = \sum_{k\in\mathcal K_{1,\circ}}\bar\lambda_k(v_0,v_1)\big(f((v_0,v_1) + \zeta^\theta_{1,k}) - f(v_0,v_1)\big) + \sum_{k\in\mathcal K_{1,\bullet}}\bar\lambda_k(v_0,v_1)\nabla f(v_0,v_1)\cdot\zeta^\theta_{1,k}, \tag{4.4}
\]
where $\bar\lambda_k(v_0,v_1) = \int\lambda_k(T^{-1}(v_0,v_1,v_2))\,\mu_{v_0,v_1}(dv_2)$. For Condition 2.4 to be met, we also need to assume that for each $v_0\in E_0$ there exists a unique conditional equilibrium distribution $\mu_{v_0}(dv_1)\in\mathcal P(E_1)$ for $\bar L_1$. We further need to assume that there are functions $h_1\in\mathcal D(\bar L_1) : E_0\times E_1\to\mathbb R^{|E_0|}$ and $h_2,h_3\in\mathcal D(L_2) : E\to\mathbb R^{|E_0|}$ that solve the following Poisson equations:
\[
\bar L_1h_1 = \bar F^1 - \bar F,\qquad L_2h_2 = F - \bar F^1,\qquad L_2h_3 = \bar L_1h_1 - L_1h_1,
\]
where
\[
\bar F^1(v_0,v_1) = \int F(v_0,v_1,u_2)\,\mu_{v_0,v_1}(du_2),\qquad \bar F(v_0) = \int\bar F^1(v_0,u_1)\,\mu_{v_0}(du_1),
\]
in order for Condition 2.6 to be met. We refer the reader to Glynn and Meyn (1996) for sufficient conditions for the existence of solutions to a Poisson equation for a general class of Markov processes. For the class of general piecewise deterministic processes see also Costa and Dufour (2003). For the examples considered in Section 5, we were able to explicitly compute the desired functions. In general, however, explicit computation may not be possible, so results that ensure the existence of these functions may be useful.

We now need to identify $r_N$, which will be of the form $r_N = N^p$, for some $0 < p < m_1$. Assuming that there is no cancellation among the terms in the sum in (4.3), for (2.4) to hold, every term of $F^N - F$, which is of order $N^{\rho_k-\alpha_{\theta_l}}$ with $\rho_k < \alpha_{\theta_l}$, must remain bounded after multiplication by $N^p$, so we must have
\[
p \le \min\{\alpha_{\theta_l} - \rho_k : \theta_l\cdot\zeta_k\neq 0,\ \rho_k < \alpha_{\theta_l},\ 1\le l\le s_0\}. \tag{4.5}
\]
Then
\[
\theta_l\cdot G_0(v) = \lim_{N\to\infty}r_N\,\theta_l\cdot\big(F^N(v) - F(v)\big) = \sum_{k:\,\alpha_{\theta_l}-\rho_k = p}\lambda_k(T^{-1}v)\,\theta_l\cdot\zeta_k
\]
and
\[
G_0(v) = \sum_{l=1}^{s_0}\sum_{k:\,\alpha_{\theta_l}-\rho_k = p}\lambda_k(T^{-1}v)\,(\theta_l\cdot\zeta_k)\,\theta_l.
\]
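Now let $H^N = r_{1,N}^{-1}h_1 + r_{2,N}^{-1}(h_2+h_3)$. To ensure that the limit in (2.7) exists, with reference to the definition of $L_2$, we must have
\[
p \le \min\{\alpha_i + m_2 - \rho_k : \zeta_{ik}\neq 0,\ \alpha_i + m_2 - \rho_k > 0\} \tag{4.6}
\]
and
\[
p \le \min\{2\alpha_i + m_2 - \rho_k : \zeta_{ik}\neq 0,\ \alpha_i > 0\}, \tag{4.7}
\]
and with reference to the definition of $L_1$, we must have
\[
p \le \min\{\alpha_{\theta_l} + m_1 - \rho_k : \theta_l\cdot\zeta_k\neq 0,\ \alpha_{\theta_l} + m_1 - \rho_k > 0\} \tag{4.8}
\]
and
\[
p \le \min\{2\alpha_{\theta_l} + m_1 - \rho_k : \theta_l\cdot\zeta_k\neq 0,\ \alpha_{\theta_l} > 0\}. \tag{4.9}
\]
Note that (4.5) implies the minimum in (4.8) and (4.9) only needs to be taken over $s_0+1\le l\le s_1'$. Assuming that $h_1$, $h_2$ and $h_3$ are sufficiently smooth, these assumptions ensure that there exists $G_1 : E\to\mathbb R^{|E_0|}$ with
\[
G_1(v) = \lim_{N\to\infty}\Big[r_N\Big(\frac{A_N}{r_{2,N}} - L_2\Big)(h_2+h_3) + r_N\Big(\frac{A_N}{r_{1,N}} - L_1\Big)h_1\Big] = G_{12}(v) + G_{11}(v).
\]
To identify $G_{12}$, define
\[
\tilde\zeta_{2,k} = \lim_{N\to\infty}N^p\big(N^{\rho_k-m_2}\Lambda_N\zeta_k - \zeta_{2,k}\big),\qquad \xi_{2,kij} = \lim_{N\to\infty}N^{p+\rho_k-m_2-\alpha_i-\alpha_j}\zeta_{ik}\zeta_{jk}
\]
and
\[
\mathcal K^p_{2,\circ} = \{k\in\mathcal K : \theta_l\cdot\zeta_k\neq 0 \text{ for some } l \text{ with } \alpha_{\theta_l} = 0,\ m_2 - \rho_k = p\},
\]
\[
\mathcal K^p_{2,\bullet} = \{k\in\mathcal K\setminus\mathcal K_{2,\bullet} : \theta_l\cdot\zeta_k\neq 0 \text{ for some } l \text{ with } \alpha_{\theta_l} > 0,\ m_2 - \rho_k + \alpha_{\theta_l} = p\}.
\]
Then setting $h(z) = h_2(Tz) + h_3(Tz)$, $G_{12}(v) = H_{12}(T^{-1}v)$, where
\[
\begin{aligned}
H_{12}(z) ={}& \sum_{k\in\mathcal K_{2,\circ}}\lambda_k(z)\nabla h(z+\zeta_{2,k})\cdot\tilde\zeta_{2,k} + \sum_{k\in\mathcal K_{2,\bullet}\cup\mathcal K^p_{2,\bullet}}\lambda_k(z)\nabla h(z)\cdot\tilde\zeta_{2,k}\\
&+ \sum_{k\in\mathcal K_{2,\bullet}}\lambda_k(z)\frac12\sum_{ij}\partial_{z_i}\partial_{z_j}h(z)\,\xi_{2,kij} + \sum_{k\in\mathcal K^p_{2,\circ}}\lambda_k(z)\big(h(z+\tilde\zeta_{2,k}) - h(z)\big).
\end{aligned}
\]
Similarly, to identify $G_{11}$, define
\[
\tilde\zeta^\theta_{1,k} = \lim_{N\to\infty}N^p\big(N^{\rho_k-m_1}\Lambda_N^{\Theta_1}\Theta_1\zeta_k - \zeta^\theta_{1,k}\big),\qquad \xi^\theta_{1,kll'} = \lim_{N\to\infty}N^{p+\rho_k-m_1-\alpha_{\theta_l}-\alpha_{\theta_{l'}}}\zeta^\theta_{1,kl}\zeta^\theta_{1,kl'}
\]
and
\[
\mathcal K^p_{1,\circ} = \{k\in\mathcal K : \theta_l\cdot\zeta_k\neq 0 \text{ for some } l \text{ with } \alpha_{\theta_l} = 0,\ m_1 - \rho_k = p\},
\]
\[
\mathcal K^p_{1,\bullet} = \{k\in\mathcal K\setminus\mathcal K_{1,\bullet} : \theta_l\cdot\zeta_k\neq 0 \text{ for some } l \text{ with } \alpha_{\theta_l} > 0,\ m_1 - \rho_k + \alpha_{\theta_l} = p\}.
\]
Then $G_{11}(v) = H_{11}(T^{-1}v)$, where
\[
\begin{aligned}
H_{11}(z) ={}& \sum_{k\in\mathcal K_{1,\circ}}\lambda_k(z)\nabla h_1(\Theta_1z+\zeta^\theta_{1,k})\cdot\tilde\zeta^\theta_{1,k} + \sum_{k\in\mathcal K_{1,\bullet}\cup\mathcal K^p_{1,\bullet}}\lambda_k(z)\nabla h_1(\Theta_1z)\cdot\tilde\zeta^\theta_{1,k}\\
&+ \sum_{k\in\mathcal K_{1,\bullet}}\lambda_k(z)\frac12\sum_{l,l'}\partial_l\partial_{l'}h_1(\Theta_1z)\,\xi^\theta_{1,kll'} + \sum_{k\in\mathcal K^p_{1,\circ}}\lambda_k(z)\big(h_1(\Theta_1z+\tilde\zeta^\theta_{1,k}) - h_1(\Theta_1z)\big).
\end{aligned}
\]
We now need to identify $G : E\to\mathbb M^{|E_0|\times|E_0|}$ satisfying (2.11) in Condition 2.8. Let
\[
R_k^N(t) = Y_k\Big(\int_0^tN^{\rho_k}\lambda_k(Z^N(s))\,ds\Big)
\]
and $\tilde H^N(V^N) = \Pi_0Z^N - H^N(V^N) = V_0^N - H^N(V^N)$.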
Then denoting $z^{\otimes2} = zz^T$, the quadratic variation of $\tilde H^N(V^N) = V_0^N - H^N(V^N)$ satisfies
\[
N^{2p}[\tilde H^N(V^N)]_t = N^{2p}\sum_k\int_0^t\big(\tilde H^N(V^N(s-) + T\Lambda_N\zeta_k) - \tilde H^N(V^N(s-))\big)^{\otimes2}\,dR_k^N(s),
\]
which is asymptotic to
\[
\sum_kN^{2p+\rho_k}\int_0^t\big(\tilde H^N(V^N(s-) + T\Lambda_N\zeta_k) - \tilde H^N(V^N(s-))\big)^{\otimes2}\lambda_k(Z^N(s))\,ds.
\]
Taking the limit as $N\to\infty$ and integrating with respect to $\mu_{v_0,v_1}(dv_2)$ and $\mu_{v_0}(dv_1)$ then gives the value of $\bar G$.

5. Examples. We now apply the central limit theorem to several examples of chemical reaction networks with multiple scales.

5.1. Three species viral model. Ball et al. (2006) considered asymptotics for a model of an intracellular viral infection originally given in Srivastava et al. (2002) and studied further in Haseltine and Rawlings (2002). The model includes three time-varying species, the viral template, the viral genome and the viral structural protein, involved in six reactions,
\[
\begin{array}{llll}
(1) & T + \text{stuff}\xrightarrow{\kappa_1'}T + G, & (2) & G\xrightarrow{\kappa_2'}T,\\
(3) & T + \text{stuff}\xrightarrow{\kappa_3'}T + S, & (4) & T\xrightarrow{\kappa_4'}\emptyset,\\
(5) & S\xrightarrow{\kappa_5'}\emptyset, & (6) & G + S\xrightarrow{\kappa_6'}V,
\end{array}
\]
whose reaction rates (propensities) are of mass-action kinetics form $\lambda_k(x) = \kappa_k'\prod_ix_i^{\nu_{ik}}$ with constants
\[
\begin{array}{lll}
\kappa_1' & 1 & 1,\\
\kappa_2' & 0.025 & 2.5N_0^{-2/3},\\
\kappa_3' & 1000 & N_0,\\
\kappa_4' & 0.25 & 0.25,\\
\kappa_5' & 2 & 2,\\
\kappa_6' & 7.5\times10^{-6} & 0.75N_0^{-5/3},
\end{array}
\]
here expressed in terms of $N_0 = 1000$. We denote $T$, $G$, $S$ as species 1, 2 and 3, respectively, and let $X_i(t)$ denote the number of molecules of species $i$ in the system at time $t$. The stochastic model is
\[
\begin{aligned}
X_1(t) &= X_1(0) + Y_2\Big(\int_0^t0.025X_2(s)\,ds\Big) - Y_4\Big(\int_0^t0.25X_1(s)\,ds\Big),\\
X_2(t) &= X_2(0) + Y_1\Big(\int_0^tX_1(s)\,ds\Big) - Y_2\Big(\int_0^t0.025X_2(s)\,ds\Big) - Y_6\Big(\int_0^t7.5\cdot10^{-6}X_2(s)X_3(s)\,ds\Big),\\
X_3(t) &= X_3(0) + Y_3\Big(\int_0^t1000X_1(s)\,ds\Big) - Y_5\Big(\int_0^t2X_3(s)\,ds\Big) - Y_6\Big(\int_0^t7.5\cdot10^{-6}X_2(s)X_3(s)\,ds\Big).
\end{aligned}
\]
We take $\alpha_1 = 0$, $\alpha_2 = 2/3$, $\alpha_3 = 1$. The scaling of the rate constants gives
\[
\begin{array}{cccc}
k & \kappa_k & \beta_k & \rho_k\\
1 & 1 & 0 & 0\\
2 & 2.5 & -2/3 & 0\\
3 & 1 & 1 & 1\\
4 & 0.25 & 0 & 0\\
5 & 2 & 0 & 1\\
6 & 0.75 & -5/3 & 0.
\end{array}
\]
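The two tables above can be cross-checked with a few lines of code. This is a small sketch under one stated assumption: that $\rho_k$ is obtained as $\beta_k$ plus the sum of the abundance exponents $\alpha_i$ of the reactant species, which is the usual convention for this scaling and reproduces the tabulated values. The dictionary layout and the helper names `unscaled_constant` and `rho` are illustrative only.

```python
from fractions import Fraction as Fr

N0 = 1000
# Abundance exponents for template T, genome G and structural protein S.
alpha = {"T": Fr(0), "G": Fr(2, 3), "S": Fr(1)}

# k: (scaled constant kappa_k, exponent beta_k, reactant species)
reactions = {
    1: (1.0, Fr(0), ["T"]),
    2: (2.5, Fr(-2, 3), ["G"]),
    3: (1.0, Fr(1), ["T"]),
    4: (0.25, Fr(0), ["T"]),
    5: (2.0, Fr(0), ["S"]),
    6: (0.75, Fr(-5, 3), ["G", "S"]),
}


def unscaled_constant(k):
    """kappa'_k = kappa_k * N0**beta_k, the constant used in the unscaled model."""
    kappa, beta, _ = reactions[k]
    return kappa * N0 ** float(beta)


def rho(k):
    """Assumed convention: rho_k = beta_k + sum of alpha_i over the reactants."""
    _, beta, reactants = reactions[k]
    return beta + sum(alpha[s] for s in reactants)


for k in sorted(reactions):
    print(k, unscaled_constant(k), rho(k))
```

Up to floating-point rounding, the unscaled constants agree with the first table (for example $2.5N_0^{-2/3} = 0.025$ and $0.75N_0^{-5/3} = 7.5\times10^{-6}$), and the exponents $\rho_k$ agree with the last column of the second table.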
Changing time $t\to N^{2/3}t$, the normalized system becomes
\[
\begin{aligned}
Z_1^N(t) &= Z_1^N(0) + Y_2\Big(\int_0^tN^{2/3}2.5Z_2^N(s)\,ds\Big) - Y_4\Big(\int_0^tN^{2/3}0.25Z_1^N(s)\,ds\Big),\\
Z_2^N(t) &= Z_2^N(0) + N^{-2/3}Y_1\Big(\int_0^tN^{2/3}Z_1^N(s)\,ds\Big) - N^{-2/3}Y_2\Big(\int_0^tN^{2/3}2.5Z_2^N(s)\,ds\Big)\\
&\quad - N^{-2/3}Y_6\Big(\int_0^tN^{2/3}0.75Z_2^N(s)Z_3^N(s)\,ds\Big),\\
Z_3^N(t) &= Z_3^N(0) + N^{-1}Y_3\Big(\int_0^tN^{5/3}Z_1^N(s)\,ds\Big) - N^{-1}Y_5\Big(\int_0^tN^{5/3}2Z_3^N(s)\,ds\Big)\\
&\quad - N^{-1}Y_6\Big(\int_0^tN^{2/3}0.75Z_2^N(s)Z_3^N(s)\,ds\Big).
\end{aligned}
\]
We assume that the initial value for $Z_2^N$ is chosen to satisfy $Z_2(0) = \lim_{N\to\infty}Z_2^N(0)\in(0,\infty)$. In this model, there are only two time-scales, so we set
\[
m_1 = \max\{\rho_k - \alpha_i : \zeta_{ik}\neq 0\} = \max\{\tfrac23 - 0,\ \tfrac23 - \tfrac23,\ \tfrac53 - 1,\ \tfrac23 - 1\} = \tfrac23,
\]
and we have $r_{1,N} = N^{2/3}$. We have $\zeta_{1,1} = 0$, $\zeta_{1,2} = e_1$, $\zeta_{1,3} = e_3$, $\zeta_{1,4} = -e_1$, $\zeta_{1,5} = -e_3$, $\zeta_{1,6} = 0$. The operator $L_1 = \lim_{N\to\infty}N^{-2/3}A_N$ is given by
\[
L_1h(z) = \lambda_2(z)\big(h(z+e_1) - h(z)\big) + \lambda_4(z)\big(h(z-e_1) - h(z)\big) + \big(\lambda_3(z) - \lambda_5(z)\big)\partial_{z_3}h(z)
\]
and note that for smooth $h$,
\[
N^{-2/3}A_Nh = L_1h + O(N^{-2/3}). \tag{5.1}
\]
Functions $h\in\ker(L_1)$ are functions of the coordinate $z_2$ only, $E_1 = \mathcal R(S_1) = \mathrm{span}\{e_1,e_3\}$ and $E_0 = \mathcal N(S_1^T) = \mathrm{span}\{e_2\}$. Taking $h\in\mathcal D(L_0) = C^1(E_0)$,
\[
L_0h(z) = \lim_{N\to\infty}A_Nh(z) = \big(\lambda_1(z) - \lambda_2(z) - \lambda_6(z)\big)\partial_{z_2}h(z_2).
\]
Setting $V_0^N = Z_2^N$ and $V_1^N = (Z_1^N,Z_3^N)$, the compensator for $V_0^N$ is
\[
F^N(z) = \lambda_1(z) - \lambda_2(z) - \lambda_6(z),
\]
so $F(z) = F^N(z)$ and $G_0(z)\equiv 0$ in Condition 2.5. The process corresponding to $L_1$ is piecewise deterministic with $Z_1$ discrete and $Z_3$ continuous. For fixed $z_2$, with reference to Condition 2.4, the conditional equilibrium distribution satisfies
\[
\int\Big[2.5z_2\big(g(z_1+1,z_3) - g(z_1,z_3)\big) + 0.25z_1\big(g(z_1-1,z_3) - g(z_1,z_3)\big) + (z_1 - 2z_3)\frac{\partial g}{\partial z_3}(z_1,z_3)\Big]\mu_{z_2}(dz_1,dz_3) = 0. \tag{5.2}
\]
Note that the marginal for $Z_1$ is Poisson($10z_2$), so $\int z_1\,\mu_{z_2}(dz_1,dz_3) = 10z_2$. Taking $g(z_1,z_3) = z_3$ in (5.2), we see $\int z_3\,\mu_{z_2}(dz_1,dz_3) = 5z_2$. These calculations imply that the averaged value for the drift $F$ is
\[
\bar F(z_2) = \int\big(\lambda_1(z) - \lambda_2(z) - \lambda_6(z)\big)\mu_{z_2}(dz_1,dz_3) = 7.5z_2 - 3.75z_2^2,
\]
with $\nabla\bar F(z_2) = 7.5 - 7.5z_2$.
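The averaging step can be reproduced numerically. The sketch below, with hypothetical function names, computes the stationary mean of the fast birth-death component $Z_1$ (birth rate $2.5z_2$, death rate $0.25z_1$) from the detailed-balance recursion on a truncated state space, obtains the mean of $Z_3$ from (5.2) with $g(z_1,z_3) = z_3$, and then averages the drift $\lambda_1 - \lambda_2 - \lambda_6 = z_1 - 2.5z_2 - 0.75z_2z_3$. The truncation level `n_max` is an assumption chosen large enough for moderate $z_2$.

```python
def z1_stationary_mean(z2, n_max=400):
    """Mean of the stationary law of Z1 for fixed z2, by detailed balance."""
    birth = 2.5 * z2
    weights = [1.0]
    for n in range(n_max):
        # pi(n + 1) * 0.25 * (n + 1) = pi(n) * birth
        weights.append(weights[-1] * birth / (0.25 * (n + 1)))
    total = sum(weights)
    return sum(n * w for n, w in enumerate(weights)) / total


def z3_stationary_mean(z2):
    """From (5.2) with g = z3: E[z1] - 2 E[z3] = 0."""
    return z1_stationary_mean(z2) / 2.0


def averaged_drift(z2):
    """Average of z1 - 2.5 z2 - 0.75 z2 z3 over the conditional equilibrium."""
    return z1_stationary_mean(z2) - 2.5 * z2 - 0.75 * z2 * z3_stationary_mean(z2)


for z2 in (0.5, 1.0, 2.0):
    print(z2, averaged_drift(z2), 7.5 * z2 - 3.75 * z2 ** 2)
```

The two printed columns agree to rounding error, consistent with the Poisson($10z_2$) marginal and the closed form $7.5z_2 - 3.75z_2^2$.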
For the current example, we will see that $\bar F$ and $\bar G$ in (2.15) can be obtained without explicitly computing with $\mu_{z_2}$. With reference to (2.5), we look for a solution $h_1$ to the Poisson equation
\[
L_1h_1(z) = (z_1 - 2.5z_2 - 0.75z_2z_3) - (7.5z_2 - 3.75z_2^2). \tag{5.3}
\]
Trying $h_1$ of the form $h_1(z) = z_1u_1(z_2) + z_3u_3(z_2)$, we have
\[
L_1h_1(z) = u_1(z_2)(2.5z_2 - 0.25z_1) + u_3(z_2)(z_1 - 2z_3)
\]
and equating the factors multiplying $z_1$ and $z_3$, we get
\[
u_1(z_2) = 1.5z_2 - 4\quad\text{and}\quad u_3(z_2) = 0.375z_2.
\]
Thus
\[
h_1(z) = z_1(1.5z_2 - 4) + z_3(0.375z_2)
\]
and $H^N(z) = N^{-2/3}h_1(z)$. Since the solution of (5.3) is exact and (as we shall see) $r_N = N^{1/3}$, by (5.1), we have $G_1 = 0$ in Condition 2.6. With reference to Condition 2.8, (2.9) and (2.10) are immediate. The only restriction that remains to determine $r_N$ is the asymptotic behavior of the quadratic variation of $Z_2^N - H^N(Z^N) = Z_2^N - N^{-2/3}h_1(Z^N)$. Direct calculation shows that to get a nontrivial $G$ in (2.11) we must take $r_N = N^{1/3}$. We then have
\[
\begin{aligned}
N^{2/3}[Z_2^N - H^N(Z^N)]_t &= \sum_{k=1}^6N^{-2/3}\int_0^t\big(\zeta_{2k} + h_1(Z^N(s-)) - h_1(Z^N(s-) + \Lambda_N\zeta_k)\big)^2\,dR_k^N(s)\\
&\approx \int_0^tZ_1^N(s)\,ds + \int_0^t\big(-1 - 1.5Z_2^N(s) + 4\big)^22.5Z_2^N(s)\,ds\\
&\quad + \int_0^t\big(1.5Z_2^N(s) - 4\big)^20.25Z_1^N(s)\,ds + \int_0^t0.75Z_2^N(s)Z_3^N(s)\,ds,
\end{aligned}
\]
where we observe that jumps by $R_3^N$ and $R_5^N$ do not contribute to the limit. Dividing the equation for $Z_1^N$ by $N^{2/3}$, we observe that
\[
\int_0^tZ_1^N(s)\,ds \approx \int_0^t10Z_2^N(s)\,ds.
\]
Similarly, dividing the equation for $Z_3^N$ by $N^{2/3}$ we see that
\[
\int_0^tZ_3^N(s)\,ds \approx \frac12\int_0^tZ_1^N(s)\,ds \approx \int_0^t5Z_2^N(s)\,ds,
\]
which in turn implies
\[
\int_0^tZ_2^N(s)Z_3^N(s)\,ds \approx \int_0^t5Z_2^N(s)^2\,ds.
\]
It follows that $\bar G(z_2)$ is
\[
10z_2 + (3 - 1.5z_2)^22.5z_2 + (4 - 1.5z_2)^22.5z_2 + 3.75z_2^2 = 72.5z_2 - 48.75z_2^2 + 11.25z_2^3.
\]
Let $Z_2$ be the solution of
\[
Z_2(t) = Z_2(0) + \int_0^t\big(7.5Z_2(s) - 3.75Z_2(s)^2\big)\,ds
\]
and $U^N = N^{1/3}(Z_2^N - Z_2)$.
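The algebra in this derivation is easy to confirm numerically. The following sketch, with illustrative function names, evaluates $L_1h_1$ for the trial solution and compares it with the right-hand side of (5.3), then checks that the four-term expression for the averaged quadratic variation expands to the stated cubic. The derivative in $z_3$ is entered by hand, since $h_1$ is linear in $z_3$.

```python
def h1(z1, z2, z3):
    """Trial solution h1(z) = z1 (1.5 z2 - 4) + z3 (0.375 z2)."""
    return z1 * (1.5 * z2 - 4.0) + z3 * (0.375 * z2)


def L1_h1(z1, z2, z3):
    """Fast generator applied to h1: jumps of z1 plus drift of z3."""
    up = 2.5 * z2 * (h1(z1 + 1, z2, z3) - h1(z1, z2, z3))
    down = 0.25 * z1 * (h1(z1 - 1, z2, z3) - h1(z1, z2, z3))
    dh_dz3 = 0.375 * z2  # exact, h1 is linear in z3
    return up + down + (z1 - 2.0 * z3) * dh_dz3


def centered_drift(z1, z2, z3):
    """Right-hand side of (5.3): drift minus its average."""
    return (z1 - 2.5 * z2 - 0.75 * z2 * z3) - (7.5 * z2 - 3.75 * z2 ** 2)


def g_bar_terms(z2):
    """Sum of the four averaged contributions to the quadratic variation."""
    return (10 * z2 + (3 - 1.5 * z2) ** 2 * 2.5 * z2
            + (4 - 1.5 * z2) ** 2 * 2.5 * z2 + 3.75 * z2 ** 2)


def g_bar_cubic(z2):
    return 72.5 * z2 - 48.75 * z2 ** 2 + 11.25 * z2 ** 3


worst = max(abs(L1_h1(z1, z2, z3) - centered_drift(z1, z2, z3))
            for z1 in range(0, 30, 3)
            for z2 in (0.1, 0.7, 1.5)
            for z3 in (0.0, 2.5, 8.0))
print(worst, g_bar_terms(1.2) - g_bar_cubic(1.2))
```

Both printed values are zero up to rounding, which confirms that $h_1$ solves (5.3) exactly and that the polynomial expansion is correct.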
Then
\[
\sup_{s\le t}|Z_2^N(s) - Z_2(s)| \Rightarrow 0\quad\text{and}\quad U^N\Rightarrow U,
\]
where, for $W$ a standard Brownian motion, $U$ satisfies
\[
U(t) = U(0) + \int_0^t\sqrt{72.5Z_2(s) - 48.75Z_2(s)^2 + 11.25Z_2(s)^3}\,dW(s) + \int_0^t\big(7.5 - 7.5Z_2(s)\big)U(s)\,ds.
\]
The corresponding diffusion approximation is
\[
D^N(t) = Z_2^N(0) + N^{-1/3}\int_0^t\sqrt{72.5D^N(s) - 48.75D^N(s)^2 + 11.25D^N(s)^3}\,dW(s) + \int_0^t\big(7.5D^N(s) - 3.75D^N(s)^2\big)\,ds.
\]
We compare simulations for the original value of the amount of genome $X_2(\cdot)$ with the approximations given by the Gaussian approximation $N^{2/3}Z_2(\cdot N^{-2/3}) + N^{1/3}U(\cdot N^{-2/3})$, and the diffusion approximation $N^{2/3}D^N(\cdot N^{-2/3})$. For comparison we also give the deterministic value given by $N^{2/3}Z_2(\cdot N^{-2/3})$. We use $N = 1000$ and a time interval on the scale $\gamma = 2/3$. The initial values are set to $X_1(0) = X_3(0) = 0$, $X_2(0) = 10$ and 500 realizations are performed for each of the three stochastic processes. Figure 1 shows the mean and one standard deviation above and below the mean for each of the three processes, and Figure 2 shows five trajectories for the three processes. For the diffusion process, these plots use only sample paths that hit one ($= 100/N_0^{2/3}$) before they hit zero. For small initial values, the diffusion approximation does not give a good approximation of the probability of hitting zero (and hence absorbing at zero), before (e.g.) hitting one.

Fig. 1. Mean and standard deviation of the amount of genome in the three species model [500 simulations with parameters $N_0 = 1000$, $\gamma = 2/3$, $X_1(0) = 0$, $X_2(0) = 10$, $X_3(0) = 0$].

Fig. 2. Five trajectories of the amount of genome in the three species model (same parameters as in Figure 1).

Let
\[
\tau_Z^N = \inf\{t > 0 : Z_2^N(t) = 0 \text{ or } Z_2^N(t)\ge 1\}
\]
and
\[
\tau_D^N = \inf\{t > 0 : D^N(t) = 0 \text{ or } D^N(t)\ge 1\}.
\]
It is shown in Ball et al.
(2006) that
\[
\lim_{N\to\infty}P\{Z_2^N(\tau_Z^N) = 0\,|\,Z_2^N(0) = N^{-2/3}k\} = 4^{-k}
\]
while a standard calculation for the diffusion process gives
\[
\lim_{N\to\infty}P\{D^N(\tau_D^N) = 0\,|\,D^N(0) = N^{-2/3}k\} = e^{-(6/29)k}.
\]

5.2. Michaelis–Menten enzyme model. A basic model for an enzymatic reaction includes three time-varying species, the substrate, the free enzyme and the substrate-bound enzyme, involved in three reactions:
\[
\begin{array}{ll}
(1) & S + E\xrightarrow{\kappa_1'}SE,\\
(2) & SE\xrightarrow{\kappa_2'}S + E,\\
(3) & SE\xrightarrow{\kappa_3'}P + E,
\end{array}
\]
with mass-action kinetics and with rate constants such that $\kappa_2',\kappa_3'\gg\kappa_1'$. To be precise, let $\kappa_2' = \kappa_2N$, $\kappa_3' = \kappa_3N$, and $\kappa_1' = \kappa_1$. We denote $E$, $S$, $P$ as species 1, 2 and 3, respectively, and let $X_i(t)$ be the number of molecules of species $i$ in the system at time $t$. Note that the total number of unbound and substrate-bound enzyme molecules is conserved, and we let $M$ denote this amount. The stochastic model is
\[
\begin{aligned}
X_1(t) &= X_1(0) - Y_1\Big(\int_0^t\kappa_1'X_1(s)X_2(s)\,ds\Big) + Y_2\Big(\int_0^t\kappa_2'\big(M - X_1(s)\big)\,ds\Big) + Y_3\Big(\int_0^t\kappa_3'\big(M - X_1(s)\big)\,ds\Big),\\
X_2(t) &= X_2(0) - Y_1\Big(\int_0^t\kappa_1'X_1(s)X_2(s)\,ds\Big) + Y_2\Big(\int_0^t\kappa_2'\big(M - X_1(s)\big)\,ds\Big),\\
X_3(t) &= X_3(0) + Y_3\Big(\int_0^t\kappa_3'\big(M - X_1(s)\big)\,ds\Big).
\end{aligned}
\]
If the initial amount of substrate is $O(N)\gg M$, then the normalizations of the species abundances are given by $\alpha_1 = 0$, $\alpha_2 = 1$, $\alpha_3 = 1$, and the scaling exponents for the rate constants are $\beta_1 = 0$, $\beta_2 = 1$, $\beta_3 = 1$. The normalized system becomes
\[
\begin{aligned}
Z_1^N(t) &= Z_1^N(0) - Y_1\Big(\int_0^tN\kappa_1Z_1^N(s)Z_2^N(s)\,ds\Big) + Y_2\Big(\int_0^tN\kappa_2\big(M - Z_1^N(s)\big)\,ds\Big) + Y_3\Big(\int_0^tN\kappa_3\big(M - Z_1^N(s)\big)\,ds\Big),\\
Z_2^N(t) &= Z_2^N(0) - N^{-1}Y_1\Big(\int_0^tN\kappa_1Z_1^N(s)Z_2^N(s)\,ds\Big) + N^{-1}Y_2\Big(\int_0^tN\kappa_2\big(M - Z_1^N(s)\big)\,ds\Big),\\
Z_3^N(t) &= Z_3^N(0) + N^{-1}Y_3\Big(\int_0^tN\kappa_3\big(M - Z_1^N(s)\big)\,ds\Big).
\end{aligned}
\]
Again, there are only two time-scales with the fast time-scale $m_1 = 1$ giving $r_{1,N} = N$. Then $\zeta_{1,1} = -e_1$, $\zeta_{1,2} = \zeta_{1,3} = e_1$, and the operator $L_1$ is given by
\[
L_1h(z) = \kappa_1z_1z_2\big(h(z-e_1) - h(z)\big) + (\kappa_2+\kappa_3)(M - z_1)\big(h(z+e_1) - h(z)\big),
\]
and for smooth $h$,
\[
N^{-1}A_Nh = L_1h + O(N^{-1}). \tag{5.4}
\]
Functions $h\in\ker(L_1)$ are functions of coordinates $z_2$ and $z_3$ only. Thus $E_1 = \{z_1e_1 : z_1 = 0,\ldots,M\}\subset\mathcal R(S_1)$ and $E_0 = \mathcal N(S_1^T) = \{(z_2e_2,z_3e_3) : z_2,z_3\ge 0\}$. For $h\in\mathcal D(L_0) = C^1(E_0)$,
\[
L_0h(z) = \big(\kappa_2(M - z_1) - \kappa_1z_1z_2\big)\partial_{z_2}h(z) + \kappa_3(M - z_1)\partial_{z_3}h(z).
\]
Taking $V_0^N = (Z_2^N,Z_3^N)$, the compensator for $V_0^N$ in (2.3) is
\[
F^N(z) = \big(\kappa_2(M - z_1) - \kappa_1z_1z_2,\ \kappa_3(M - z_1)\big)^T,
\]
so $F(z) = F^N(z)$ and $G_0(z)\equiv 0$. On the fast time-scale, the process whose generator is $L_1$ is a Markov chain on $E_1$ describing the dynamics of an urn scheme with a total of $M$ molecules, and for a fixed value of $z_2$, $z_3$, with transition rates $\kappa_1z_2$ for outflow and $\kappa_2+\kappa_3$ for inflow. Its stationary distribution $\mu_{z_2,z_3}(z_1)$ is binomial($M$, $p(z_2)$) for
\[
p(z_2) = \frac{\kappa_2+\kappa_3}{\kappa_2+\kappa_3+\kappa_1z_2},
\]
so $\int z_1\,\mu_{z_2,z_3}(dz_1) = Mp(z_2)$. This observation implies that the averaged value for the drift $F$ is
\[
\bar F(z_2,z_3) = \Big(-M\frac{\kappa_1\kappa_3z_2}{\kappa_2+\kappa_3+\kappa_1z_2},\ M\frac{\kappa_1\kappa_3z_2}{\kappa_2+\kappa_3+\kappa_1z_2}\Big)^T = -\kappa_3M\big(1 - p(z_2)\big)\begin{pmatrix}1\\-1\end{pmatrix}
\]
with
\[
\nabla\bar F = -M\frac{\kappa_1\kappa_3(\kappa_2+\kappa_3)}{(\kappa_2+\kappa_3+\kappa_1z_2)^2}\begin{pmatrix}1 & 0\\-1 & 0\end{pmatrix},
\]
and we need to solve the Poisson equation
\[
L_1h_1(z) = \Big(\kappa_2(M - z_1) - \kappa_1z_1z_2 + M\frac{\kappa_1\kappa_3z_2}{\kappa_2+\kappa_3+\kappa_1z_2},\ \kappa_3(M - z_1) - M\frac{\kappa_1\kappa_3z_2}{\kappa_2+\kappa_3+\kappa_1z_2}\Big)^T
\]
for $h_1$. Trying $h_1$ of the form $h_1(z) = (z_1u_1(z_2),z_1u_2(z_2))^T$, we have
\[
L_1h_1(z) = \big(-\kappa_1z_1z_2u_1(z_2) + (\kappa_2+\kappa_3)(M - z_1)u_1(z_2),\ -\kappa_1z_1z_2u_2(z_2) + (\kappa_2+\kappa_3)(M - z_1)u_2(z_2)\big)^T,
\]
and equating terms with the same power of $z_1$, we get $u_1(z_2) = (\kappa_1z_2+\kappa_2)/(\kappa_1z_2+\kappa_2+\kappa_3)$ and $u_2(z_2) = \kappa_3/(\kappa_1z_2+\kappa_2+\kappa_3)$. Note that $u_1(z_2) + u_2(z_2) = 1$. Thus
\[
h_1(z) = \Big(\frac{z_1(\kappa_1z_2+\kappa_2)}{\kappa_1z_2+\kappa_2+\kappa_3},\ \frac{\kappa_3z_1}{\kappa_1z_2+\kappa_2+\kappa_3}\Big)^T = z_1\big(u_1(z_2),\ 1 - u_1(z_2)\big)^T,
\]
and $H^N(z) = N^{-1}h_1(z)$. Examining the quadratic variation of $V_0^N - H^N\circ V^N$, we see that $r_N$ must be $N^{1/2}$, and by (5.4), it follows that $G_1 = 0$ in (2.7).
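The trial solution for the Poisson equation can be verified directly. The sketch below applies the fast generator $L_1$ to $h_1(z) = z_1(u_1(z_2),u_2(z_2))^T$ on the whole state space $z_1 = 0,\ldots,M$ and compares the result with $F - \bar F$ componentwise. The numerical parameter values are arbitrary illustrative choices, not those of the simulations reported in the text, and the function names are hypothetical.

```python
def h1(z1, z2, k1, k2, k3):
    """Trial solution z1 * (u1(z2), u2(z2)) of the Poisson equation."""
    d = k1 * z2 + k2 + k3
    return (z1 * (k1 * z2 + k2) / d, z1 * k3 / d)


def fast_generator_h1(z1, z2, M, k1, k2, k3):
    """L1 h1: z1 -> z1 - 1 at rate k1 z1 z2, z1 -> z1 + 1 at rate (k2+k3)(M - z1)."""
    down = k1 * z1 * z2
    up = (k2 + k3) * (M - z1)
    here = h1(z1, z2, k1, k2, k3)
    below = h1(z1 - 1, z2, k1, k2, k3)
    above = h1(z1 + 1, z2, k1, k2, k3)
    return tuple(down * (b - c) + up * (a - c)
                 for a, b, c in zip(above, below, here))


def centered_drift(z1, z2, M, k1, k2, k3):
    """F(z) minus the averaged drift, componentwise."""
    avg = M * k1 * k3 * z2 / (k2 + k3 + k1 * z2)
    return (k2 * (M - z1) - k1 * z1 * z2 + avg, k3 * (M - z1) - avg)


def max_residual(M, k1, k2, k3, z2_values):
    worst = 0.0
    for z2 in z2_values:
        for z1 in range(M + 1):
            lhs = fast_generator_h1(z1, z2, M, k1, k2, k3)
            rhs = centered_drift(z1, z2, M, k1, k2, k3)
            worst = max(worst, abs(lhs[0] - rhs[0]), abs(lhs[1] - rhs[1]))
    return worst


# Illustrative parameter values only.
residual = max_residual(M=5, k1=0.1, k2=5.0, k3=1.0, z2_values=(0.1, 0.5, 1.0, 2.0))
print(residual)
```

The residual is zero up to rounding error for every state. At the boundary states $z_1 = 0$ and $z_1 = M$ the out-of-range evaluations of $h_1$ are multiplied by a zero rate, so they do not affect the result.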
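Finally, letting $z^{\otimes2} = zz^T$ and writing $u_1 = u_1(Z_2^N(s))$ inside the integrals,
\[
\begin{aligned}
N[V_0^N - H^N\circ V^N]_t &= N^{-1}\sum_{k=1}^3\int_0^t\Big((\zeta_{2k},\zeta_{3k})^T + h_1(Z^N(s-)) - h_1(Z^N(s-) + \Lambda_N\zeta_k)\Big)^{\otimes2}\,dR_k^N(s)\\
&\approx \int_0^t\Big(\begin{pmatrix}-1\\0\end{pmatrix} + \begin{pmatrix}u_1\\1-u_1\end{pmatrix}\Big)^{\otimes2}\kappa_1Z_1^N(s)Z_2^N(s)\,ds + \int_0^t\Big(\begin{pmatrix}1\\0\end{pmatrix} - \begin{pmatrix}u_1\\1-u_1\end{pmatrix}\Big)^{\otimes2}\kappa_2\big(M - Z_1^N(s)\big)\,ds\\
&\quad + \int_0^t\Big(\begin{pmatrix}0\\1\end{pmatrix} - \begin{pmatrix}u_1\\1-u_1\end{pmatrix}\Big)^{\otimes2}\kappa_3\big(M - Z_1^N(s)\big)\,ds\\
&\approx \int_0^t\begin{pmatrix}(1-u_1)^2 & -(1-u_1)^2\\-(1-u_1)^2 & (1-u_1)^2\end{pmatrix}\kappa_1Z_1^N(s)Z_2^N(s)\,ds + \int_0^t\begin{pmatrix}(1-u_1)^2 & -(1-u_1)^2\\-(1-u_1)^2 & (1-u_1)^2\end{pmatrix}\kappa_2\big(M - Z_1^N(s)\big)\,ds\\
&\quad + \int_0^t\begin{pmatrix}u_1^2 & -u_1^2\\-u_1^2 & u_1^2\end{pmatrix}\kappa_3\big(M - Z_1^N(s)\big)\,ds,
\end{aligned}
\]
and averaging $Z_1^N$ gives
\[
\lim_{N\to\infty}N[V_0^N - H^N\circ V^N]_t = \int_0^t\bar G(Z(s))\,ds = \int_0^t\begin{pmatrix}g(Z_2(s)) & -g(Z_2(s))\\-g(Z_2(s)) & g(Z_2(s))\end{pmatrix}ds,
\]
where $Z = (Z_2,Z_3)$ satisfies
\[
Z(t) = Z(0) + \int_0^tM\frac{\kappa_1\kappa_3Z_2(s)}{\kappa_2+\kappa_3+\kappa_1Z_2(s)}\begin{pmatrix}-1\\1\end{pmatrix}ds
\]
and
\[
g(z_2) = M\big(1 - u_1(z_2)\big)^2\big(\kappa_1p(z_2)z_2 + \kappa_2(1 - p(z_2))\big) + Mu_1(z_2)^2\kappa_3\big(1 - p(z_2)\big).
\]
Let $U^N = N^{1/2}(Z_2^N - Z_2,\ Z_3^N - Z_3)^T$. Then
\[
\sup_{s\le t}\big|\big(Z_2^N(s) - Z_2(s),\ Z_3^N(s) - Z_3(s)\big)\big|\Rightarrow 0\quad\text{and}\quad U^N\Rightarrow U,
\]
where $U = (U_2,U_3)^T$ satisfies
\[
U(t) = U(0) + \int_0^t\begin{pmatrix}-1\\1\end{pmatrix}\sqrt{g(Z_2(s))}\,dW(s) + \int_0^t\frac{M\kappa_1\kappa_3(\kappa_2+\kappa_3)}{(\kappa_2+\kappa_3+\kappa_1Z_2(s))^2}\begin{pmatrix}-1\\1\end{pmatrix}U_2(s)\,ds
\]
for $W$ a standard scalar Brownian motion. The corresponding diffusion approximation is
\[
\begin{pmatrix}D_2^N(t)\\D_3^N(t)\end{pmatrix} = \begin{pmatrix}Z_2^N(0)\\Z_3^N(0)\end{pmatrix} + N^{-1/2}\int_0^t\begin{pmatrix}-1\\1\end{pmatrix}\sqrt{g(D_2^N(s))}\,dW(s) + \int_0^tM\frac{\kappa_1\kappa_3D_2^N(s)}{\kappa_2+\kappa_3+\kappa_1D_2^N(s)}\begin{pmatrix}-1\\1\end{pmatrix}ds.
\]
We compare simulations for 500 realizations of the original model with 500 realizations of the Gaussian approximation $\big(N_0Z_2(\cdot) + N_0^{1/2}U_2(\cdot),\ N_0Z_3(\cdot) + N_0^{1/2}U_3(\cdot)\big)$ and of the diffusion approximation $\big(N_0D_2^{N_0}(\cdot),\ N_0D_3^{N_0}(\cdot)\big)$. For comparison we also give the deterministic value given by $\big(N_0Z_2(\cdot),\ N_0Z_3(\cdot)\big)$.

Fig. 3. Mean and standard deviation of the amount of substrate in the Michaelis–Menten model [500 simulations with parameters $N_0 = 100$, $\gamma = 0$, $X_1(0) = X_3(0) = 0$, $X_2(0) = 50$, $M = 5$, $\kappa_1' = 0.1$, $\kappa_2' = 500$, $\kappa_3' = 100$].

We use $N_0 = 100$ and a time interval on the scale $\gamma = 0$.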
The initial values are set to $X_1(0) = X_3(0) = 0$, $X_2(0) = 50$ and $M = 5$, $\kappa_1' = 0.1$, $\kappa_2' = 500$ and $\kappa_3' = 100$. Figure 3 shows the mean and one standard deviation above and below the mean for each of the three processes, and Figure 4 shows five trajectories for the three processes. In this example, both Gaussian and diffusion approximations give good approximations for the means and the standard deviations of the pair of processes $X_2(\cdot)$, $X_3(\cdot)$.

Fig. 4. Five trajectories of the amount of substrate in the Michaelis–Menten model (parameters as in Figure 3).

5.3. Another enzyme model. Another model for an enzymatic reaction includes an additional form for the enzyme which cannot bind to the substrate. There are now four species, substrate, active enzyme, enzyme-substrate complex and inactive enzyme, involved in five reactions:
\[
\begin{array}{ll}
(1) & S + E\xrightarrow{\kappa_1'}SE,\\
(2) & SE\xrightarrow{\kappa_2'}S + E,\\
(3) & SE\xrightarrow{\kappa_3'}P + E,\\
(4) & F\xrightarrow{\kappa_4'}E,\\
(5) & E\xrightarrow{\kappa_5'}F,
\end{array}
\]
with mass-action kinetics and rate constants such that $\kappa_1' = O(1)$, $\kappa_2',\kappa_3' = O(N)$, $\kappa_4',\kappa_5' = O(N^2)$ so that $\kappa_1' = \kappa_1$, $\kappa_2' = \kappa_2N$, $\kappa_3' = \kappa_3N$, $\kappa_4' = \kappa_4N^2$, $\kappa_5' = \kappa_5N^2$. We denote $E$, $S$, $F$ as species 1, 2 and 3, respectively, and let $X_i(t)$ be the number of molecules of species $i$ in the system at time $t$. The total number $M$ of active, inactive and substrate-bound enzyme molecules is conserved. The stochastic model is
\[
\begin{aligned}
X_1(t) &= X_1(0) - Y_1\Big(\int_0^t\kappa_1'X_1(s)X_2(s)\,ds\Big) + Y_2\Big(\int_0^t\kappa_2'\big(M - X_1(s) - X_3(s)\big)\,ds\Big)\\
&\quad + Y_3\Big(\int_0^t\kappa_3'\big(M - X_1(s) - X_3(s)\big)\,ds\Big) + Y_4\Big(\int_0^t\kappa_4'X_3(s)\,ds\Big) - Y_5\Big(\int_0^t\kappa_5'X_1(s)\,ds\Big),\\
X_2(t) &= X_2(0) - Y_1\Big(\int_0^t\kappa_1'X_1(s)X_2(s)\,ds\Big) + Y_2\Big(\int_0^t\kappa_2'\big(M - X_1(s) - X_3(s)\big)\,ds\Big),\\
X_3(t) &= X_3(0) - Y_4\Big(\int_0^t\kappa_4'X_3(s)\,ds\Big) + Y_5\Big(\int_0^t\kappa_5'X_1(s)\,ds\Big).
\end{aligned}
\]
If the initial amount of substrate is $O(N)\gg M$, then the scaling exponents for the species abundances are $\alpha_1 = 0$, $\alpha_2 = 1$, $\alpha_3 = 0$, and the scaling exponents for the rate constants are
\[
\beta_1 = 0,\quad\beta_2 = 1,\quad\beta_3 = 1,\quad\beta_4 = 2,\quad\beta_5 = 2.
\]
The normalized system becomes
\[
\begin{aligned}
Z_1^N(t) &= Z_1^N(0) - Y_1\Big(\int_0^tN\kappa_1Z_1^N(s)Z_2^N(s)\,ds\Big) + Y_2\Big(\int_0^tN\kappa_2\big(M - Z_1^N(s) - Z_3^N(s)\big)\,ds\Big)\\
&\quad + Y_3\Big(\int_0^tN\kappa_3\big(M - Z_1^N(s) - Z_3^N(s)\big)\,ds\Big) + Y_4\Big(\int_0^tN^2\kappa_4Z_3^N(s)\,ds\Big) - Y_5\Big(\int_0^tN^2\kappa_5Z_1^N(s)\,ds\Big),\\
Z_2^N(t) &= Z_2^N(0) - N^{-1}Y_1\Big(\int_0^tN\kappa_1Z_1^N(s)Z_2^N(s)\,ds\Big) + N^{-1}Y_2\Big(\int_0^tN\kappa_2\big(M - Z_1^N(s) - Z_3^N(s)\big)\,ds\Big),\\
Z_3^N(t) &= Z_3^N(0) - Y_4\Big(\int_0^tN^2\kappa_4Z_3^N(s)\,ds\Big) + Y_5\Big(\int_0^tN^2\kappa_5Z_1^N(s)\,ds\Big).
\end{aligned}
\]
The fastest time-scale has $m_2 = 2$ and $r_{2,N} = N^2$, with $\zeta_{2,4} = e_1 - e_3$, $\zeta_{2,5} = -e_1 + e_3$. The operator $L_2$ is
\[
L_2h(z) = \kappa_4z_3\big(h(z + e_1 - e_3) - h(z)\big) + \kappa_5z_1\big(h(z - e_1 + e_3) - h(z)\big),
\]
with $\ker(L_2)$ consisting of functions of coordinates $z_2$ and $z_1 + z_3$ only. To simplify our calculations we make a change of variables to $(v_0,v_1,v_2) = (z_2,\ z_1+z_3,\ z_3)$. Reaction 4 lowers $z_3 = v_2$ by one and leaves $z_1 + z_3$ unchanged, so in this system of variables $\zeta_{2,4} = -\hat e_2$, $\zeta_{2,5} = \hat e_2$, with the operator
\[
L_2h(v) = \kappa_4v_2\big(h(v - \hat e_2) - h(v)\big) + \kappa_5(v_1 - v_2)\big(h(v + \hat e_2) - h(v)\big).
\]
Functions $h(v)\in\ker(L_2)$ are now functions of $v_0$, $v_1$ only. Thus $E_2 = \mathcal R(S_2) = \mathrm{span}\{\hat e_2\}$ and $E_1\times E_0 = \mathcal N(S_2^T) = \mathrm{span}\{\hat e_1,\hat e_0\}$. The next time-scale has $m_1 = 1$, $r_{1,N} = N$ and $\zeta_{1,1} = (0,-1)$, $\zeta_{1,2} = \zeta_{1,3} = (0,1)$. Also
\[
L_1h(v) = \kappa_1v_0(v_1 - v_2)\big(h(v_0,v_1-1) - h(v_0,v_1)\big) + (\kappa_2+\kappa_3)(M - v_1)\big(h(v_0,v_1+1) - h(v_0,v_1)\big)
\]
with $\ker(L_1)$ consisting of functions of $v_0$ only. Thus $E_1 = \mathcal R(S_1) = \mathrm{span}\{\hat e_1\}$ and $E_0 = \mathcal N(S_1^T) = \mathrm{span}\{\hat e_0\}$. Finally, $L_0$ is
\[
L_0h(v) = -\kappa_1v_0(v_1 - v_2)\partial_{v_0}h(v_0) + \kappa_2(M - v_1)\partial_{v_0}h(v_0).
\]
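The effect of the change of variables on the reaction vectors can be tabulated mechanically. In the sketch below the matrix `T` encodes $(v_0,v_1,v_2) = (z_2,\ z_1+z_3,\ z_3)$ and is applied to the unscaled molecule-count jumps of the five reactions, with species ordered $(E,S,F)$. The names `T`, `zeta` and `to_v` are illustrative.

```python
# Rows of T give v0 = z2, v1 = z1 + z3, v2 = z3.
T = [[0, 1, 0],
     [1, 0, 1],
     [0, 0, 1]]

# Unscaled jumps in (z1, z2, z3) = (E, S, F) for reactions 1..5.
zeta = {
    1: (-1, -1, 0),   # S + E -> SE
    2: (1, 1, 0),     # SE -> S + E
    3: (1, 0, 0),     # SE -> P + E
    4: (1, 0, -1),    # F -> E
    5: (-1, 0, 1),    # E -> F
}


def to_v(jump):
    """Express a jump of z in the coordinates (v0, v1, v2)."""
    return tuple(sum(T[r][c] * jump[c] for c in range(3)) for r in range(3))


jumps_v = {k: to_v(z) for k, z in zeta.items()}
for k in sorted(jumps_v):
    print(k, jumps_v[k])
```

Reactions 4 and 5 move only $v_2$ (by $-1$ and $+1$ respectively), which matches the $\kappa_4v_2$ and $\kappa_5(v_1 - v_2)$ terms of $L_2$ in the $v$ variables. Reactions 1 to 3 move $v_1$ by $\mp1$ and leave $v_2$ fixed. Their $v_0$ components carry the factor $N^{-1}$ after normalization, since $\alpha_2 = 1$, and therefore vanish from the limiting vectors $\zeta_{1,k}$.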
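The conditional stationary distribution $\mu_{v_0,v_1}(dv_2)$ of the Markov chain with generator $L_2$ is such that $\rho_0(v_0,v_1) = \int v_2\,\mu_{v_0,v_1}(dv_2) = \frac{v_1\kappa_5}{\kappa_4+\kappa_5}$, thus
\[
\bar L_1h(v) = \kappa_1v_0v_1\frac{\kappa_4}{\kappa_4+\kappa_5}\big(h(v_0,v_1-1) - h(v_0,v_1)\big) + (\kappa_2+\kappa_3)(M - v_1)\big(h(v_0,v_1+1) - h(v_0,v_1)\big),
\]
which has conditional stationary distribution $\mu_{v_0}(dv_1)$ such that
\[
\begin{aligned}
\rho_1(v_0) &= \int v_1\,\mu_{v_0}(dv_1) = \frac{M(\kappa_4+\kappa_5)(\kappa_2+\kappa_3)}{\kappa_1\kappa_4v_0 + (\kappa_4+\kappa_5)(\kappa_2+\kappa_3)},\\
\rho_2(v_0) &= \int\!\!\int v_2\,\mu_{v_0,v_1}(dv_2)\,\mu_{v_0}(dv_1) = \frac{M\kappa_5(\kappa_2+\kappa_3)}{\kappa_1\kappa_4v_0 + (\kappa_4+\kappa_5)(\kappa_2+\kappa_3)}.
\end{aligned}
\]
The compensator for the process $V_0^N$ is
\[
F^N(v) = \kappa_2(M - v_1) - \kappa_1v_0(v_1 - v_2) = F(v),
\]
and averaging $F$ gives $\bar F^1(v_0,v_1) = \kappa_2(M - v_1) - \kappa_1v_0\big(v_1 - \rho_0(v_0,v_1)\big)$, and
\[
\bar F(v_0) = \kappa_2\big(M - \rho_1(v_0)\big) - \kappa_1v_0\big(\rho_1(v_0) - \rho_2(v_0)\big) = -\frac{M\kappa_1\kappa_3\kappa_4v_0}{\kappa_1\kappa_4v_0 + (\kappa_4+\kappa_5)(\kappa_2+\kappa_3)},
\]
so
\[
\nabla\bar F(v_0) = -\frac{M\kappa_1\kappa_3\kappa_4(\kappa_4+\kappa_5)(\kappa_2+\kappa_3)}{\big(\kappa_1\kappa_4v_0 + (\kappa_4+\kappa_5)(\kappa_2+\kappa_3)\big)^2}.
\]
Setting
\[
u_1(v_0) = \frac{\kappa_1\kappa_4v_0 + \kappa_2(\kappa_4+\kappa_5)}{\kappa_1\kappa_4v_0 + (\kappa_2+\kappa_3)(\kappa_4+\kappa_5)},\qquad u_2(v_0) = \frac{\kappa_1v_0}{\kappa_4+\kappa_5},
\]
and
\[
u_3(v_0) = -\frac{\kappa_1v_0}{\kappa_4+\kappa_5}u_1(v_0) = -\frac{\big(\kappa_1\kappa_4v_0 + \kappa_2(\kappa_4+\kappa_5)\big)\kappa_1v_0}{\big(\kappa_1\kappa_4v_0 + (\kappa_2+\kappa_3)(\kappa_4+\kappa_5)\big)(\kappa_4+\kappa_5)},
\]
the solutions to the Poisson equations are given by functions
\[
h_1(v) = v_1u_1(v_0),\qquad h_2(v) = -v_2u_2(v_0),\qquad h_3(v) = -v_2u_3(v_0),
\]
and $H^N = \frac1Nh_1 + \frac1{N^2}(h_2+h_3)$. Let $r_N = N^{1/2}$ and observe that $\frac1{N^2}(h_2+h_3)$ makes a negligible contribution to the quadratic variation. Consequently, suppressing the time argument $s$ in the last line,
\[
\begin{aligned}
N[V_0^N - H^N\circ V^N]_t &\approx \sum_{k=1}^5N^{-1}\int_0^t\big(\zeta_{2k} + h_1(V^N(s-)) - h_1(V^N(s-) + T\Lambda_N\zeta_k)\big)^2\,dR_k^N(s)\\
&\approx \int_0^t\big(-1 + u_1(V_0^N)\big)^2\kappa_1V_0^N(V_1^N - V_2^N)\,ds + \int_0^t\big(1 - u_1(V_0^N)\big)^2\kappa_2(M - V_1^N)\,ds + \int_0^tu_1(V_0^N)^2\kappa_3(M - V_1^N)\,ds.
\end{aligned}
\]
Hence
\[
G(v) = \frac{\big(\kappa_3(\kappa_4+\kappa_5)\big)^2\big(\kappa_1v_0(v_1 - v_2) + \kappa_2(M - v_1)\big) + \big(\kappa_1\kappa_4v_0 + \kappa_2(\kappa_4+\kappa_5)\big)^2\kappa_3(M - v_1)}{\big(\kappa_1\kappa_4v_0 + (\kappa_2+\kappa_3)(\kappa_4+\kappa_5)\big)^2}
\]
and
\[
\bar G(v_0) = \frac{M\kappa_1\kappa_3\kappa_4v_0\big(\kappa_3(\kappa_4+\kappa_5)^2(2\kappa_2+\kappa_3) + (\kappa_1\kappa_4v_0 + \kappa_2(\kappa_4+\kappa_5))^2\big)}{\big(\kappa_1\kappa_4v_0 + (\kappa_2+\kappa_3)(\kappa_4+\kappa_5)\big)^3}.
\]
If $V_0$ is the solution of
\[
V_0(t) = V_0(0) - \int_0^t\frac{M\kappa_1\kappa_3\kappa_4V_0(s)}{\kappa_1\kappa_4V_0(s) + (\kappa_4+\kappa_5)(\kappa_2+\kappa_3)}\,ds,
\]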
then, since $G_0 = G_1\equiv 0$, $U^N = N^{1/2}(V_0^N - V_0)\Rightarrow U$ where
\[
U(t) = U(0) + \int_0^t\sqrt{\bar G(V_0(s))}\,dW_s - \int_0^t\frac{M\kappa_1\kappa_3\kappa_4(\kappa_4+\kappa_5)(\kappa_2+\kappa_3)}{\big(\kappa_1\kappa_4V_0(s) + (\kappa_4+\kappa_5)(\kappa_2+\kappa_3)\big)^2}U(s)\,ds.
\]
The corresponding diffusion approximation is
\[
D^N(t) = Z_2^N(0) + N^{-1/2}\int_0^t\sqrt{\bar G(D^N(s))}\,dW(s) - \int_0^t\frac{M\kappa_1\kappa_3\kappa_4D^N(s)}{\kappa_1\kappa_4D^N(s) + (\kappa_4+\kappa_5)(\kappa_2+\kappa_3)}\,ds.
\]
Finally, we compare simulations for 500 realizations of the original model $X_2$ with 500 realizations of the Gaussian approximation $N_0V_0(\cdot) + N_0^{1/2}U(\cdot)$ and the diffusion approximation $N_0D^{N_0}(\cdot)$. For comparison we also give the deterministic value given by $N_0V_0(\cdot)$. We use $N_0 = 100$, a time interval on the scale $\gamma = 0$, and initial values are set to $X_1(0) = X_3(0) = 0$, $X_2(0) = 50$ as in the previous example. Here the additional parameters are set to $M = 5$, $\kappa_1' = 0.5$, $\kappa_2' = 500$, $\kappa_3' = 100$ and $\kappa_4' = \kappa_5' = 5000$. Figure 5 shows the mean and one standard deviation above and below the mean for each of the three processes, and Figure 6 shows five trajectories for the three processes. Again, both Gaussian and diffusion approximations give a good approximation for the mean and the standard deviation from the mean of $X_2(\cdot)$.

Fig. 5. Mean and standard deviations of the amount of substrate in the three time-scale enzyme model [500 simulations with $N_0 = 100$, $M = 5$, $\gamma = 0$, $X_1(0) = 0$, $X_2(0) = 50$, $X_3(0) = 0$, $\kappa_1' = 0.5$, $\kappa_2' = 500$, $\kappa_3' = 100$, $\kappa_4' = \kappa_5' = 5000$].

Fig. 6. Five trajectories for the amount of substrate in the three time-scale enzyme model (same parameters as in Figure 5).

APPENDIX

A.1. Martingale central limit theorem. Various versions of the martingale central limit theorem have been given by McLeish (1974), Rootzén (1977, 1980), Gänssler and Häusler (1979) and Rebolledo (1980) among others. The following version is from Ethier and Kurtz (1986), Theorem 7.1.4.

THEOREM A.1. Let $\{M_n\}$ be a sequence of $\mathbb R^d$-valued martingales.
Suppose
\[
\lim_{n\to\infty}E\Big[\sup_{s\le t}|M_n(s) - M_n(s-)|\Big] = 0 \tag{A.1}
\]
and
\[
[M_n^i,M_n^j]_t\to c_{i,j}(t)
\]
for all $t\ge 0$, where $C = ((c_{i,j}))$ is deterministic and continuous. Then $M_n\Rightarrow M$, where $M$ is Gaussian with independent increments and $E[M(t)M(t)^T] = C(t)$.

REMARK A.2. Note that $C(t) - C(s)$ is nonnegative definite for $t\ge s\ge 0$. If $C$ is absolutely continuous, then the derivative will also be nonnegative definite and will have a nonnegative definite square root. Suppose $\dot C(t) = \sigma(t)^2$ where $\sigma$ is symmetric. Then $M$ can be written as
\[
M(t) = \int_0^t\sigma(s)\,dW(s),
\]
where $W$ is $d$-dimensional standard Brownian motion.

REFERENCES

Ball, K., Kurtz, T. G., Popovic, L. and Rempala, G. (2006). Asymptotic analysis of multiscale approximations to reaction networks. Ann. Appl. Probab. 16 1925–1961. MR2288709
Bhattacharya, R. N. (1982). On the functional central limit theorem and the law of the iterated logarithm for Markov processes. Z. Wahrsch. Verw. Gebiete 60 185–201. MR0663900
Costa, O. L. V. and Dufour, F. (2003). On the Poisson equation for piecewise-deterministic Markov processes. SIAM J. Control Optim. 42 985–1001 (electronic). MR2002143
Davis, M. H. A. (1993). Markov Models and Optimization. Monographs on Statistics and Applied Probability 49. Chapman & Hall, London. MR1283589
Ethier, S. N. and Kurtz, T. G. (1986). Markov Processes: Characterization and Convergence. Wiley, New York. MR0838085
Ethier, S. N. and Nagylaki, T. (1980). Diffusion approximations of Markov chains with two time scales and applications to population genetics. Adv. in Appl. Probab. 12 14–49. MR0552945
Gänssler, P. and Häusler, E. (1979). Remarks on the functional central limit theorem for martingales. Z. Wahrsch. Verw. Gebiete 50 237–243. MR0554543
Glynn, P. W. and Meyn, S. P. (1996). A Liapounov bound for solutions of the Poisson equation. Ann. Probab. 24 916–931. MR1404536
Haseltine, E. L. and Rawlings, J. B. (2002).
Approximate simulation of coupled fast and slow reactions for stochastic chemical kinetics. J. Chem. Phys. 117 6959–6969.
Kang, H.-W. and Kurtz, T. G. (2013). Separation of time-scales and model reduction for stochastic reaction networks. Ann. Appl. Probab. 23 529–583. MR3059268
Kurtz, T. G. (1971). Limit theorems for sequences of jump Markov processes approximating ordinary differential processes. J. Appl. Probab. 8 344–356. MR0287609
Kurtz, T. G. (1977/78). Strong approximation theorems for density dependent Markov chains. Stochastic Process. Appl. 6 223–240. MR0464414
Kurtz, T. G. (1992). Averaging for martingale problems and stochastic approximation. In Applied Stochastic Analysis (New Brunswick, NJ, 1991). Lecture Notes in Control and Inform. Sci. 177 186–209. Springer, Berlin. MR1169928
McLeish, D. L. (1974). Dependent central limit theorems and invariance principles. Ann. Probab. 2 620–628. MR0358933
Rebolledo, R. (1980). Central limit theorems for local martingales. Z. Wahrsch. Verw. Gebiete 51 269–286. MR0566321
Rootzén, H. (1977). On the functional central limit theorem for martingales. Z. Wahrsch. Verw. Gebiete 38 199–210. MR0482956
Rootzén, H. (1980). On the functional central limit theorem for martingales. II. Z. Wahrsch. Verw. Gebiete 51 79–93. MR0566110
Srivastava, R., You, L., Summers, J. and Yin, J. (2002). Stochastic vs. deterministic modeling of intracellular viral kinetics. J. Theoret. Biol. 218 309–321. MR2026325
van Kampen, N. G. (1961). A power series expansion of the master equation. Canad. J. Phys. 39 551–567. MR0128921

H.-W. Kang
Mathematical Biosciences Institute
Ohio State University
1735 Neil Ave.
Columbus, Ohio 43210
USA
E-mail: kang235@mbi.osu.edu

T. G. Kurtz
Departments of Mathematics and Statistics
University of Wisconsin
480 Lincoln Dr.
Madison, Wisconsin 53706
USA
E-mail: kurtz@math.wisc.edu

L.
Popovic
Department of Mathematics and Statistics
Concordia University
1450 de Maisonneuve Blvd. West
Montreal, Quebec H3G1M8
Canada
E-mail: lpopovic@mathstat.concordia.ca