Strong Approximation for the Sums of Squares of Augmented GARCH Sequences

Alexander Aue¹, István Berkes², Lajos Horváth³

Abstract: We study so-called augmented GARCH sequences, which include many submodels of considerable interest, such as polynomial and exponential GARCH. To model the returns of speculative assets, it is particularly important to understand the behaviour of the squares of the observations. The main aim of this paper is to present a strong approximation for the sum of the squares. This will be achieved by an approximation of the volatility sequence with a sequence of blockwise independent random variables. Furthermore, we derive a necessary and sufficient condition for the existence of a unique (strictly) stationary solution of the general augmented GARCH equations. Also, necessary and sufficient conditions for the finiteness of moments are provided.

AMS 2000 Subject Classification: Primary 60F15, Secondary 62M10.

Keywords and Phrases: augmented GARCH processes, stationary solutions, strong approximation, moments, partial sums

¹ Department of Mathematics, University of Utah, 155 South 1440 East, Salt Lake City, UT 84112-0090, USA, email: aue@math.utah.edu
² Institut für Statistik, Technische Universität Graz, Steyrergasse 17/IV, A-8010 Graz, Austria, email: berkes@stat.tu-graz.ac.at
³ Department of Mathematics, University of Utah, 155 South 1440 East, Salt Lake City, UT 84112-0090, USA, email: horvath@math.utah.edu

Corresponding author: A. Aue, phone: +1-801-581-5231, fax: +1-801-581-4148. Research partially supported by NATO grant PST.EAP.CLG 980599 and NSF-OTKA grant INT-0223262.

1 Introduction

Starting with the fundamental paper of Engle (1982), autoregressive conditionally heteroskedastic (ARCH) processes and their generalizations have played a central role in finance and macroeconomics. One important class of these generalizations can be subsumed under the notion of augmented GARCH(1,1) processes introduced by Duan (1997). He studied random variables $\{y_k : -\infty < k < \infty\}$ satisfying the equations

(1.1)  $y_k = \sigma_k \varepsilon_k$

and

(1.2)  $\Lambda(\sigma_k^2) = c(\varepsilon_{k-1}) \Lambda(\sigma_{k-1}^2) + g(\varepsilon_{k-1})$,

where $\Lambda(x)$, $c(x)$ and $g(x)$ are real-valued functions. To solve for $\sigma_k^2$, the definitions in (1.1) and (1.2) require that

(1.3)  $\Lambda^{-1}(x)$ exists.

Throughout the paper we assume that

(1.4)  $\{\varepsilon_k : -\infty < k < \infty\}$ is a sequence of independent, identically distributed random variables.

We start by giving a variety of examples which are included in the framework of the augmented GARCH model. Since all submodels must satisfy (1.1), only the corresponding specific equations (1.2) are stated. Moreover, it can be seen that (1.3) always holds true.
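To make the recursion concrete before turning to the examples, the following sketch (ours, not part of Duan's development) simulates (1.1)-(1.2) for user-supplied coefficient functions $c$ and $g$ and inverse link $\Lambda^{-1}$; the standard normal innovation law and all parameter values are illustrative assumptions only.

```python
import numpy as np

def simulate_augmented_garch(Lambda_inv, c, g, n, burn_in=1000, rng=None):
    """Simulate y_k = sigma_k * eps_k, where
    Lambda(sigma_k^2) = c(eps_{k-1}) * Lambda(sigma_{k-1}^2) + g(eps_{k-1}).
    The recursion is carried in lam = Lambda(sigma_k^2), so only the
    inverse Lambda^{-1} is needed, mirroring assumption (1.3)."""
    rng = np.random.default_rng(rng)
    eps = rng.standard_normal(n + burn_in)   # iid innovations, cf. (1.4)
    lam = 0.0                                # arbitrary start, forgotten under (2.5)
    y = np.empty(n + burn_in)
    for k in range(n + burn_in):
        if k > 0:
            lam = c(eps[k - 1]) * lam + g(eps[k - 1])
        y[k] = np.sqrt(Lambda_inv(lam)) * eps[k]
    return y[burn_in:]

# Example 1.1 below (GARCH(1,1)): Lambda(x) = x, c(e) = beta + alpha*e**2, g(e) = omega
omega, alpha, beta = 0.1, 0.1, 0.8           # illustrative parameter values
y = simulate_augmented_garch(lambda x: x, lambda e: beta + alpha * e**2,
                             lambda e: omega, n=10_000, rng=42)
```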
Example 1.1 (GARCH) Bollerslev (1986) introduced the process
$$\sigma_k^2 = \omega + \beta \sigma_{k-1}^2 + \alpha y_{k-1}^2 = \omega + \left[\beta + \alpha \varepsilon_{k-1}^2\right] \sigma_{k-1}^2.$$

Example 1.2 (AGARCH) This asymmetric model for volatility was defined by Ding, Granger and Engle (1993) as
$$\sigma_k^2 = \omega + \beta \sigma_{k-1}^2 + \alpha (|y_{k-1}| - \mu y_{k-1})^2 = \omega + \left[\beta + \alpha (|\varepsilon_{k-1}| - \mu \varepsilon_{k-1})^2\right] \sigma_{k-1}^2.$$

Example 1.3 (VGARCH) Engle and Ng (1993) used the equations
$$\sigma_k^2 = \omega + \beta \sigma_{k-1}^2 + \alpha (\varepsilon_{k-1} - \mu)^2$$
to study the impact of news on volatility.

Example 1.4 (NGARCH) This nonlinear asymmetric model was also introduced by Engle and Ng (1993):
$$\sigma_k^2 = \omega + \beta \sigma_{k-1}^2 + \alpha (\varepsilon_{k-1} - \mu)^2 \sigma_{k-1}^2.$$

Example 1.5 (GJR-GARCH) The model of Glosten, Jagannathan and Runkle (1993) is given by
$$\sigma_k^2 = \omega + \beta \sigma_{k-1}^2 + \alpha_1 y_{k-1}^2 I\{y_{k-1} < 0\} + \alpha_2 y_{k-1}^2 I\{y_{k-1} \geq 0\} = \omega + \left[\beta + \alpha_1 \varepsilon_{k-1}^2 I\{\varepsilon_{k-1} < 0\} + \alpha_2 \varepsilon_{k-1}^2 I\{\varepsilon_{k-1} \geq 0\}\right] \sigma_{k-1}^2.$$

Example 1.6 (TSGARCH) This is a modification of Example 1.1 studied by Taylor (1986):
$$\sigma_k = \omega + \beta \sigma_{k-1} + \alpha_1 |y_{k-1}| = \omega + \left[\beta + \alpha_1 |\varepsilon_{k-1}|\right] \sigma_{k-1}.$$

Example 1.7 (PGARCH) Motivated by a Box-Cox transformation of the conditional variance, Carrasco and Chen (2002) extensively studied the properties of the process satisfying
$$\sigma_k^\delta = \omega + \beta \sigma_{k-1}^\delta + \alpha |y_{k-1}|^\delta = \omega + \left[\beta + \alpha |\varepsilon_{k-1}|^\delta\right] \sigma_{k-1}^\delta.$$

Example 1.8 (TGARCH) Example 1.5 is a nonsymmetric version of the GARCH process of Example 1.1. Similarly, a nonsymmetric version of Example 1.7 is
$$\sigma_k^\delta = \omega + \beta \sigma_{k-1}^\delta + \alpha_1 |y_{k-1}|^\delta I\{y_{k-1} < 0\} + \alpha_2 |y_{k-1}|^\delta I\{y_{k-1} \geq 0\} = \omega + \left[\beta + \alpha_1 |\varepsilon_{k-1}|^\delta I\{\varepsilon_{k-1} < 0\} + \alpha_2 |\varepsilon_{k-1}|^\delta I\{\varepsilon_{k-1} \geq 0\}\right] \sigma_{k-1}^\delta.$$
The special case $\delta = 1$ was proposed by Taylor (1986) and Schwert (1989) and includes the threshold model of Zakoian (1994).

In Examples 1.1-1.8, $\Lambda(x) = x^\delta$, so these models are usually referred to as polynomial GARCH. Next, we provide two examples of exponential GARCH.

Example 1.9 (MGARCH) The multiplicative model of Geweke (1986) is defined as
$$\log \sigma_k^2 = \omega + \beta \log \sigma_{k-1}^2 + \alpha \log y_{k-1}^2 = \omega + (\alpha + \beta) \log \sigma_{k-1}^2 + \alpha \log \varepsilon_{k-1}^2.$$

Example 1.10 (EGARCH) The exponential GARCH model of Nelson (1991) is
$$\log \sigma_k^2 = \omega + \beta \log \sigma_{k-1}^2 + \alpha_1 \varepsilon_{k-1} + \alpha_2 |\varepsilon_{k-1}|.$$
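The simulation sketch preceding Example 1.1 covers the exponential models as well, since there $\Lambda(x) = \log x$ and hence $\Lambda^{-1} = \exp$; for instance, Example 1.10 corresponds to the constant coefficient function $c(\varepsilon) = \beta$ and $g(\varepsilon) = \omega + \alpha_1 \varepsilon + \alpha_2 |\varepsilon|$. The numerical values below are again illustrative assumptions.

```python
import numpy as np

# Example 1.10 (EGARCH): Lambda(x) = log x, so Lambda_inv = np.exp;
# c is the constant beta and g collects the innovation terms.
omega, beta, alpha1, alpha2 = -0.1, 0.95, -0.05, 0.1   # illustrative values
y_egarch = simulate_augmented_garch(
    np.exp, lambda e: beta,
    lambda e: omega + alpha1 * e + alpha2 * abs(e),
    n=10_000, rng=0)    # simulate_augmented_garch from the earlier sketch
```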
Straumann and Mikosch (2005) obtained a necessary and sufficient condition for the existence and uniqueness of the solution of the recursive equations in Examples 1.1, 1.2 and 1.10. Their method is based on the theory of iterated functions. Carrasco and Chen (2002) studied the existence of solutions in the general framework of (1.1) and (1.2). Since they were also interested in the mixing properties of the sequences $\{y_k\}$ and $\{\sigma_k^2\}$, they assumed that $|c(0)| < 1$, $E|c(\varepsilon_0)| < \infty$ and $E|g(\varepsilon_0)| < \infty$. We show that these conditions can be weakened to logarithmic moment conditions. The verification of the mixing property of solutions to (1.1) and (1.2) is based on the theory of hidden Markov models and geometric ergodicity, and it is also assumed that $\varepsilon_0$ has a continuous density which is positive on the whole real line. The exponential mixing proved by Carrasco and Chen (2002) yields the weak convergence as well as the approximation of partial sums of $y_i^2$ with Brownian motion. However, for these results to hold true it should not be required to assume either the existence or the smoothness of the density of $\varepsilon_0$.

The main aim of this paper is to provide a general method for obtaining strong invariance principles for partial sums of GARCH-type sequences. To this end, we show that $\{\sigma_k^2\}$ can be approximated by a sequence of random variables $\{\tilde\sigma_k^2\}$ which are defined in such a way that $\tilde\sigma_i^2$ and $\tilde\sigma_j^2$ are independent if the distance $|i - j|$ is appropriately large. This is a generalization of the observation in Berkes and Horváth (2001) that the volatility in GARCH(p, q) models can be approximated with random variables which are blockwise independent. It is also known that the general theory of mixing provides a poor bound for the rate of convergence to Brownian motion. We connect this rate of convergence to the finiteness of moments of $\varepsilon_0$. The general blocking method used throughout the proofs yields strong invariance for the partial sums of the random variables $\{y_k^2 - E y_k^2\}$ with very sharp rates. Our method can, however, also be used to obtain strong approximations for other functions of $y_1, \ldots, y_n$. Additionally, while we focus on processes of order (1,1) for the sake of presentation, adaptations of our methods also work for sequences of arbitrary order (p, q).

Strong approximations of partial sums play an important role, for instance, in deriving results in asymptotic statistics. In particular, it has been pointed out by Horváth and Steinebach (2000) that limit results for so-called CUSUM and MOSUM-type test procedures, which are used to detect mean and variance changes, can be based on strong invariance principles. Theorem 2.4, in particular, might be used to test for the stability of the observed volatilities. In addition, strong approximations can also be used to develop sequential monitoring procedures (cf. Berkes et al. (2004), Aue and Horváth (2004), Aue et al. (2005), and Horváth et al. (2005)).

The paper is organized as follows. First, we discuss the existence of a unique stationary solution of the augmented GARCH model (1.1) and (1.2). We also find conditions for the existence of the moments of $\Lambda(\sigma_0^2)$ (cf. Theorems 2.1-2.3 below). These results will be utilized to provide strong approximations for the partial sums of $\{y_k^2\}$ in Theorem 2.4. The proofs of Theorems 2.1-2.3 are given in Section 4, while the strong approximation of Theorem 2.4 is verified in Section 5. In Section 3, we deal with applications of Theorem 2.4 in the field of change-point analysis.

2 Main results

Solving (1.1) and (1.2) backwards, we get that, for any $N \geq 1$,

(2.1)  $\Lambda(\sigma_k^2) = \Lambda(\sigma_{k-N}^2) \prod_{1 \leq i \leq N} c(\varepsilon_{k-i}) + \sum_{1 \leq i \leq N} g(\varepsilon_{k-i}) \prod_{1 \leq j \leq i-1} c(\varepsilon_{k-j}).$

This suggests that the general solution of (1.1) and (1.2) is given by

(2.2)  $\Lambda(\sigma_k^2) = \sum_{1 \leq i < \infty} g(\varepsilon_{k-i}) \prod_{1 \leq j \leq i-1} c(\varepsilon_{k-j}).$

Our first result gives a necessary and sufficient condition for the existence of the random variable

(2.3)  $X = \sum_{1 \leq i < \infty} g(\varepsilon_{-i}) \prod_{1 \leq j \leq i-1} c(\varepsilon_{-j}).$

Let $\log^+ x = \log(\max\{x, 1\})$.

Theorem 2.1 We assume that (1.1)-(1.4) hold and

(2.4)  $E \log^+ |g(\varepsilon_0)| < \infty$ and $E \log^+ |c(\varepsilon_0)| < \infty$.

(i) If

(2.5)  $E \log |c(\varepsilon_0)| < 0$,

then the sum in (2.3) is absolutely convergent with probability one.

(ii) If

(2.6)  $P\{g(\varepsilon_0) = 0\} < 1$

and the sum in (2.3) is absolutely convergent with probability one, then (2.5) holds.

We note that in (2.5) we allow that $E \log |c(\varepsilon_0)| = -\infty$. The result bears some resemblance to Theorem 1.3 in Bougerol and Picard (1992), who concentrate, however, on positive processes and put more emphasis on the problem of irreducibility. The proof of Theorem 2.1 is given in Section 4.
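For a concrete submodel, condition (2.5) is easy to probe numerically. The sketch below estimates $E \log |c(\varepsilon_0)|$ by Monte Carlo for the GARCH(1,1) coefficient $c(\varepsilon) = \beta + \alpha \varepsilon^2$; the standard normal innovation law and the parameter values are illustrative assumptions. Note that even for $\alpha + \beta = 1$ (the IGARCH case, where $E c(\varepsilon_0) = 1$) the estimate is negative by Jensen's inequality, so a strictly stationary solution still exists.

```python
import numpy as np

def log_moment(c, n_mc=1_000_000, rng=0):
    """Monte Carlo estimate of E log|c(eps_0)|; (2.5) requires this to be < 0."""
    eps = np.random.default_rng(rng).standard_normal(n_mc)
    return np.log(np.abs(c(eps))).mean()

alpha, beta = 0.15, 0.85                              # alpha + beta = 1: E c(eps_0) = 1
print(log_moment(lambda e: beta + alpha * e**2))      # negative: (2.5) holds
```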
Since under stationarity $\Lambda(\sigma_k^2) \stackrel{\mathcal{D}}{=} X$, the next result gives conditions for the existence of $E|X|^\nu$ as well as of $E|\Lambda(\sigma_k^2)|^\nu$.

Theorem 2.2 We assume that (1.1)-(1.4) and (2.5) hold, $\nu > 0$ and

(2.7)  $E|g(\varepsilon_0)|^\nu < \infty$.

(i) If

(2.8)  $E|c(\varepsilon_0)|^\nu < 1$,

then

(2.9)  $E|X|^\nu < \infty$.

(ii) If (2.6) and (2.9) are satisfied and

(2.10)  $c(\varepsilon_0) \geq 0$ and $g(\varepsilon_0) \geq 0$,

then (2.8) holds.

We note that in Examples 1.1-1.8 the left-hand side of the equations defining $\Lambda(\sigma_k^2)$ is always positive, so the right-hand side of those equations must also be positive. This requirement restricts the values of possible parameter choices such that (2.10) always holds in these examples. We also point out that Theorem 2.2 is an extension of the results in Nelson (1990) on the existence of the moments of a GARCH(1,1) sequence.

The next result shows that (2.2) is the only solution of (1.1) and (1.2). We say that a strictly stationary solution of (1.1) and (1.2) is nonanticipative if, for any $k$, the random variable $y_k$ is independent of $\{\varepsilon_i : i > k\}$. According to Brandt (1986), the strictly stationary solution of (1.1) and (1.2) in part (i) of the following theorem is always nonanticipative. The same is true under the assumptions of part (ii) of the same theorem.

Theorem 2.3 We assume that (1.1)-(1.4) and (2.4) are satisfied.

(i) If (2.5) holds, then (2.2) is the only stationary solution of (1.1) and (1.2).

(ii) If in addition (2.6) and (2.10) hold and (1.1), (1.2) has a stationary, nonnegative solution, then (2.5) must be satisfied.

Theorem 2.3 gives, in particular, a necessary and sufficient condition for the existence of polynomial GARCH(1,1) and thus generalizes the result of Nelson (1990). The special case of EGARCH introduced in Example 1.10 is also covered by Straumann and Mikosch (2005). Here, stationarity can be characterized as follows. If $|\beta| < 1$ and $E \log^+ |\alpha_1 \varepsilon_0 + \alpha_2 |\varepsilon_0|| < \infty$, then there is a unique stationary, nonanticipative process satisfying the equation in Example 1.10. If $\beta > 0$ and $P\{\alpha_1 \varepsilon_0 + \alpha_2 |\varepsilon_0| \geq 0\} = 1$, then $\beta < 1$ is necessary and sufficient for the existence of a stationary solution of the EGARCH equations. If we allow anticipative solutions, then (1.1) and (1.2) might have a solution even if (2.5) fails, i.e. if $E \log |c(\varepsilon_0)| \geq 0$ (cf. Aue and Horváth (2005)).

Now, we consider approximations for

$$S(k) = \sum_{1 \leq i \leq k} (y_i^2 - E y_i^2).$$

Towards this end, we need that $\Lambda^{-1}(x)$ is a smooth function. We assume that

(2.11)  $\Lambda(\sigma_0^2) \geq \omega$ with some $\omega > 0$, $\Lambda'$ exists and is nonnegative,

and there are $C, \gamma$ such that

(2.12)  $\frac{1}{\Lambda'(\Lambda^{-1}(x))} \leq C x^\gamma$ for all $x \geq \omega$.

Assumptions (2.11) and (2.12) hold for the processes in Examples 1.1-1.8, but fail for the remaining Examples 1.9 and 1.10. Thus, more restrictive moment conditions are needed in the exponential GARCH case (cf. Theorem 2.4(ii) below). Set

(2.13)  $\bar\sigma^2 = \operatorname{Var} y_0^2 + 2 \sum_{1 \leq i < \infty} \operatorname{Cov}(y_0^2, y_i^2).$

Theorem 2.4 We assume that (1.1)-(1.4) and (2.5) hold, $\bar\sigma > 0$ and

(2.14)  $E|\varepsilon_0|^{8+\delta} < \infty$ with some $\delta > 0$.

(i) If (2.11) and (2.12) are satisfied and (2.7) holds for some $\nu > 4(1 + \max\{0, \gamma\})$, then there is a Wiener process $\{W(t) : t \geq 0\}$ such that

(2.15)  $\sum_{1 \leq i \leq n} (y_i^2 - E y_i^2) - \bar\sigma W(n) = o\left(n^{5/12+\varepsilon}\right)$ a.s.

for any $\varepsilon > 0$.

(ii) If $\Lambda(x) = \log x$, we assume that $E \exp(t|g(\varepsilon_0)|)$ exists for some $t > 4$ and that

(2.16)  there is $0 < c < 1$ such that $|c(x)| < c$;

then (2.15) holds.

The method of the proof of Theorem 2.4 can be used to establish strong approximations for sums of functionals of the $y_i$, too. For example, approximations can be derived for $\sum_{1 \leq i \leq n} (y_i - E y_i)$, $\sum_{1 \leq i \leq n} (y_i y_{i+k} - E[y_i y_{i+k}])$ and $\sum_{1 \leq i \leq n} (y_i^2 y_{i+k}^2 - E[y_i^2 y_{i+k}^2])$, where $k$ is a fixed integer. Examples of strong approximations for dependent sequences other than augmented GARCH processes may be found in Eberlein (1986) and Kuelbs and Philipp (1980). While we are using a blocking argument, the methods in the aforementioned papers are based on examining conditional expectations and variances, and mixing properties, respectively.
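In the same Monte Carlo spirit as the check of (2.5) above, the moment condition (2.8) of Theorem 2.2 can be probed numerically before invoking Theorem 2.4(i), which requires (2.7) for some $\nu > 4(1 + \max\{0, \gamma\})$; for standard GARCH, $\Lambda(x) = x$ gives $\gamma = 0$, so $\nu > 4$ is needed. The parameter values and the normal innovation law below are illustrative assumptions.

```python
import numpy as np

def c_moment(c, nu, n_mc=1_000_000, rng=0):
    """Monte Carlo estimate of E|c(eps_0)|**nu; condition (2.8) asks for < 1."""
    eps = np.random.default_rng(rng).standard_normal(n_mc)
    return np.mean(np.abs(c(eps)) ** nu)

alpha, beta = 0.05, 0.80                      # illustrative GARCH(1,1) parameters
for nu in (2, 4, 5):
    print(nu, c_moment(lambda e: beta + alpha * e**2, nu))  # all well below 1 here
```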
3 Applications

In this section, we illustrate the usefulness of Theorem 2.4 with some applications coming from change-point analysis. One of the main concerns in econometrics is to decide whether the volatility of the underlying variables is stable over time or whether it changes in the observation period. While the methods are, in general, applicable to larger classes of random processes, we will focus on augmented GARCH sequences here. To accommodate the setting, we assume that the volatilities $\{\sigma_k^2\}$ follow a common pattern until an unknown time $k^*$, called the change-point. After $k^*$ a different regime takes over. It will be demonstrated in the following that, using Theorem 2.4, some of the basic detection methods can be applied to the observed volatilities $\{y_k^2\}$.

Application 3.1 The popular CUSUM procedure is based on the test statistic

$$T_n = \frac{1}{\sqrt{n}\,\bar\sigma} \max_{1 \leq k \leq n} \left| \sum_{1 \leq i \leq k} y_i^2 - \frac{k}{n} \sum_{1 \leq i \leq n} y_i^2 \right|.$$

Under the assumption of no change in the volatility it holds that

(3.17)  $T_n \stackrel{\mathcal{D}}{\longrightarrow} \sup_{0 \leq t \leq 1} |B(t)|$,

where $\{B(t) : 0 \leq t \leq 1\}$ denotes a Brownian bridge. Theorem 2.4 not only gives an alternative way to prove (3.17), but also provides the strong upper bound $O(n^{-1/12+\varepsilon})$ for the rate of convergence.
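A direct implementation of $T_n$ is straightforward; in this sketch $\bar\sigma$ is treated as known (its estimation is discussed at the end of this section), and 1.358 is the familiar 5% critical value of $\sup_{0 \leq t \leq 1} |B(t)|$, i.e. of the Kolmogorov distribution.

```python
import numpy as np

def cusum_statistic(y, sigma_bar):
    """T_n from Application 3.1: scaled maximal CUSUM of the squares y_i^2."""
    y_sq = np.asarray(y) ** 2
    n = len(y_sq)
    partial = np.cumsum(y_sq)                  # sum_{1<=i<=k} y_i^2
    k = np.arange(1, n + 1)
    return np.abs(partial - (k / n) * partial[-1]).max() / (np.sqrt(n) * sigma_bar)

# Under no change, (3.17) gives P(T_n > 1.358) -> 0.05, so values above 1.358
# indicate instability of the observed volatilities at the 5% level.
```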
Application 3.2 A modification of Application 3.1 is given by the so-called MOSUM procedure. (Therein, MOSUM is the abbreviation for moving sum.) Its basic properties have been collected in Csörgő and Horváth (1997, p. 181). In our case, the test statistic becomes

$$V_n = \frac{1}{\sqrt{a(n)}\,\bar\sigma} \max_{1 \leq k \leq n-a(n)} \left| \sum_{k < i \leq k+a(n)} y_i^2 - \frac{a(n)}{n} \sum_{1 \leq i \leq n} y_i^2 \right|.$$

Let, in addition to the assumptions of Theorem 2.4, the conditions

(3.18)  $\frac{n^{5/12+\varepsilon} \sqrt{\log(n/a(n))}}{\sqrt{a(n)}} = O(1)$ for some $\varepsilon > 0$ and $a(n)/n \to 0$, as $n \to \infty$,

be satisfied. Then, we obtain the following extreme value asymptotic for $V_n$:

(3.19)  $\lim_{n \to \infty} P\left\{ A\left(\frac{n}{a(n)}\right) V_n \leq t + D\left(\frac{n}{a(n)}\right) \right\} = \exp(-e^{-t})$ for all $t$,

where

$$A(x) = \sqrt{2 \log x} \qquad \text{and} \qquad D(x) = 2 \log x + \frac{1}{2} \log \log x - \frac{1}{2} \log \pi.$$

We shall give a proof of (3.19) now. Indeed, on using Theorem 2.4, we obtain

(3.20)  $V_n = \frac{1}{\sqrt{a(n)}} \max_{1 \leq k \leq n-a(n)} \left| W(k + a(n)) - W(k) - \frac{a(n)}{n} W(n) \right| + O_P\left(\frac{n^{5/12+\varepsilon}}{\sqrt{a(n)}}\right)$

as $n \to \infty$. The moduli of continuity of the Wiener process (cf. Csörgő and Révész (1981, p. 30)) yield, as $n \to \infty$,

(3.21)  $\frac{1}{\sqrt{a(n)}} \max_{1 \leq k \leq n-a(n)} |W(k + a(n)) - W(k)| = \frac{1}{\sqrt{a(n)}} \sup_{0 \leq t \leq n-a(n)} |W(t + a(n)) - W(t)| + O_P\left(\sqrt{\frac{\log n}{a(n)}}\right).$

By the scale transformation we obtain for the functional of the Wiener process on the right-hand side of the latter equation

(3.22)  $\frac{1}{\sqrt{a(n)}} \sup_{0 \leq t \leq n-a(n)} |W(t + a(n)) - W(t)| \stackrel{\mathcal{D}}{=} \sup_{0 \leq s \leq n/a(n)-1} |W(s + 1) - W(s)|.$

Hence, Theorem 7.2.4 of Révész (1990, p. 72) gives the following extreme value asymptotic for the Wiener process. It holds that

(3.23)  $\lim_{n \to \infty} P\left\{ A\left(\frac{n}{a(n)}\right) \sup_{0 \leq s \leq n/a(n)-1} |W(s + 1) - W(s)| \leq t + D\left(\frac{n}{a(n)}\right) \right\} = \exp(-e^{-t}).$

On combining (3.20)-(3.22) with condition (3.18), we arrive at

$$A\left(\frac{n}{a(n)}\right) V_n \stackrel{\mathcal{D}}{=} A\left(\frac{n}{a(n)}\right) \sup_{0 \leq s \leq n/a(n)-1} |W(s + 1) - W(s)| + o_P(1),$$

so that (3.19) follows from (3.23).

Application 3.3 It has been pointed out by Csörgő and Horváth (1997) that weighted versions of the CUSUM statistics have better power for detecting changes that occur either very early or very late in the observation period. Therefore, $T_n$ from Application 3.1 can be transformed to the weighted statistic

$$T_{n,\alpha} = \frac{1}{\sqrt{n}\,\bar\sigma} \max_{1 \leq k \leq n} \left(\frac{n}{k}\right)^\alpha \left| \sum_{1 \leq i \leq k} y_i^2 - \frac{k}{n} \sum_{1 \leq i \leq n} y_i^2 \right|,$$

where $0 \leq \alpha < 1/2$. Under the assumptions of Theorem 2.4 it holds that

(3.24)  $T_{n,\alpha} \stackrel{\mathcal{D}}{\longrightarrow} \sup_{0 < t < 1} \frac{|B(t)|}{t^\alpha}$  $(n \to \infty)$,

where $\{B(t) : 0 \leq t \leq 1\}$ is a Brownian bridge. Theorem 2.4 and the moduli of continuity of the Wiener process (cf. Csörgő and Révész (1981, p. 30)) imply that

(3.25)  $\left| \sum_{1 \leq i \leq t} (y_i^2 - E y_0^2) - \bar\sigma W(t) \right| = o\left(t^{5/12+\varepsilon}\right)$ a.s.

as $t \to \infty$. Hence,

$$\frac{1}{\sqrt{n}} \sup_{1/n \leq t \leq 1} \frac{1}{t^\alpha} \left| \sum_{1 \leq i \leq nt} (y_i^2 - E y_0^2) - \bar\sigma W(nt) \right| = O_P(1) \sup_{1/n \leq t \leq 1} \frac{(nt)^{5/12+\varepsilon}}{\sqrt{n}\, t^\alpha} = \begin{cases} O_P\left(n^{-1/12+\varepsilon}\right), & \text{if } \alpha \leq 5/12 + \varepsilon, \\ O_P\left(n^{\alpha-1/2}\right), & \text{if } 5/12 + \varepsilon < \alpha < 1/2, \end{cases}$$

giving that

$$T_{n,\alpha} \stackrel{\mathcal{D}}{=} \sup_{1/n \leq t \leq 1} \frac{1}{t^\alpha} |W(t) - t W(1)| + o_P(1).$$

By the almost sure continuity of $t^{-\alpha} W(t)$, we obtain

$$\sup_{1/n \leq t \leq 1} \frac{1}{t^\alpha} |W(t) - t W(1)| \longrightarrow \sup_{0 < t \leq 1} \frac{1}{t^\alpha} |W(t) - t W(1)| \quad \text{a.s.}$$

Since $\{W(t) - t W(1) : 0 \leq t \leq 1\}$ and $\{B(t) : 0 \leq t \leq 1\}$ have the same distribution, (3.24) holds.

Application 3.4 Instead of the weighted sup-norm, weighted $L^2$ functionals can also be used. Let

$$Z_n = \frac{1}{n \bar\sigma^2} \int_0^1 \frac{1}{t} \left( \sum_{1 \leq i \leq nt} y_i^2 - t \sum_{1 \leq i \leq n} y_i^2 \right)^2 dt.$$

Under the conditions of Theorem 2.4 we have that

(3.26)  $Z_n \stackrel{\mathcal{D}}{\longrightarrow} \int_0^1 \frac{B^2(t)}{t}\, dt$  $(n \to \infty)$,

where $\{B(t) : 0 \leq t \leq 1\}$ denotes a Brownian bridge. Using (3.25), we conclude

$$Z_n = \frac{1}{n} \int_0^1 \frac{1}{t} \left( W(nt) - t W(n) \right)^2 dt + O_P\left(n^{-1/12+\varepsilon}\right).$$

So, with $B(t) = n^{-1/2}(W(nt) - t W(n))$, we immediately get (3.26).

For on-line monitoring procedures designed to detect changes in volatility, confer Aue et al. (2005). All applications contain the parameter $\bar\sigma$, which is in general unknown and therefore has to be estimated from the sample data $y_1, \ldots, y_n$. This can be achieved by applying the method developed by Lo (1991). For further results concerning the estimation of the variance of sums of dependent random variables we refer to Giraitis et al. (2003) and Berkes et al. (2005).
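For completeness, one common way to estimate $\bar\sigma^2$ of (2.13) is a Bartlett-kernel (Newey-West-type) long-run variance estimator applied to the squared observations; the sketch below is only one of several possibilities covered by the references above, and the bandwidth rule is an illustrative assumption.

```python
import numpy as np

def long_run_variance(y_sq, bandwidth):
    """Bartlett-kernel estimate of Var y_0^2 + 2 * sum_k Cov(y_0^2, y_k^2), cf. (2.13)."""
    x = np.asarray(y_sq, dtype=float)
    x = x - x.mean()
    n = len(x)
    lrv = np.dot(x, x) / n                         # lag-0 autocovariance
    for k in range(1, bandwidth + 1):
        gamma_k = np.dot(x[:-k], x[k:]) / n        # empirical autocovariance at lag k
        lrv += 2.0 * (1.0 - k / (bandwidth + 1)) * gamma_k
    return lrv

# sigma_bar_hat = np.sqrt(long_run_variance(y**2, bandwidth=int(len(y) ** (1/3))))
```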
4 Proofs of Theorems 2.1-2.3

Proof of Theorem 2.1. Part (i) is an immediate consequence of Brandt (1986). The first step in the proof of (ii) is the observation that by (2.6) there is a constant $a > 0$ such that $P\{|g(\varepsilon_0)| > a\} > 0$. Define the events

$$A_i = \left\{ |g(\varepsilon_{-i})| \geq a, \ \prod_{1 \leq j \leq i-1} |c(\varepsilon_{-j})| \geq 1 \right\}, \qquad i \geq 1.$$

We introduce the $\sigma$-algebras

(4.1)  $\mathcal{F}_0 = \{\emptyset, \Omega\}$ and $\mathcal{F}_i = \sigma(\{\varepsilon_j : -i \leq j \leq 0\})$, $i \geq 1$.

Clearly, $A_i \in \mathcal{F}_i$ for all $i \geq 1$. Also, by standard arguments,

$$\sum_{1 \leq i < \infty} P\{A_i \mid \mathcal{F}_{i-1}\} = \sum_{1 \leq i < \infty} P\{|g(\varepsilon_{-i})| \geq a\}\, I\left\{ \prod_{1 \leq j \leq i-1} |c(\varepsilon_{-j})| \geq 1 \right\} = P\{|g(\varepsilon_0)| \geq a\} \sum_{1 \leq i < \infty} I\left\{ \sum_{1 \leq j \leq i-1} \log |c(\varepsilon_{-j})| \geq 0 \right\}.$$

We claim that

$$\sum_{1 \leq i < \infty} I\left\{ \sum_{1 \leq j \leq i-1} \log |c(\varepsilon_{-j})| \geq 0 \right\} = \infty \quad \text{a.s.}$$

If $E \log |c(\varepsilon_0)| > 0$, then this follows from the strong law of large numbers, while the Chung-Fuchs law (cf. Chow and Teicher (1988, p. 148)) applies if $E \log |c(\varepsilon_0)| = 0$. Hence, Corollary 3.2 of Durrett (1991, p. 240) gives $P\{A_i \text{ infinitely often}\} = 1$, and therefore

$$P\left\{ \lim_{i \to \infty} g(\varepsilon_{-i}) \prod_{1 \leq j \leq i-1} c(\varepsilon_{-j}) = 0 \right\} = 0.$$

This contradicts that the sum in (2.3) is finite with probability one. □

Proof of Theorem 2.2. If $\nu \geq 1$, then by Minkowski's inequality we have

$$\left( E|X|^\nu \right)^{1/\nu} \leq \sum_{1 \leq i < \infty} \left( E\left| g(\varepsilon_{-i}) \prod_{1 \leq j \leq i-1} c(\varepsilon_{-j}) \right|^\nu \right)^{1/\nu} = \left( E|g(\varepsilon_0)|^\nu \right)^{1/\nu} \sum_{1 \leq i < \infty} \left[ E|c(\varepsilon_0)|^\nu \right]^{(i-1)/\nu} < \infty$$

by assumptions (2.7) and (2.8). If $0 < \nu < 1$, then (cf. Hardy, Littlewood and Pólya (1959))

$$E|X|^\nu \leq \sum_{1 \leq i < \infty} E\left| g(\varepsilon_{-i}) \prod_{1 \leq j \leq i-1} c(\varepsilon_{-j}) \right|^\nu = E|g(\varepsilon_0)|^\nu \sum_{1 \leq i < \infty} \left[ E|c(\varepsilon_0)|^\nu \right]^{i-1} < \infty,$$

completing the proof of the first part of Theorem 2.2. If $\nu \geq 1$, then by the lower bound in the Minkowski inequality (cf. Hardy, Littlewood and Pólya (1959)) and (2.10) we get

$$E|X|^\nu \geq \sum_{1 \leq i < \infty} E\left[ g(\varepsilon_{-i}) \prod_{1 \leq j \leq i-1} c(\varepsilon_{-j}) \right]^\nu = E[g(\varepsilon_0)]^\nu \sum_{1 \leq i < \infty} \left( E[c(\varepsilon_0)]^\nu \right)^{i-1}.$$

Since by (2.6) and (2.7) we have $0 < E[g(\varepsilon_0)]^\nu < \infty$, we get (2.8). Similarly, if $0 < \nu < 1$, then

$$\left( E|X|^\nu \right)^{1/\nu} \geq \sum_{1 \leq i < \infty} \left( E\left[ g(\varepsilon_{-i}) \prod_{1 \leq j \leq i-1} c(\varepsilon_{-j}) \right]^\nu \right)^{1/\nu} = \left( E[g(\varepsilon_0)]^\nu \right)^{1/\nu} \sum_{1 \leq i < \infty} \left( E[c(\varepsilon_0)]^\nu \right)^{(i-1)/\nu},$$

and therefore (2.8) holds. □

Proof of Theorem 2.3. The first part is an immediate consequence of Brandt (1986). The second part is based on (2.1) with $k = 0$. Since by (2.10) all terms in (2.1) are nonnegative, we get for all $N \geq 1$ that

(4.2)  $\Lambda(\sigma_0^2) \geq \sum_{1 \leq i \leq N} g(\varepsilon_{-i}) \prod_{1 \leq j \leq i-1} c(\varepsilon_{-j}).$

If $E \log c(\varepsilon_0) \geq 0$, then the proof of Theorem 2.1(ii) gives that

$$\limsup_{i \to \infty} g(\varepsilon_{-i}) \prod_{1 \leq j \leq i-1} c(\varepsilon_{-j}) \geq a$$

with some $a > 0$. So by (2.10) we obtain that

$$\lim_{N \to \infty} \sum_{1 \leq i \leq N} g(\varepsilon_{-i}) \prod_{1 \leq j \leq i-1} c(\varepsilon_{-j}) = \infty \quad \text{a.s.},$$

contradicting (4.2). □

5 Proof of Theorem 2.4

Since the proof of Theorem 2.4 is divided into a series of lemmas, we first outline its structure. The main ideas can be highlighted as follows.

• The first goal of the proof is to show that the partial sums of $\{y_k^2\}$ can be approximated by the partial sums of a sequence of random variables $\{\tilde y_k^2\}$ which are independent at sufficiently large lags. To do so, we establish exponential inequalities for the distance between the volatilities $\{\sigma_k^2\}$ and a suitably constructed sequence $\{\tilde\sigma_k^2\}$. Consequently, letting $\tilde y_k = \tilde\sigma_k \varepsilon_k$, it suffices to find an approximation for the partial sums $\sum_{k=1}^n (\tilde y_k^2 - E\tilde y_k^2)$. Detailed proofs are given in Lemmas 5.1-5.4 below.

• Set $\tilde\xi_k = \tilde y_k^2 - E\tilde y_k^2$. The partial sum $\sum_{k=1}^n \tilde\xi_k$ will be blocked into random variables belonging to two subgroups. The random variables associated with each subgroup form an independent sequence, which allows the application of the following simplified version of Theorem 4.4 in Strassen (1965, p. 334).

Theorem A Let $\{Z_i\}$ be a sequence of independent centered random variables and let $0 < \kappa < 1$. If

$$r(N) = \sum_{1 \leq i \leq N} E Z_i^2 \longrightarrow \infty \quad (N \to \infty) \qquad \text{and} \qquad \sum_{1 \leq N < \infty} r(N)^{-2\kappa} E Z_N^4 < \infty,$$

then there is a Wiener process $\{W(t) : t \geq 0\}$ such that

$$\sum_{1 \leq i \leq N} Z_i - W(r(N)) = o\left( r(N)^{(\kappa+1)/4} \log r(N) \right) \quad \text{a.s.} \quad (N \to \infty).$$

Lemmas 5.5-5.9 help to prove that the assumptions of Theorem A are satisfied, while the actual approximations are given in Lemmas 5.10-5.12.

Let $0 < \rho < 1$ and define $\tilde\sigma_k^2$ as the solution of

$$\Lambda(\tilde\sigma_k^2) = \sum_{1 \leq i \leq k^\rho} g(\varepsilon_{k-i}) \prod_{1 \leq j \leq i-1} c(\varepsilon_{k-j}).$$
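The effect of the truncation at lag $k^\rho$ can be seen numerically: the sketch below evaluates the defining series of $\Lambda(\tilde\sigma_k^2)$ and a long-series proxy for $\Lambda(\sigma_k^2)$ for a GARCH(1,1) specification, illustrating the exponential closeness established in Lemma 5.1 below; the choices of $\rho$, the parameters and the innovation law are illustrative assumptions.

```python
import numpy as np

def lam_series(eps, k, n_terms, c, g):
    """sum_{1<=i<=n_terms} g(eps[k-i]) * prod_{1<=j<=i-1} c(eps[k-j])."""
    total, prod = 0.0, 1.0
    for i in range(1, n_terms + 1):
        total += g(eps[k - i]) * prod
        prod *= c(eps[k - i])
    return total

rho, omega, alpha, beta = 0.5, 0.1, 0.1, 0.8             # illustrative choices
eps = np.random.default_rng(1).standard_normal(5000)
c, g = (lambda e: beta + alpha * e**2), (lambda e: omega)
for k in (100, 1000, 4000):
    full = lam_series(eps, k, k, c, g)                   # proxy for Lambda(sigma_k^2)
    trunc = lam_series(eps, k, int(k**rho), c, g)        # Lambda(sigma_tilde_k^2)
    print(k, abs(full - trunc))                          # decays like exp(-C * k**rho)
```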
First, we obtain an exponential inequality for $\Lambda(\sigma_k^2) - \Lambda(\tilde\sigma_k^2)$.

Lemma 5.1 If (1.1)-(1.4), (2.5), (2.7) and (2.8) hold, then there are constants $C_{1,1}$, $C_{1,2}$ and $C_{1,3}$ such that

$$P\left\{ |\Lambda(\sigma_k^2) - \Lambda(\tilde\sigma_k^2)| \geq \exp(-C_{1,1} k^\rho) \right\} \leq C_{1,2} \exp(-C_{1,3} k^\rho).$$

Proof. Observe that

$$\Lambda(\sigma_k^2) - \Lambda(\tilde\sigma_k^2) = \sum_{k^\rho < i < \infty} g(\varepsilon_{k-i}) \prod_{1 \leq j \leq i-1} c(\varepsilon_{k-j}) = \prod_{1 \leq i \leq k^\rho} c(\varepsilon_{k-i}) \, \Lambda(\sigma_{k-k^\rho}^2).$$

Using Theorem 2.7 of Petrov (1995) yields

$$P\left\{ \prod_{1 \leq i \leq k^\rho} |c(\varepsilon_{k-i})| \geq \exp(-C_{1,4} k^\rho) \right\} = P\left\{ \sum_{1 \leq i \leq k^\rho} \left( \log |c(\varepsilon_{k-i})| - a \right) \geq -(C_{1,4} + a) k^\rho \right\} \leq 2 \exp(-C_{1,5} k^\rho),$$

where $E \log |c(\varepsilon_0)| = a < 0$ and $0 < C_{1,4} < -a$. Theorem 2.2 implies

$$P\left\{ |\Lambda(\sigma_k^2) - \Lambda(\tilde\sigma_k^2)| \geq \exp\left(-\tfrac{C_{1,4}}{2} k^\rho\right) \right\} \leq P\left\{ \prod_{1 \leq i \leq k^\rho} |c(\varepsilon_{k-i})| \geq \exp(-C_{1,4} k^\rho) \right\} + P\left\{ |\Lambda(\sigma_{k-k^\rho}^2)| \geq \exp\left(\tfrac{C_{1,4}}{2} k^\rho\right) \right\} \leq 2 \exp(-C_{1,5} k^\rho) + \exp\left(-\tfrac{\nu C_{1,4}}{2} k^\rho\right) E|\Lambda(\sigma_0^2)|^\nu \leq C_{1,6} \exp(-C_{1,7} k^\rho).$$

This completes the proof. □

Lemma 5.2 If (1.1)-(1.4), (2.5), (2.7), (2.8), (2.11) and (2.12) hold, then there are constants $C_{2,1}$, $C_{2,2}$ and $C_{2,3}$ such that

$$P\left\{ |\sigma_k^2 - \tilde\sigma_k^2| \geq \exp(-C_{2,1} k^\rho) \right\} \leq C_{2,2} \exp(-C_{2,3} k^\rho).$$

Proof. By the mean value theorem,

$$\sigma_k^2 - \tilde\sigma_k^2 = \Lambda^{-1}(\Lambda(\sigma_k^2)) - \Lambda^{-1}(\Lambda(\tilde\sigma_k^2)) = \frac{1}{\Lambda'(\Lambda^{-1}(\xi_k))} \left( \Lambda(\sigma_k^2) - \Lambda(\tilde\sigma_k^2) \right),$$

where $\xi_k$ is between $\Lambda(\sigma_k^2)$ and $\Lambda(\tilde\sigma_k^2)$. By (2.12),

(5.1)  $|\sigma_k^2 - \tilde\sigma_k^2| \leq C \left( |\Lambda(\sigma_k^2)|^\gamma + |\Lambda(\tilde\sigma_k^2)|^\gamma \right) \left| \Lambda(\sigma_k^2) - \Lambda(\tilde\sigma_k^2) \right|.$

If $\gamma \leq 0$, then Lemma 5.2 follows from (2.11) in combination with Lemma 5.1. If $\gamma > 0$, then Theorem 2.2 and Lemma 5.1 yield

$$P\left\{ \left( |\Lambda(\sigma_k^2)|^\gamma + |\Lambda(\tilde\sigma_k^2)|^\gamma \right) \left| \Lambda(\sigma_k^2) - \Lambda(\tilde\sigma_k^2) \right| \geq \exp\left(-\tfrac{C_{1,1}}{2} k^\rho\right) \right\} \leq P\left\{ \left| \Lambda(\sigma_k^2) - \Lambda(\tilde\sigma_k^2) \right| \geq \exp(-C_{1,1} k^\rho) \right\} + P\left\{ |\Lambda(\sigma_k^2)|^\gamma \geq \tfrac{1}{2} \exp\left(\tfrac{C_{1,1}}{2} k^\rho\right) \right\} + P\left\{ |\Lambda(\tilde\sigma_k^2)|^\gamma \geq \tfrac{1}{2} \exp\left(\tfrac{C_{1,1}}{2} k^\rho\right) \right\} \leq C_{2,4} \exp(-C_{2,5} k^\rho),$$

on account of Lemma 5.1 and the Markov inequality applied with the moments $E|\Lambda(\sigma_k^2)|^\nu$ and $E|\Lambda(\tilde\sigma_k^2)|^\nu$ and exponent $\nu/\gamma$, finishing the proof. □

Lemma 5.3 If (1.1)-(1.4), (2.5), (2.7), (2.8) hold and $\Lambda(x) = \log x$, then there are constants $C_{3,1}$ and $C_{3,2}$ such that

$$P\left\{ |\sigma_k^2 - \tilde\sigma_k^2| \geq \exp(-C_{3,1} k^\rho) \right\} \leq C_{3,2} k^{-\rho\nu}.$$

Proof. By the mean value theorem, $\sigma_k^2 - \tilde\sigma_k^2 = e^{\log \sigma_k^2} - e^{\log \tilde\sigma_k^2} = e^{\xi_k} (\log \sigma_k^2 - \log \tilde\sigma_k^2)$, where $\xi_k$ is between $\log \sigma_k^2$ and $\log \tilde\sigma_k^2$. Hence,

(5.2)  $|\sigma_k^2 - \tilde\sigma_k^2| \leq \left( e^{\log \sigma_k^2} + e^{\log \tilde\sigma_k^2} \right) |\log \sigma_k^2 - \log \tilde\sigma_k^2|.$

On the set $\{-1 < \log \sigma_k^2 - \log \tilde\sigma_k^2 < 1\}$, we have $\exp(\log \tilde\sigma_k^2) \leq 3 \exp(\log \sigma_k^2)$, so by Theorem 2.2 and Lemma 5.1 we obtain

$$P\left\{ |\sigma_k^2 - \tilde\sigma_k^2| \geq \exp\left(-\tfrac{C_{1,1}}{2} k^\rho\right) \right\} \leq P\left\{ 3 e^{\log \sigma_k^2} |\log \sigma_k^2 - \log \tilde\sigma_k^2| \geq \exp\left(-\tfrac{C_{1,1}}{2} k^\rho\right) \right\} + C_{1,2} \exp(-C_{1,3} k^\rho)$$
$$\leq P\left\{ |\log \sigma_k^2 - \log \tilde\sigma_k^2| \geq \exp(-C_{1,1} k^\rho) \right\} + P\left\{ 3 e^{\log \sigma_k^2} \geq \exp\left(\tfrac{C_{1,1}}{2} k^\rho\right) \right\} + C_{1,2} \exp(-C_{1,3} k^\rho)$$
$$\leq 2 C_{1,2} \exp(-C_{1,3} k^\rho) + P\left\{ |\log \sigma_k^2|^\nu \geq \left( \tfrac{C_{1,1}}{2} k^\rho - \log 3 \right)^\nu \right\} \leq C_{3,3} k^{-\rho\nu},$$

using the Markov inequality with $E|\log \sigma_k^2|^\nu$ in the last step. This completes the proof. □

Let $\tilde y_i = \tilde\sigma_i \varepsilon_i$, $-\infty < i < \infty$. It is clear that $\tilde y_i$ and $\tilde y_j$ are independent if $i < j - j^\rho$. The next result shows that it is enough to approximate the sum of the $\tilde y_i^2$.

Lemma 5.4 If the conditions of Theorem 2.4 are satisfied, then

$$\max_{1 \leq k < \infty} \left| \sum_{1 \leq i \leq k} y_i^2 - \sum_{1 \leq i \leq k} \tilde y_i^2 \right| = O(1) \quad \text{a.s.}$$

Proof. Condition (2.14) implies $|\varepsilon_k| = o(k^{1/(8+\delta)})$ a.s. as $k \to \infty$. By Lemmas 5.2 and 5.3 we get $|\sigma_k^2 - \tilde\sigma_k^2| = O(\exp(-C_{4,1} k^\rho))$ a.s. Thus we conclude

$$\max_{1 \leq k < \infty} \left| \sum_{1 \leq i \leq k} y_i^2 - \sum_{1 \leq i \leq k} \tilde y_i^2 \right| \leq \sum_{1 \leq i < \infty} |y_i^2 - \tilde y_i^2| = \sum_{1 \leq i < \infty} |\sigma_i^2 - \tilde\sigma_i^2| \varepsilon_i^2 = O(1) \sum_{1 \leq i < \infty} i^{2/(8+\delta)} \exp(-C_{4,1} i^\rho) = O(1) \quad \text{a.s.},$$

and the proof is complete. □

We note that under the conditions of Theorem 2.4 we have

(5.3)  $P\left\{ |\sigma_k^2 - \tilde\sigma_k^2| \geq \exp(-C_{4,2} k^\rho) \right\} \leq C_{4,3} k^{-\mu},$

where $\mu = \rho\nu$ when $\Lambda(x) = \log x$, while $\mu$ can be chosen arbitrarily when (2.11) and (2.12) hold. In both cases $\mu$ can be chosen arbitrarily large.

Lemma 5.5 If the conditions of Theorem 2.4 are satisfied, then

(5.4)  $E(\tilde\sigma_k^2)^{4+\delta} = O(1)$ with some $\delta > 0$

and

$$E(\sigma_k^2 - \tilde\sigma_k^2)^2 = O\left(k^{-\mu/2}\right),$$

where $\mu$ is defined in (5.3).

Proof. We first show that $E|\sigma_k^2 - \tilde\sigma_k^2|^{4+\delta} = O(1)$. Assume first that (2.11) and (2.12) are satisfied. Using (5.1), we get in the case $\gamma > 0$ that

$$|\sigma_k^2 - \tilde\sigma_k^2|^{4+\delta} \leq C \left( |\Lambda(\sigma_k^2)|^{(4+\delta)(1+\gamma)} + |\Lambda(\tilde\sigma_k^2)|^{(4+\delta)(1+\gamma)} + |\Lambda(\sigma_k^2)|^{(4+\delta)\gamma} |\Lambda(\tilde\sigma_k^2)|^{4+\delta} + |\Lambda(\sigma_k^2)|^{4+\delta} |\Lambda(\tilde\sigma_k^2)|^{(4+\delta)\gamma} \right).$$

It follows from our assumptions that $E|\Lambda(\sigma_k^2)|^{(4+\delta)(1+\gamma)} \leq C_{5,1}$. Applying the definition of $\tilde\sigma_k^2$, we obtain from the proof of Theorem 2.2

$$E|\Lambda(\tilde\sigma_k^2)|^{(4+\delta)(1+\gamma)} \leq E\left( \sum_{1 \leq i \leq k^\rho} |g(\varepsilon_{k-i})| \prod_{1 \leq j \leq i-1} |c(\varepsilon_{k-j})| \right)^{(4+\delta)(1+\gamma)} \leq E\left( \sum_{1 \leq i < \infty} |g(\varepsilon_{k-i})| \prod_{1 \leq j \leq i-1} |c(\varepsilon_{k-j})| \right)^{(4+\delta)(1+\gamma)} \leq C_{5,2}.$$

Similarly,

$$|\Lambda(\sigma_k^2)|^{(4+\delta)\gamma} |\Lambda(\tilde\sigma_k^2)|^{4+\delta} \leq \left( \sum_{1 \leq i < \infty} |g(\varepsilon_{k-i})| \prod_{1 \leq j \leq i-1} |c(\varepsilon_{k-j})| \right)^{(4+\delta)\gamma} \left( \sum_{1 \leq i \leq k^\rho} |g(\varepsilon_{k-i})| \prod_{1 \leq j \leq i-1} |c(\varepsilon_{k-j})| \right)^{4+\delta} \leq \left( \sum_{1 \leq i < \infty} |g(\varepsilon_{k-i})| \prod_{1 \leq j \leq i-1} |c(\varepsilon_{k-j})| \right)^{(4+\delta)(1+\gamma)},$$

and therefore $E|\Lambda(\sigma_k^2)|^{(4+\delta)\gamma} |\Lambda(\tilde\sigma_k^2)|^{4+\delta} \leq C_{5,3}$. The same argument implies in a similar way $E|\Lambda(\sigma_k^2)|^{4+\delta} |\Lambda(\tilde\sigma_k^2)|^{(4+\delta)\gamma} \leq C_{5,4}$. If $\gamma \leq 0$, then (5.1) and (2.11) yield

$$|\sigma_k^2 - \tilde\sigma_k^2|^{4+\delta} \leq C_{5,5} \left| \Lambda(\sigma_k^2) - \Lambda(\tilde\sigma_k^2) \right|^{4+\delta},$$

and, similarly as before, $E|\Lambda(\sigma_k^2)|^{4+\delta} \leq C_{5,6}$ as well as $E|\Lambda(\tilde\sigma_k^2)|^{4+\delta} \leq C_{5,6}$.

If $\Lambda(x) = \log x$, then by (5.2) we have

$$|\sigma_k^2 - \tilde\sigma_k^2|^{4+\delta} \leq C_{5,7} \left( e^{(4+\delta) \log \sigma_k^2} + e^{(4+\delta) \log \tilde\sigma_k^2} \right) \left( |\log \sigma_k^2|^{4+\delta} + |\log \tilde\sigma_k^2|^{4+\delta} \right).$$

So, it is enough to prove that

(5.5)  $E e^{\eta \log \sigma_k^2} \leq C_{5,8}$ and $E e^{\eta \log \tilde\sigma_k^2} \leq C_{5,8}$

with some $\eta > 4$ and $C_{5,8} = C_{5,8}(\eta)$. By assumption (2.16),

$$|\Lambda(\sigma_k^2)| \leq \sum_{1 \leq i < \infty} \left| g(\varepsilon_{k-i}) \prod_{1 \leq j \leq i-1} c(\varepsilon_{k-j}) \right| \leq \sum_{1 \leq i < \infty} c^{i-1} |g(\varepsilon_{k-i})|.$$

Now $M(t) = E \exp(t|g(\varepsilon_0)|)$ exists for some $t > 4$ and therefore

$$E \exp(t |\Lambda(\sigma_k^2)|) \leq \prod_{1 \leq i < \infty} M(t c^{i-1}) = \exp\left( \sum_{1 \leq i < \infty} \log M(t c^{i-1}) \right).$$

Since $0 < c < 1$, we see that $t c^i \to 0$ as $i \to \infty$. For any $x > 0$, $\log(1 + x) \leq x$, and $M(x) = 1 + O(x)$ as $x \to 0$, so we get the first part of (5.5). A similar argument applies to the second statement.

For $k \geq 1$, define the events $A_k = \{|\sigma_k^2 - \tilde\sigma_k^2| \leq \exp(-C_{5,9} k^\rho)\}$ and write

$$(\sigma_k^2 - \tilde\sigma_k^2)^2 = (\sigma_k^2 - \tilde\sigma_k^2)^2 I\{A_k\} + (\sigma_k^2 - \tilde\sigma_k^2)^2 I\{A_k^c\}.$$

Clearly, $E[(\sigma_k^2 - \tilde\sigma_k^2)^2 I\{A_k\}] = O(\exp(-C_{5,9} k^\rho))$. Applying (5.3), (5.4) and the Cauchy-Schwarz inequality, we arrive at

$$E\left[ (\sigma_k^2 - \tilde\sigma_k^2)^2 I\{A_k^c\} \right] \leq \left( E|\sigma_k^2 - \tilde\sigma_k^2|^4 \right)^{1/2} P(A_k^c)^{1/2} = O\left(k^{-\mu/2}\right),$$

completing the proof of Lemma 5.5. □
Lemma 5.6 If the conditions of Theorem 2.4 are satisfied, then $\sum_{1 \leq k < \infty} \operatorname{Cov}(y_0^2, y_k^2)$ is absolutely convergent.

Proof. Since $y_0$ and $\tilde y_k$ are independent for all $k > 1$, we obtain

$$E(y_0^2 - E y_0^2)(y_k^2 - E y_k^2) = E(y_0^2 - E y_0^2)(\tilde y_k^2 - E y_k^2) + E(y_0^2 - E y_0^2)(y_k^2 - \tilde y_k^2) = E\left(y_0^2 (y_k^2 - \tilde y_k^2)\right) - E y_0^2\, E(y_k^2 - \tilde y_k^2).$$

Since $y_k^2 - \tilde y_k^2 = (\sigma_k^2 - \tilde\sigma_k^2) \varepsilon_k^2$ and the random variables $\sigma_k^2 - \tilde\sigma_k^2$ and $\varepsilon_k^2$ are independent, Lemma 5.5 and the Cauchy-Schwarz inequality give $|\operatorname{Cov}(y_0^2, y_k^2)| = O(k^{-\mu/4})$, so Lemma 5.6 is proved on account of $\mu/4 > 1$. □

Lemma 5.7 If the conditions of Theorem 2.4 are satisfied, then there is a constant $C_{7,1}$ such that

$$E\left( \sum_{n < k \leq n+m} (\tilde y_k^2 - E \tilde y_k^2) \right)^2 \leq C_{7,1} m$$

for all $n, m \geq 1$.

Proof. Set $\xi_k = y_k^2 - E y_k^2$ and $\tilde\xi_k = \tilde y_k^2 - E \tilde y_k^2$. It is easy to see that

$$E\left( \sum_{n < k \leq n+m} \tilde\xi_k \right)^2 = \sum_{n < k, l \leq n+m} E \xi_k \xi_l + \sum_{n < k, l \leq n+m} E \tilde\xi_k (\tilde\xi_l - \xi_l) + \sum_{n < k, l \leq n+m} E \xi_l (\tilde\xi_k - \xi_k).$$

Lemma 5.5 and the Cauchy-Schwarz inequality imply

$$\sum_{n < k, l \leq n+m} \left| E \tilde\xi_k (\tilde\xi_l - \xi_l) \right| \leq \sum_{n < k, l \leq n+m} \left( E \tilde\xi_k^2 \right)^{1/2} \left( E (\tilde\xi_l - \xi_l)^2 \right)^{1/2} = O(1) \sum_{n < k, l \leq n+m} l^{-\mu/4} = O(1)\, m \sum_{n < l < \infty} l^{-\mu/4} = O(1)\, m,$$

since $E \tilde\xi_k^2 \leq E \tilde y_k^4 \leq E \tilde\sigma_k^4\, E \varepsilon_k^4 = O(1)$ by (5.4), and $E(\tilde\xi_l - \xi_l)^2 \leq 4\, E \varepsilon_l^4\, E(\sigma_l^2 - \tilde\sigma_l^2)^2 = O(l^{-\mu/2})$ by Lemma 5.5 and the independence of $\varepsilon_l$ and $\sigma_l^2 - \tilde\sigma_l^2$. Similarly,

$$\sum_{n < k, l \leq n+m} \left| E \xi_l (\tilde\xi_k - \xi_k) \right| = O(1)\, m.$$

By the stationarity of $\{\xi_k : -\infty < k < \infty\}$, we have

$$\sum_{n < k, l \leq n+m} E \xi_k \xi_l = \sum_{n < k, l \leq n+m} E \xi_0 \xi_{|k-l|} = O(1)\, m$$

by an application of Lemma 5.6. □

Lemma 5.8 If the conditions of Theorem 2.4 are satisfied, then there are constants $n_0$, $m_0$ and $C_{8,1} > 0$ such that

$$E\left( \sum_{n < k \leq n+m} (\tilde y_k^2 - E \tilde y_k^2) \right)^2 \geq C_{8,1} m$$

for all $n \geq n_0$ and $m \geq m_0$.

Proof. Following the proof of Lemma 5.7, we get

$$\left| E\left( \sum_{n < k \leq n+m} \tilde\xi_k \right)^2 - \sum_{n < k, l \leq n+m} E \xi_k \xi_l \right| \leq C_{8,2}\, m \sum_{n < l < \infty} l^{-\mu/4}.$$

Since

$$\frac{1}{m} \sum_{n < k, l \leq n+m} E \xi_k \xi_l = \frac{1}{m} \sum_{n < k, l \leq n+m} E \xi_0 \xi_{|k-l|} \longrightarrow \operatorname{Var} y_0^2 + 2 \sum_{1 \leq k < \infty} \operatorname{Cov}(y_0^2, y_k^2) = \bar\sigma^2 > 0,$$

the proof is complete. □

Lemma 5.9 If the conditions of Theorem 2.4 are satisfied, then there is a constant $C_{9,1}$ such that

(5.6)  $E\left( \sum_{n < k \leq n+m} (\tilde y_k^2 - E \tilde y_k^2) \right)^4 \leq C_{9,1} m^2$

for all $n, m \geq 1$.

Proof. We prove (5.6) by induction. If $m = 1$, then by (5.4) we have $E(\tilde y_k^2 - E \tilde y_k^2)^4 \leq C_{9,2}$ for all $k \geq 1$. We assume that (5.6) holds for $m$ and for all $n$. We have to show that

(5.7)  $E\left( \sum_{n < k \leq n+m+1} \tilde\xi_k \right)^4 \leq C_{9,1} (m+1)^2$

for all $n$. Towards this end, let $\lfloor\cdot\rfloor$ denote the integer part, set $m_\pm(\rho) = m/2 \pm 2m^\rho$ and

$$S_1(n, m) = \sum_{k=n+1}^{n+\lfloor m_-(\rho) \rfloor} \tilde\xi_k, \qquad S_2(n, m) = \sum_{k=n+\lfloor m_+(\rho) \rfloor + 1}^{n+m+1} \tilde\xi_k, \qquad S_3(n, m) = \sum_{k=n+\lfloor m_-(\rho) \rfloor + 1}^{n+\lfloor m_+(\rho) \rfloor} \tilde\xi_k.$$

By the triangle inequality, we have

$$\left( E\left( \sum_{1 \leq j \leq 3} S_j(n, m) \right)^4 \right)^{1/4} \leq \left( E\left[ S_1(n, m) + S_2(n, m) \right]^4 \right)^{1/4} + \left( E\left[ S_3(n, m) \right]^4 \right)^{1/4}.$$

Also, by (5.4), $\left( E[S_3(n, m)]^4 \right)^{1/4} \leq C_{9,3} m^\rho$. Let $A_k = \{|\sigma_k^2 - \tilde\sigma_k^2| \leq \exp(-C_{9,4} k^\rho)\}$ and write

$$\tilde\xi_k = \xi_k + (\tilde\xi_k - \xi_k) I\{A_k\} + (\tilde\xi_k - \xi_k) I\{A_k^c\}.$$

Let $U = \{n+1, n+2, \ldots, n + \lfloor m_-(\rho) \rfloor\} \cup \{n + \lfloor m_+(\rho) \rfloor + 1, \ldots, n+m+1\}$. Then,

$$\left( E\left[ S_1(n, m) + S_2(n, m) \right]^4 \right)^{1/4} \leq \left( E\left( \sum_{k \in U} \xi_k \right)^4 \right)^{1/4} + \left( E\left( \sum_{k \in U} (\tilde\xi_k - \xi_k) I\{A_k\} \right)^4 \right)^{1/4} + \left( E\left( \sum_{k \in U} (\tilde\xi_k - \xi_k) I\{A_k^c\} \right)^4 \right)^{1/4}.$$

It follows from the definition of the $A_k$ that

$$E\left( \sum_{k \in U} (\tilde\xi_k - \xi_k) I\{A_k\} \right)^4 \leq C_{9,5}.$$

Moreover, by (5.4), (5.3) and Minkowski's and Hölder's inequalities we obtain that

$$\left( E\left( \sum_{k \in U} (\tilde\xi_k - \xi_k) I\{A_k^c\} \right)^4 \right)^{1/4} \leq \sum_{k \in U} \left( E (\tilde\xi_k - \xi_k)^4 I\{A_k^c\} \right)^{1/4} \leq \sum_{k \in U} \left( E(\tilde\xi_k - \xi_k)^{4+\delta} \right)^{1/(4+\delta)} \left( P\{A_k^c\} \right)^{\delta/(4(4+\delta))} \leq C_{9,6} \sum_{1 \leq k < \infty} k^{-\mu\delta/(4(4+\delta))} \leq C_{9,7}.$$

Observe that by the stationarity of the $\xi_k$,

$$E\left( \sum_{k \in U} \xi_k \right)^4 = E\left( \sum_{1 \leq k \leq m_-(\rho)} \xi_k + \sum_{m_+(\rho) \leq k \leq m+1} \xi_k \right)^4.$$

Arguing as before, we estimate

$$\left( E\left( \sum_{1 \leq k \leq m_-(\rho)} \xi_k + \sum_{m_+(\rho) \leq k \leq m+1} \xi_k \right)^4 \right)^{1/4} \leq \left( E\left( \sum_{1 \leq k \leq m_-(\rho)} \tilde\xi_k + \sum_{m_+(\rho) \leq k \leq m+1} \tilde\xi_k \right)^4 \right)^{1/4} + C_{9,8}.$$

Since the partial sums $\sum_{1 \leq k \leq m_-(\rho)} \tilde\xi_k$ and $\sum_{m_+(\rho) \leq k \leq m+1} \tilde\xi_k$ are independent if $m \geq m_0$ with some $m_0$, and $E \tilde\xi_k = 0$, we obtain that

$$E\left( \sum_{1 \leq k \leq m_-(\rho)} \tilde\xi_k + \sum_{m_+(\rho) \leq k \leq m+1} \tilde\xi_k \right)^4 = E\left( \sum_{1 \leq k \leq m_-(\rho)} \tilde\xi_k \right)^4 + E\left( \sum_{m_+(\rho) \leq k \leq m+1} \tilde\xi_k \right)^4 + 6\, E\left( \sum_{1 \leq k \leq m_-(\rho)} \tilde\xi_k \right)^2 E\left( \sum_{m_+(\rho) \leq k \leq m+1} \tilde\xi_k \right)^2.$$

Using the induction assumption we get

$$E\left( \sum_{1 \leq k \leq m_-(\rho)} \tilde\xi_k \right)^4 \leq C_{9,1} \frac{m^2}{4} \qquad \text{and} \qquad E\left( \sum_{m_+(\rho) \leq k \leq m+1} \tilde\xi_k \right)^4 \leq C_{9,1} \frac{m^2}{4}.$$

Lemma 5.7 yields

$$6\, E\left( \sum_{1 \leq k \leq m_-(\rho)} \tilde\xi_k \right)^2 E\left( \sum_{m_+(\rho) \leq k \leq m+1} \tilde\xi_k \right)^2 \leq 6 C_{7,1}^2 \frac{m^2}{4}.$$

Since we can assume that $C_{9,1} \geq 6 C_{7,1}^2$, we conclude

$$E\left( \sum_{1 \leq k \leq m_-(\rho)} \tilde\xi_k + \sum_{m_+(\rho) \leq k \leq m+1} \tilde\xi_k \right)^4 \leq \frac{3}{4} C_{9,1} m^2,$$

and, on combining this with the earlier remainder estimates, (5.7) follows, thus completing the proof. □

Lemmas 5.7-5.9 contain upper and lower bounds for the second and fourth moments of the increments of the partial sums of the $\tilde y_i^2 - E \tilde y_i^2$. As we will see, these inequalities, together with the independence of $\tilde y_j$ and $\tilde y_i$ for $i < j - j^\rho$, are sufficient to establish the strong approximation in Theorem 2.4.

According to Lemma 5.4 it suffices to approximate the partial sums of the $\tilde\xi_i = \tilde y_i^2 - E \tilde y_i^2$. Let $\eta > 1 + \rho$, $\eta\rho < \tau < \eta - 1$ and set

$$X_k = \sum_{k^\eta \leq i \leq (k+1)^\eta - (k+2)^\tau} \tilde\xi_i \qquad \text{and} \qquad Y_k = \sum_{(k+1)^\eta - (k+2)^\tau < i < (k+1)^\eta} \tilde\xi_i.$$

Then $\{X_k\}$ and $\{Y_k\}$ are sequences each consisting of independent random variables. To show that $X_k$ and $X_{k+1}$ are independent, consider the closest lag, i.e. the difference between the first index appearing in $X_{k+1}$ and the last index appearing in $X_k$. We obtain that

$$(k+1)^\eta - \left[ (k+1)^\eta - (k+2)^\tau \right] = (k+2)^\tau > (k+1)^\tau > (k+1)^{\eta\rho},$$

which implies the assertion. Similar arguments apply to $\{Y_k\}$. Finally, let

(5.8)  $r(N) = E\left( \sum_{1 \leq k \leq N} X_k \right)^2$ and $r^*(N) = E\left( \sum_{1 \leq k \leq N} Y_k \right)^2$.
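The admissibility constraints on $(\eta, \tau, \rho)$ can be checked mechanically: the gap separating $X_k$ from $X_{k+1}$ has length $(k+2)^\tau$, while $\tilde y_i$ inside $X_{k+1}$ reaches back at most $i^\rho \leq (k+2)^{\eta\rho}$ lags, so $\tau > \eta\rho$ suffices. The sketch below verifies this inequality for an illustrative admissible triple (our choice of values, not the paper's).

```python
eta, tau, rho = 1.5, 0.4, 0.25   # illustrative; need eta > 1 + rho and eta*rho < tau < eta - 1
assert eta > 1 + rho and eta * rho < tau < eta - 1
for k in range(1, 10):
    gap = (k + 2) ** tau                  # number of indices in Y_k between X_k and X_{k+1}
    dep_range = ((k + 2) ** eta) ** rho   # largest lag i**rho used by any i in X_{k+1}
    print(k, gap > dep_range)             # True for every k, since tau > eta*rho
```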
Lemma 5.10 If the conditions of Theorem 2.4 are satisfied, then there is a Wiener process $\{W_1(t) : t \geq 0\}$ such that

$$\sum_{1 \leq k \leq N} X_k - W_1(r(N)) = o\left( r(N)^{(\kappa+1)/4} \log r(N) \right) \quad \text{a.s.}$$

for any $(2\eta - 1)/(2\eta) < \kappa < 1$.

Proof. It follows from the definition of $\tilde\xi_i$ that $X_1, X_2, \ldots$ are independent random variables with mean 0. By Lemmas 5.7 and 5.8, there are $C_{10,1}$ and $C_{10,2}$ such that

$$E X_k^2 \leq C_{7,1} \left( (k+1)^\eta - (k+2)^\tau - k^\eta \right) \leq C_{10,1} k^{\eta-1}$$

and

(5.9)  $E X_k^2 \geq C_{8,1} \left( (k+1)^\eta - (k+2)^\tau - k^\eta \right) \geq C_{10,2} k^{\eta-1}.$

Using Lemma 5.9, we can find $C_{10,3}$ such that

(5.10)  $E X_k^4 \leq C_{9,1} \left( (k+1)^\eta - (k+2)^\tau - k^\eta \right)^2 \leq C_{10,3} k^{2(\eta-1)}.$

Now, (5.9) and (5.10) imply

$$\sum_{1 \leq k < \infty} \left( \sum_{1 \leq i \leq k} E X_i^2 \right)^{-2\kappa} E X_k^4 \leq C_{10,4} \sum_{1 \leq k < \infty} k^{2(\eta-1) - 2\kappa\eta} < \infty,$$

and the result follows from Theorem A. □

Lemma 5.11 If the conditions of Theorem 2.4 are satisfied, then there is a Wiener process $\{W_2(t) : t \geq 0\}$ such that

$$\sum_{1 \leq k \leq N} Y_k - W_2(r^*(N)) = o\left( r^*(N)^{(\kappa^*+1)/4} \log r^*(N) \right) \quad \text{a.s.}$$

for any $(2\tau + 1)/(2(\tau+1)) < \kappa^* < 1$.

Proof. Following the proof of Lemma 5.10, we have $E Y_k^2 \geq C_{11,1} k^\tau$ and $E Y_k^4 \leq C_{11,2} k^{2\tau}$. Hence,

$$\sum_{1 \leq k < \infty} \left( \sum_{1 \leq i \leq k} E Y_i^2 \right)^{-2\kappa^*} E Y_k^4 \leq C_{11,3} \sum_{1 \leq k < \infty} k^{2\tau - 2(\tau+1)\kappa^*} < \infty,$$

so the assertion follows from Theorem A. □

The next lemma gives an upper bound for the increments of $\sum_{i \leq n} (\tilde y_i^2 - E \tilde y_i^2)$.

Lemma 5.12 If the conditions of Theorem 2.4 are satisfied, then

$$\max_{k^\eta \leq j \leq (k+1)^\eta} \left| \sum_{j \leq i \leq (k+1)^\eta} \tilde\xi_i \right| = O\left( r(k)^{(\kappa+1)/4} \log r(k) \right) \quad \text{a.s.},$$

where $\kappa$ is specified in Lemma 5.10.

Proof. On account of Lemma 5.9 we can apply Theorem 12.2 of Billingsley (1968, p. 94), resulting in

$$P\left\{ \max_{k^\eta \leq j \leq (k+1)^\eta} \left| \sum_{j \leq i \leq (k+1)^\eta} \tilde\xi_i \right| \geq \lambda \right\} \leq \frac{C_{12,1}}{\lambda^4} \left( (k+1)^\eta - k^\eta \right)^2 \leq \frac{C_{12,2}}{\lambda^4} k^{2(\eta-1)}.$$

So,

$$P\left\{ \max_{k^\eta \leq j \leq (k+1)^\eta} \left| \sum_{j \leq i \leq (k+1)^\eta} \tilde\xi_i \right| \geq k^{\eta(\kappa+1)/4} \log k \right\} \leq C_{12,2}\, k^{2(\eta-1) - \eta(\kappa+1)}.$$

Since $(\eta - 1)/\eta < (2\eta - 1)/(2\eta) < \kappa$, we obtain that

$$\sum_{1 \leq k < \infty} P\left\{ \max_{k^\eta \leq j \leq (k+1)^\eta} \left| \sum_{j \leq i \leq (k+1)^\eta} \tilde\xi_i \right| \geq k^{\eta(\kappa+1)/4} \log k \right\} < \infty,$$

and therefore the proof is complete by the Borel-Cantelli lemma. □
For any real $n \geq 1$ (not necessarily an integer), we introduce the notations

$$T_n = E\left( \sum_{1 \leq k \leq n} \xi_k \right)^2 \qquad \text{and} \qquad \tilde T_n = E\left( \sum_{1 \leq k \leq n} \tilde\xi_k \right)^2.$$

The final lemma contains certain properties of the quantities $T_n$ and $\tilde T_n$.

Lemma 5.13 Let $N_k = (k+1)^\eta$. Then we have, with some constant $C > 0$,

(i) $|\tilde T_{N_k} - r(k)| \leq C\, r(k)^{1/2} r^*(k)^{1/2}$,

(ii) $|\tilde T_n - \tilde T_{N_k}| \leq C k^{(2\eta-1)/2}$ for $N_k < n \leq N_{k+1}$,

(iii) $|T_n - \tilde T_n| \leq C \sqrt{n}$ for $n \geq 1$.

Proof. By Minkowski's inequality, we have $|\tilde T_{N_k}^{1/2} - r(k)^{1/2}| \leq r^*(k)^{1/2}$ and thus

$$|\tilde T_{N_k} - r(k)| \leq r^*(k)^{1/2} \left( \tilde T_{N_k}^{1/2} + r(k)^{1/2} \right) \leq r^*(k)^{1/2} \left( 2 r(k)^{1/2} + r^*(k)^{1/2} \right),$$

proving (i). The proof of (ii) is similar, using Lemma 5.7. Finally, to prove (iii) we apply Minkowski's inequality again to get

$$|T_n^{1/2} - \tilde T_n^{1/2}| \leq \sum_{k=1}^{n} \left( E[\xi_k - \tilde\xi_k]^2 \right)^{1/2}$$

and, using the independence of $\varepsilon_k$ and $\sigma_k^2 - \tilde\sigma_k^2$ and Lemma 5.5, we obtain

$$\left( E[\xi_k - \tilde\xi_k]^2 \right)^{1/2} \leq 2 \left( E[\sigma_k^2 - \tilde\sigma_k^2]^2\, E[\varepsilon_k^4] \right)^{1/2} = O\left( k^{-\mu/4} \right).$$

As $\mu$ can be chosen arbitrarily large, we get $T_n^{1/2} - \tilde T_n^{1/2} = O(1)$, which implies (iii), since $\tilde T_n = O(n)$ by Lemma 5.7. □

Finally, we are in a position to give the proof of Theorem 2.4.

Proof of Theorem 2.4. Putting together Lemmas 5.10 and 5.11 and the law of the iterated logarithm for $W_2$ yields

(5.11)  $\sum_{1 \leq i \leq (N+1)^\eta} \tilde\xi_i - W_1(r(N)) = O\left( r(N)^{(\kappa+1)/4} \log r(N) \right)$ a.s.,

provided the constants involved are chosen according to the side conditions

(5.12)  $\eta(\kappa + 1) > 2(\tau + 1)$ and $\frac{2\eta - 1}{2\eta} < \kappa < 1$,

which ensure that the sum of the $Y_k$ is of smaller order of magnitude than the remainder term in Lemma 5.10. Observe that $r(N) \approx N^\eta$ and $r^*(N) \approx N^{\tau+1}$, where $\approx$ means that the ratio of the two sides is between positive constants. Letting now $S_t^* = \tilde\xi_1 + \ldots + \tilde\xi_{\lfloor t \rfloor}$, $t \geq 0$, we get for $N_{k-1} < n \leq N_k$, using Lemmas 5.12 and 5.13, relation (5.11) and standard estimates for the increments of Wiener processes (see, e.g., Theorem 1.2.1 of Csörgő and Révész (1981, p. 30)),

$$|S_n^* - W_1(T_n)| \leq |S_n^* - S_{N_k}^*| + |S_{N_k}^* - W_1(r(k))| + |W_1(r(k)) - W_1(\tilde T_{N_k})| + |W_1(\tilde T_{N_k}) - W_1(\tilde T_n)| + |W_1(\tilde T_n) - W_1(T_n)|$$
$$= O\left( r(k)^{(\kappa+1)/4} \log r(k) \right) + O\left( r(k)^{(\kappa+1)/4} \log r(k) \right) + O\left( r(k)^{1/4} r^*(k)^{1/4} \log k \right) + O\left( k^{(2\eta-1)/4} \log k \right) + O\left( n^{1/4} \log n \right)$$

(5.13)  $= O\left( n^{(\kappa+1)/4} \log n \right) + O\left( n^{(\eta+\tau+1)/(4\eta)} \log n \right) + O\left( n^{(2\eta-1)/(4\eta)} \log n \right) + O\left( n^{1/4} \log n \right).$

In the last step we expressed the remainder terms in $n$, using the fact that $k \approx n^{1/\eta}$. Choose now $\eta = 3/2$, $\tau$ near 0 and $\kappa$ very near to, but larger than, $2/3$. In this case the relations (5.12) are satisfied and hence (5.13) yields

(5.14)  $S_n^* - W_1(T_n) = O\left( n^{5/12+\varepsilon} \right)$ a.s.

for any $\varepsilon > 0$. Now $\{\xi_k\}$ is a stationary sequence with zero mean and covariances $E(\xi_0 \xi_k) = O(k^{-\mu})$ for any $\mu > 0$, from which it easily follows that

$$T_n = E\left( \sum_{1 \leq k \leq n} \xi_k \right)^2 = n \bar\sigma^2 + O(1),$$

where $\bar\sigma^2$ is defined in (2.13). Thus, on applying Theorem 1.2.1 of Csörgő and Révész (1981, p. 30), we arrive at $W_1(T_n) = W_1(n \bar\sigma^2) + O(\log n)$ a.s., so Theorem 2.4 is readily proved on setting $W(t) = \bar\sigma^{-1} W_1(\bar\sigma^2 t)$. □

References

[1] Aue, A., and Horváth, L. (2004). Delay time in sequential detection of change. Statistics and Probability Letters 67, 221–231.

[2] Aue, A., and Horváth, L. (2005). A note on the existence of a stationary augmented GARCH process. Preprint, University of Utah.

[3] Aue, A., Horváth, L., Hušková, M., and Kokoszka, P. (2005). Change-point monitoring in linear models with conditionally heteroskedastic errors. Preprint, University of Utah.

[4] Berkes, I., Gombay, E., Horváth, L., and Kokoszka, P. (2004). Sequential change-point detection in GARCH(p, q) models. Econometric Theory 20, 1140–1167.
[5] Berkes, I., and Horváth, L. (2001). Strong approximation for the empirical process of a GARCH sequence. The Annals of Applied Probability 11, 789–809.

[6] Berkes, I., Horváth, L., Kokoszka, P., and Shao, Q.-M. (2005). On the almost sure convergence of the Bartlett estimator. Periodica Mathematica Hungarica (to appear).

[7] Billingsley, P. (1968). Convergence of Probability Measures. Wiley, New York.

[8] Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31, 307–327.

[9] Bougerol, P., and Picard, N. (1992). Stationarity of GARCH processes and of some nonnegative time series. Journal of Econometrics 52, 115–127.

[10] Brandt, A. (1986). The stochastic equation Y_{n+1} = A_n Y_n + B_n with stationary coefficients. Advances in Applied Probability 18, 211–220.

[11] Carrasco, M., and Chen, X. (2002). Mixing and moment properties of various GARCH and stochastic volatility models. Econometric Theory 18, 17–39.

[12] Chow, Y.S., and Teicher, H. (1988). Probability Theory (2nd ed.). Springer, New York.

[13] Csörgő, M., and Horváth, L. (1997). Limit Theorems in Change-Point Analysis. Wiley, Chichester.

[14] Csörgő, M., and Révész, P. (1981). Strong Approximations in Probability and Statistics. Academic Press, New York.

[15] Darling, D.A., and Erdős, P. (1956). A limit theorem for the maximum of normalized sums of independent random variables. Duke Mathematical Journal 23, 143–155.

[16] Ding, Z., Granger, C.W.J., and Engle, R. (1993). A long memory property of stock market returns and a new model. Journal of Empirical Finance 1, 83–106.

[17] Duan, J.-C. (1997). Augmented GARCH(p, q) process and its diffusion limit. Journal of Econometrics 79, 97–127.

[18] Durrett, R. (1991). Probability: Theory and Examples. Wadsworth & Brooks/Cole, Pacific Grove, CA.

[19] Eberlein, E. (1986). On strong invariance principles under dependence assumptions. The Annals of Probability 14, 260–270.

[20] Engle, R.F. (1982). Autoregressive conditional heteroskedasticity with estimates of the variance of U.K. inflation. Econometrica 50, 987–1008.

[21] Engle, R., and Ng, V. (1993). Measuring and testing the impact of news on volatility. Journal of Finance 48, 1749–1778.

[22] Geweke, J. (1986). Modeling the persistence of conditional variances: a comment. Econometric Reviews 5, 57–61.

[23] Giraitis, L., Kokoszka, P., Leipus, R., and Teyssière, G. (2003). Rescaled variance and related tests for long memory in volatility and levels. Journal of Econometrics 112, 265–294.

[24] Glosten, L., Jagannathan, R., and Runkle, D. (1993). On the relation between the expected value and the volatility of the nominal excess return on stocks. Journal of Finance 48, 1779–1801.

[25] Hardy, G.H., Littlewood, J.E., and Pólya, G. (1959). Inequalities (2nd ed.). Cambridge University Press, Cambridge, U.K.

[26] Horváth, L., Kokoszka, P., and Zhang, A. (2005). Monitoring constancy of variance in conditionally heteroskedastic time series. Econometric Theory (to appear).

[27] Horváth, L., and Steinebach, J. (2000). Testing for changes in the mean or variance of a stochastic process under weak invariance. Journal of Statistical Planning and Inference 91, 365–376.

[28] Kuelbs, J., and Philipp, W. (1980). Almost sure invariance principles for partial sums of mixing B-valued random variables. The Annals of Probability 8, 1003–1036.

[29] Lo, A. (1991). Long-term memory in stock market prices. Econometrica 59, 1279–1313.

[30] Nelson, D.B. (1990). Stationarity and persistence in the GARCH(1,1) model. Econometric Theory 6, 318–334.
[31] Nelson, D.B. (1991). Conditional heteroskedasticity in asset returns: a new approach. Econometrica 59, 347–370.

[32] Petrov, V.V. (1995). Limit Theorems of Probability Theory. Clarendon Press, Oxford, U.K.

[33] Révész, P. (1990). Random Walk in Random and Non-Random Environments. World Scientific, Teaneck, NJ.

[34] Schwert, G.W. (1989). Why does stock market volatility change over time? Journal of Finance 44, 1115–1153.

[35] Strassen, V. (1965). Almost sure behaviour of sums of independent random variables and martingales. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability 2, 315–344.

[36] Straumann, D., and Mikosch, T. (2005). Quasi-maximum-likelihood estimation in conditionally heteroscedastic time series: a stochastic recurrence equations approach. The Annals of Statistics (to appear).

[37] Taylor, S. (1986). Modelling Financial Time Series. Wiley, New York.

[38] Zakoian, J.-M. (1994). Threshold heteroskedastic models. Journal of Economic Dynamics and Control 18, 931–955.