Strong Approximation for the Sums of Squares of Augmented GARCH Sequences

Alexander Aue¹, István Berkes², Lajos Horváth³
Abstract: We study so–called augmented GARCH sequences, which include many submodels of considerable interest, such as polynomial and exponential GARCH. To model
the returns of speculative assets, it is particularly important to understand the behaviour
of the squares of the observations. The main aim of this paper is to present a strong
approximation for the sum of the squares. This will be achieved by an approximation
of the volatility sequence with a sequence of blockwise independent random variables.
Furthermore, we derive a necessary and sufficient condition for the existence of a unique
(strictly) stationary solution of the general augmented GARCH equations. Also, necessary
and sufficient conditions for the finiteness of moments are provided.
AMS 2000 Subject Classification: Primary 60F15, Secondary 62M10.
Keywords and Phrases: augmented GARCH processes, stationary solutions, strong
approximation, moments, partial sums
¹ Department of Mathematics, University of Utah, 155 South 1440 East, Salt Lake City, UT 84112–0090, USA, email: aue@math.utah.edu
² Institut für Statistik, Technische Universität Graz, Steyrergasse 17/IV, A–8010 Graz, Austria, email: berkes@stat.tu-graz.ac.at
³ Department of Mathematics, University of Utah, 155 South 1440 East, Salt Lake City, UT 84112–0090, USA, email: horvath@math.utah.edu
Corresponding author: A. Aue, phone: +1–801–581–5231, fax: +1–801–581–4148.
Research partially supported by NATO grant PST.EAP.CLG 980599 and NSF–OTKA grant INT–0223262.
1 Introduction
Starting with the fundamental paper of Engle (1982), autoregressive conditionally heteroskedastic (ARCH) processes and their generalizations have played a central role in finance
and macroeconomics. One important class of these generalizations can be subsumed under
the notion of augmented GARCH(1,1) processes introduced by Duan (1997). He studied
random variables $\{y_k : -\infty < k < \infty\}$ satisfying the equations
\[
(1.1)\qquad y_k = \sigma_k \varepsilon_k
\]
and
\[
(1.2)\qquad \Lambda(\sigma_k^2) = c(\varepsilon_{k-1})\Lambda(\sigma_{k-1}^2) + g(\varepsilon_{k-1}),
\]
where $\Lambda(x)$, $c(x)$ and $g(x)$ are real–valued functions. To solve for $\sigma_k^2$, the definitions in (1.1) and (1.2) require that
\[
(1.3)\qquad \Lambda^{-1}(x) \ \text{exists.}
\]
Throughout the paper we assume that
\[
(1.4)\qquad \{\varepsilon_k : -\infty < k < \infty\} \ \text{is a sequence of independent, identically distributed random variables.}
\]
We start by giving a variety of examples which are included in the framework of the
augmented GARCH model. Since all submodels must satisfy (1.1), only the corresponding
specific equations (1.2) are stated. Moreover, it can be seen that (1.3) always holds true.
Example 1.1 (GARCH) Bollerslev (1986) introduced the process
\[
\sigma_k^2 = \omega + \beta\sigma_{k-1}^2 + \alpha y_{k-1}^2
           = \omega + \bigl[\beta + \alpha\varepsilon_{k-1}^2\bigr]\sigma_{k-1}^2 .
\]
Example 1.2 (AGARCH) This asymmetric model for volatility was defined by Ding, Granger and Engle (1993) as
\[
\sigma_k^2 = \omega + \beta\sigma_{k-1}^2 + \alpha(|y_{k-1}| - \mu y_{k-1})^2
           = \omega + \bigl[\beta + \alpha(|\varepsilon_{k-1}| - \mu\varepsilon_{k-1})^2\bigr]\sigma_{k-1}^2 .
\]
Example 1.3 (VGARCH) Engle and Ng (1993) used the equations
\[
\sigma_k^2 = \omega + \beta\sigma_{k-1}^2 + \alpha(\varepsilon_{k-1} - \mu)^2
\]
to study the impact of news on volatility.
Example 1.4 (NGARCH) This nonlinear asymmetric model was also introduced by Engle and Ng (1993):
\[
\sigma_k^2 = \omega + \beta\sigma_{k-1}^2 + \alpha(\varepsilon_{k-1} - \mu)^2\sigma_{k-1}^2 .
\]
Example 1.5 (GJR–GARCH) The model of Glosten, Jagannathan and Runkle (1993) is given by
\[
\sigma_k^2 = \omega + \beta\sigma_{k-1}^2 + \alpha_1 y_{k-1}^2 I\{y_{k-1} < 0\} + \alpha_2 y_{k-1}^2 I\{y_{k-1} \ge 0\}
           = \omega + \bigl[\beta + \alpha_1\varepsilon_{k-1}^2 I\{\varepsilon_{k-1} < 0\} + \alpha_2\varepsilon_{k-1}^2 I\{\varepsilon_{k-1} \ge 0\}\bigr]\sigma_{k-1}^2 .
\]
Example 1.6 (TSGARCH) This is a modification of Example 1.1 studied by Taylor (1986):
\[
\sigma_k = \omega + \beta\sigma_{k-1} + \alpha_1 |y_{k-1}|
         = \omega + \bigl[\beta + \alpha_1|\varepsilon_{k-1}|\bigr]\sigma_{k-1} .
\]
Example 1.7 (PGARCH) Motivated by a Box–Cox transformation of the conditional variance, Carrasco and Chen (2002) extensively studied the properties of the process satisfying
\[
\sigma_k^\delta = \omega + \beta\sigma_{k-1}^\delta + \alpha|y_{k-1}|^\delta
               = \omega + \bigl[\beta + \alpha|\varepsilon_{k-1}|^\delta\bigr]\sigma_{k-1}^\delta .
\]
Example 1.8 (TGARCH) Example 1.5 is a nonsymmetric version of the GARCH process of Example 1.1. Similarly, a nonsymmetric version of Example 1.7 is
\[
\sigma_k^\delta = \omega + \beta\sigma_{k-1}^\delta + \alpha_1|y_{k-1}|^\delta I\{y_{k-1} < 0\} + \alpha_2|y_{k-1}|^\delta I\{y_{k-1} \ge 0\}
               = \omega + \bigl[\beta + \alpha_1|\varepsilon_{k-1}|^\delta I\{\varepsilon_{k-1} < 0\} + \alpha_2|\varepsilon_{k-1}|^\delta I\{\varepsilon_{k-1} \ge 0\}\bigr]\sigma_{k-1}^\delta .
\]
The special case of δ = 1 was proposed by Taylor (1986) and Schwert (1989) and includes
the threshold model of Zakoian (1994).
In Examples 1.1–1.8, Λ(x) = xδ , so these models are usually referred to as polynomial
GARCH. Next, we provide two examples for exponential GARCH.
Example 1.9 (MGARCH) The multiplicative model of Geweke (1986) is defined as
\[
\log\sigma_k^2 = \omega + \beta\log\sigma_{k-1}^2 + \alpha\log y_{k-1}^2
              = \omega + (\alpha + \beta)\log\sigma_{k-1}^2 + \alpha\log\varepsilon_{k-1}^2 .
\]
Example 1.10 (EGARCH) The exponential GARCH model of Nelson (1991) is
\[
\log\sigma_k^2 = \omega + \beta\log\sigma_{k-1}^2 + \alpha_1\varepsilon_{k-1} + \alpha_2|\varepsilon_{k-1}| .
\]
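All of Examples 1.1–1.10 are instances of the single recursion (1.1)–(1.2). As a quick numerical illustration, the following Python sketch simulates an augmented GARCH sequence for user-supplied Λ, Λ⁻¹, c and g; the function name `simulate_augmented_garch` and the particular parameter values (ω = 0.1, α = 0.1, β = 0.8, standard normal innovations) are our own illustrative choices, not taken from the paper.

```python
import math
import random

def simulate_augmented_garch(n, Lam, Lam_inv, c, g, burn_in=500, seed=0):
    """Simulate y_k = sigma_k * eps_k where
    Lam(sigma_k^2) = c(eps_{k-1}) * Lam(sigma_{k-1}^2) + g(eps_{k-1}).
    Lam must be invertible with inverse Lam_inv; eps_k are i.i.d. N(0,1)."""
    rng = random.Random(seed)
    lam_sig2 = g(0.0)  # arbitrary start; the burn-in washes it out when E log|c| < 0
    eps_prev = rng.gauss(0.0, 1.0)
    ys = []
    for k in range(n + burn_in):
        lam_sig2 = c(eps_prev) * lam_sig2 + g(eps_prev)   # recursion (1.2)
        eps = rng.gauss(0.0, 1.0)
        sig2 = Lam_inv(lam_sig2)
        if k >= burn_in:
            ys.append(math.sqrt(sig2) * eps)              # equation (1.1)
        eps_prev = eps
    return ys

# Example 1.1 (GARCH): Lam(x) = x, c(e) = beta + alpha*e^2, g(e) = omega
omega, alpha, beta = 0.1, 0.1, 0.8  # hypothetical parameters for illustration
y = simulate_augmented_garch(
    2000,
    Lam=lambda x: x, Lam_inv=lambda x: x,
    c=lambda e: beta + alpha * e * e,
    g=lambda e: omega,
)
```

Swapping in `Lam=math.log`, `Lam_inv=math.exp` and the corresponding c and g reproduces the exponential GARCH models of Examples 1.9 and 1.10 within the same scheme.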
Straumann and Mikosch (2005) obtained a necessary and sufficient condition for the
existence and uniqueness of the solution of the recursive equations in Examples 1.1, 1.2
and 1.10. Their method is based on the theory of iterated functions. Carrasco and Chen
(2002) studied the existence of solutions in the general framework of (1.1) and (1.2).
Since they were also interested in the mixing properties of the sequences {yk } and {σk2 },
they assumed that |c(0)| < 1, E|c(ε0 )| < ∞ and E|g(ε0)| < ∞. We show that these
conditions can be weakened to logarithmic moment conditions. The verification of the
mixing property of solutions to (1.1) and (1.2) is based on the theory of hidden Markov
models and geometric ergodicity, and it is also assumed that ε0 has a continuous density
which is positive on the whole real line. The exponential mixing proved by Carrasco and
Chen (2002) yields the weak convergence as well as the approximation of partial sums
of $y_i^2$ with Brownian motion. However, these results should hold without requiring either the existence or the smoothness of the density of $\varepsilon_0$.
The main aim of this paper is to provide a general method on how to obtain strong
invariance principles for partial sums of GARCH–type sequences. To this end, we show
that {σk2 } can be approximated with a sequence of random variables {σ̃k2 } which are defined
in a way that σ̃i2 and σ̃j2 are independent if the distance |i − j| is appropriately large. This
is a generalization of the observation in Berkes and Horváth (2001) that the volatility in
GARCH(p, q) models can be approximated with random variables which are blockwise
independent. It is also known that the general theory of mixing provides a poor bound
for the rate of convergence to Brownian motion. We connect this rate of convergence to
the finiteness of moments of ε0 . The general blocking method used throughout the proofs
yields strong invariance for the partial sums of the random variables {yk2 − Eyk2 } with very
sharp rates. Our method can, however, also be used to obtain strong approximations for
other functions of y1 , . . . , yn . Additionally, while we are focusing on processes of order
(1,1) for the sake of presentation, adaptations of our methods work also for sequences of
arbitrary order (p, q).
Strong approximations of partial sums play an important role in, for instance, deriving
results in asymptotic statistics. So, it has been pointed out by Horváth and Steinebach
(2000) that limit results for so–called CUSUM and MOSUM–type test procedures, which
are used to detect mean and variance changes, can be based on strong invariance principles. Theorem 2.4, in particular, might be used to test for the stability of the observed
volatilities. In addition, strong approximations can also be used to develop sequential
monitoring procedures (cf. Berkes et al. (2004), Aue and Horváth (2004), Aue et al. (2005),
and Horváth et al. (2005)).
The paper is organized as follows. First, we discuss the existence of a unique stationary
solution of the augmented GARCH model (1.1) and (1.2). We also find conditions for
the existence of the moments of Λ(σ02 ) (cf. Theorems 2.1–2.3 below). These results will
be utilized to provide strong approximations for the partial sums of {yk2 } in Theorem 2.4.
The proofs of Theorems 2.1–2.3 are given in Section 4, while the strong approximation of
Theorem 2.4 is verified in Section 5. In Section 3, we deal with applications of Theorem
2.4 in the field of change–point analysis.
2 Main results
Solving (1.1) and (1.2) backwards, we get that, for any $N \ge 1$,
\[
(2.1)\qquad \Lambda(\sigma_k^2) = \Lambda(\sigma_{k-N}^2)\prod_{1\le i\le N} c(\varepsilon_{k-i}) + \sum_{1\le i\le N} g(\varepsilon_{k-i})\prod_{1\le j\le i-1} c(\varepsilon_{k-j}).
\]
This suggests that the general solution of (1.1) and (1.2) is given by
\[
(2.2)\qquad \Lambda(\sigma_k^2) = \sum_{1\le i<\infty} g(\varepsilon_{k-i})\prod_{1\le j\le i-1} c(\varepsilon_{k-j}).
\]
Our first result gives a necessary and sufficient condition for the existence of the random variable
\[
(2.3)\qquad X = \sum_{1\le i<\infty} g(\varepsilon_{-i})\prod_{1\le j\le i-1} c(\varepsilon_{-j}).
\]
Let log+ x = log(max{x, 1}).
Theorem 2.1 We assume that (1.1)–(1.4) hold and
\[
(2.4)\qquad E\log^+|g(\varepsilon_0)| < \infty \quad\text{and}\quad E\log^+|c(\varepsilon_0)| < \infty.
\]
(i) If
\[
(2.5)\qquad E\log|c(\varepsilon_0)| < 0,
\]
then the sum in (2.3) is absolutely convergent with probability one.
(ii) If
\[
(2.6)\qquad P\{g(\varepsilon_0) = 0\} < 1
\]
and the sum in (2.3) is absolutely convergent with probability one, then (2.5) holds.
We note that in (2.5) we allow that E log |c(ε0)| = −∞. The result bears some resemblance to Theorem 1.3 in Bougerol and Picard (1992), who concentrate, however, on
positive processes and put more emphasis on the problem of irreducibility. The proof of
Theorem 2.1 is given in Section 4.
Since under stationarity $\Lambda(\sigma_k^2) \stackrel{\mathcal{D}}{=} X$, the next result gives conditions for the existence of $E|X|^\nu$ as well as of $E|\Lambda(\sigma_k^2)|^\nu$.
Theorem 2.2 We assume that (1.1)–(1.4) and (2.5) hold, $\nu > 0$ and
\[
(2.7)\qquad E|g(\varepsilon_0)|^\nu < \infty.
\]
(i) If
\[
(2.8)\qquad E|c(\varepsilon_0)|^\nu < 1,
\]
then
\[
(2.9)\qquad E|X|^\nu < \infty.
\]
(ii) If (2.6) and (2.9) are satisfied and
\[
(2.10)\qquad c(\varepsilon_0) \ge 0 \quad\text{and}\quad g(\varepsilon_0) \ge 0,
\]
then (2.8) holds.
We note that in Examples 1.1–1.8 the left–hand side of the equations defining Λ(σk2 )
is always positive, so the right–hand side of those equations must also be positive. This
requirement restricts the values of possible parameter choices such that (2.10) always
holds in these examples. We also point out that Theorem 2.2 is an extension of the
results in Nelson (1990) on the existence of the moments of a GARCH(1,1) sequence.
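Conditions (2.5) and (2.8) are easy to probe numerically. The sketch below estimates, by Monte Carlo, the Lyapunov-type quantity E log c(ε0) and the first moment E c(ε0) for the GARCH(1,1) case c(ε) = β + αε² of Example 1.1; the helper `mc_moment` and the parameter values α = 0.15, β = 0.80 are our own illustrative assumptions.

```python
import math
import random

def mc_moment(f, n=200_000, seed=1):
    """Monte Carlo estimate of E f(eps_0) for standard normal innovations."""
    rng = random.Random(seed)
    return sum(f(rng.gauss(0.0, 1.0)) for _ in range(n)) / n

# GARCH(1,1) of Example 1.1: c(e) = beta + alpha*e^2, g(e) = omega
alpha, beta = 0.15, 0.80          # hypothetical parameter values
c = lambda e: beta + alpha * e * e

# (2.5): E log c(eps_0) < 0 is the stationarity condition of Theorem 2.1
lyapunov = mc_moment(lambda e: math.log(c(e)))

# (2.8) with nu = 1: E c(eps_0) = alpha + beta, so a first moment exists iff
# alpha + beta < 1 (here 0.95)
first_moment = mc_moment(c)
```

By Jensen's inequality E log c(ε0) ≤ log E c(ε0) = log 0.95 < 0, so the estimated `lyapunov` comes out negative, consistent with Theorem 2.1(i).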
The next result shows that (2.2) is the only solution of (1.1) and (1.2). We say that
a strictly stationary solution of (1.1) and (1.2) is nonanticipative if, for any k, the random variable yk is independent of {εi : i > k}. According to Brandt (1986), the strictly
stationary solution of (1.1) and (1.2) in part (i) of the following theorem is always nonanticipative. The same is true under the assumptions of part (ii) of the same theorem.
Theorem 2.3 We assume that (1.1)–(1.4) and (2.4) are satisfied.
(i) If (2.5) holds, then (2.2) is the only stationary solution of (1.1) and (1.2).
(ii) If in addition (2.6) and (2.10) hold and (1.1), (1.2) have a stationary, nonnegative solution, then (2.5) must be satisfied.
Theorem 2.3 gives, in particular, a necessary and sufficient condition for the existence
of polynomial GARCH(1,1) and thus generalizes the result of Nelson (1990). The special
case of EGARCH introduced in Example 1.10 is also covered by Straumann and Mikosch
(2005). Here, stationarity can be characterized as follows. If $|\beta| < 1$ and $E\log^+|\alpha_1\varepsilon_0 + \alpha_2|\varepsilon_0|| < \infty$, then there is a unique stationary, nonanticipative process satisfying the equation in Example 1.10. If $\beta > 0$ and $P\{\alpha_1\varepsilon_0 + \alpha_2|\varepsilon_0| \ge 0\} = 1$, then $\beta < 1$ is necessary
and sufficient for the existence of a stationary solution of the EGARCH equations. If we
allow anticipative solutions, then (1.1) and (1.2) might have a solution even if (2.5) fails,
i.e. if E log |c(ε0 )| ≥ 0 (cf. Aue and Horváth (2005)).
Now, we consider approximations for
\[
S(k) = \sum_{1\le i\le k} (y_i^2 - Ey_i^2).
\]
Towards this end, we need that $\Lambda^{-1}(x)$ is a smooth function. We assume that
\[
(2.11)\qquad \Lambda(\sigma_0^2) \ge \omega \ \text{with some}\ \omega > 0, \quad \Lambda' \ \text{exists and is nonnegative},
\]
and there are $C$, $\gamma$ such that
\[
(2.12)\qquad \frac{1}{\Lambda'(\Lambda^{-1}(x))} \le C x^\gamma \quad\text{for all } x \ge \omega.
\]
Assumptions (2.11) and (2.12) hold for the processes in Examples 1.1–1.8, but fail for the
remaining Examples 1.9 and 1.10. Thus, more restrictive moment conditions are needed
in the exponential GARCH case (cf. Theorem 2.4(ii) below). Set
\[
(2.13)\qquad \bar\sigma^2 = \operatorname{Var} y_0^2 + 2\sum_{1\le i<\infty} \operatorname{Cov}(y_0^2, y_i^2).
\]
Theorem 2.4 We assume that (1.1)–(1.4) and (2.5) hold, $\bar\sigma > 0$ and
\[
(2.14)\qquad E|\varepsilon_0|^{8+\delta} < \infty \ \text{with some}\ \delta > 0.
\]
(i) If (2.11) and (2.12) are satisfied and (2.7) holds for some $\nu > 4(1 + \max\{0, \gamma\})$, then there is a Wiener process $\{W(t) : t \ge 0\}$ such that
\[
(2.15)\qquad \sum_{1\le i\le n} (y_i^2 - Ey_i^2) - \bar\sigma W(n) = o\bigl(n^{5/12+\varepsilon}\bigr) \quad\text{a.s.}
\]
for any $\varepsilon > 0$.
(ii) If $\Lambda(x) = \log x$, we assume that $E\exp(t|g(\varepsilon_0)|)$ exists for some $t > 4$ and
\[
(2.16)\qquad \text{there is } 0 < c < 1 \text{ such that } |c(x)| < c,
\]
then (2.15) holds.
The method of the proof of Theorem 2.4 can be used to establish strong approximations for the sums of functionals of the $y_i$, too. For example, approximations can be derived for $\sum_{1\le i\le n}(y_i - Ey_i)$, $\sum_{1\le i\le n}(y_i y_{i+k} - E[y_i y_{i+k}])$ and $\sum_{1\le i\le n}(y_i^2 y_{i+k}^2 - E[y_i^2 y_{i+k}^2])$,
where k is a fixed integer. Examples of strong approximations for dependent sequences
other than augmented GARCH processes may be found in Eberlein (1986) and Kuelbs
and Philipp (1980). While we are using a blocking argument, the methods in the aforementioned papers are based on examining conditional expectations and variances, and on mixing properties, respectively.
3 Applications
In this section, we illustrate the usefulness of Theorem 2.4 with some applications coming from change–point analysis. One of the main concerns in econometrics is to decide
whether the volatility of the underlying variables is stable over time or if it changes in the
observation period. While, in general, this allows the inclusion of larger classes of random processes, we will focus on augmented GARCH sequences here. To accommodate the setting,
we assume that the volatilities {σk2 } follow a common pattern until an unknown time k ∗ ,
called the change–point. After k ∗ a different regime takes over. It will be demonstrated
in the following that, using Theorem 2.4, some of the basic detection methods can be
applied to the observed volatilities {yk2 }.
Application 3.1 The popular CUSUM procedure is based on the test statistic
\[
T_n = \frac{1}{\sqrt{n}\,\bar\sigma} \max_{1\le k\le n} \Bigl| \sum_{1\le i\le k} y_i^2 - \frac{k}{n} \sum_{1\le i\le n} y_i^2 \Bigr|.
\]
Under the assumption of no change in the volatility it holds that
\[
(3.17)\qquad T_n \stackrel{\mathcal{D}}{\longrightarrow} \sup_{0\le t\le 1} |B(t)|,
\]
where {B(t) : 0 ≤ t ≤ 1} denotes a Brownian bridge. Theorem 2.4 not only gives an
alternative way to prove (3.17), but also provides the strong upper bound O(n−1/12+ε ) for
the rate of convergence.
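To make Application 3.1 concrete, the following Python sketch computes T_n on the observed squares; σ̄ is treated as known here, whereas in practice it must be estimated (see the remark at the end of this section). The helper name `cusum_statistic` and the toy data are our own illustrations.

```python
import math

def cusum_statistic(y, sigma_bar):
    """CUSUM statistic T_n of Application 3.1 on the squares y_i^2
    (sigma_bar is assumed known in this sketch)."""
    n = len(y)
    sq = [v * v for v in y]
    total = sum(sq)
    run, best = 0.0, 0.0
    for k in range(1, n + 1):
        run += sq[k - 1]
        best = max(best, abs(run - k * total / n))
    return best / (math.sqrt(n) * sigma_bar)

# deliberate "variance change" halfway through the sample
y = [0.5] * 100 + [1.5] * 100
T_change = cusum_statistic(y, sigma_bar=1.0)
```

Large values of T_n relative to the quantiles of sup|B(t)| in (3.17) indicate an unstable volatility; for a sample with no change the statistic concentrates near that limiting distribution instead.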
Application 3.2 A modification of Application 3.1 is given by the so–called MOSUM
procedure. (Therein, MOSUM is the abbreviation for moving sum.) Its basic properties
have been collected in Csörgő and Horváth (1997, p. 181). In our case, the test statistic becomes
\[
V_n = \frac{1}{\sqrt{a(n)}\,\bar\sigma} \max_{1\le k\le n-a(n)} \Bigl| \sum_{k<i\le k+a(n)} y_i^2 - \frac{a(n)}{n} \sum_{1\le i\le n} y_i^2 \Bigr|.
\]
Let, in addition to the assumptions of Theorem 2.4, the conditions
\[
(3.18)\qquad \frac{n^{5/12+\tau}\sqrt{\log(n/a(n))}}{\sqrt{a(n)}} = O(1) \quad\text{for some } \tau > 0
\]
and $a(n)/n \to 0$, as $n \to \infty$, be satisfied. Then, we obtain the following extreme value asymptotic for $V_n$:
\[
(3.19)\qquad \lim_{n\to\infty} P\Bigl\{ A\Bigl(\frac{n}{a(n)}\Bigr) V_n \le t + D\Bigl(\frac{n}{a(n)}\Bigr) \Bigr\} = \exp(-e^{-t})
\]
for all $t$, where
\[
A(x) = \sqrt{2\log x} \quad\text{and}\quad D(x) = 2\log x + \frac{1}{2}\log\log x - \frac{1}{2}\log\pi.
\]
We shall give a proof of (3.19) now. Indeed, on using Theorem 2.4, we obtain
\[
(3.20)\qquad V_n = \frac{1}{\sqrt{a(n)}} \max_{1\le k\le n-a(n)} \Bigl| W(k + a(n)) - W(k) - \frac{a(n)}{n} W(n) \Bigr| + O_P\Bigl(\frac{n^{5/12+\varepsilon}}{\sqrt{a(n)}}\Bigr)
\]
as $n \to \infty$. The moduli of continuity of the Wiener process (cf. Csörgő and Révész (1981, p. 30)) yield, as $n \to \infty$,
\[
(3.21)\qquad \frac{1}{\sqrt{a(n)}} \max_{1\le k\le n-a(n)} |W(k + a(n)) - W(k)| = \frac{1}{\sqrt{a(n)}} \sup_{0\le t\le n-a(n)} |W(t + a(n)) - W(t)| + O_P\Bigl(\sqrt{\frac{\log n}{a(n)}}\Bigr).
\]
By the scale transformation we obtain for the functional of the Wiener process on the right-hand side of the latter equation
\[
(3.22)\qquad \frac{1}{\sqrt{a(n)}} \sup_{0\le t\le n-a(n)} |W(t + a(n)) - W(t)| \stackrel{\mathcal{D}}{=} \sup_{0\le s\le n/a(n)-1} |W(s + 1) - W(s)|.
\]
Hence, Theorem 7.2.4 of Révész (1990, p. 72) gives the following extreme value asymptotic for the Wiener process. It holds that
\[
(3.23)\qquad \lim_{n\to\infty} P\Bigl\{ A\Bigl(\frac{n}{a(n)}\Bigr) \sup_{0\le s\le n/a(n)-1} |W(s + 1) - W(s)| \le t + D\Bigl(\frac{n}{a(n)}\Bigr) \Bigr\} = \exp(-e^{-t}).
\]
On combining (3.20)–(3.22) with condition (3.18), we arrive at
\[
A\Bigl(\frac{n}{a(n)}\Bigr) V_n \stackrel{\mathcal{D}}{=} A\Bigl(\frac{n}{a(n)}\Bigr) \sup_{0\le s\le n/a(n)-1} |W(s + 1) - W(s)| + o_P(1),
\]
so that (3.19) follows from (3.23).
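The moving-sum statistic V_n of Application 3.2 can be sketched in a few lines of Python with prefix sums; the window length plays the role of a(n), σ̄ is again assumed known, and the function name `mosum_statistic` is ours.

```python
import math

def mosum_statistic(y, window, sigma_bar):
    """MOSUM statistic V_n of Application 3.2: moving sums of y_i^2 over a
    window of length a(n), compared with the scaled global sum."""
    n = len(y)
    sq = [v * v for v in y]
    total = sum(sq)
    prefix = [0.0]
    for v in sq:
        prefix.append(prefix[-1] + v)     # prefix[k] = sum of first k squares
    best = 0.0
    for k in range(0, n - window + 1):    # windows (k, k + window]
        moving = prefix[k + window] - prefix[k]
        best = max(best, abs(moving - window * total / n))
    return best / (math.sqrt(window) * sigma_bar)
</```

A window length of roughly a(n) = n^{2/3} is compatible with condition (3.18); localized changes then show up as a single window whose sum of squares deviates sharply from its global benchmark.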
Application 3.3 It has been pointed out by Csörgő and Horváth (1997) that weighted
versions of the CUSUM statistics have better power for detecting changes that occur either very early or very late in the observation period. Therefore, Tn from Application 3.1 can be transformed to the weighted statistic
\[
T_{n,\alpha} = \frac{1}{\sqrt{n}\,\bar\sigma} \max_{1\le k\le n} \Bigl(\frac{n}{k}\Bigr)^{\alpha} \Bigl| \sum_{1\le i\le k} y_i^2 - \frac{k}{n} \sum_{1\le i\le n} y_i^2 \Bigr|,
\]
where $0 \le \alpha < 1/2$. Under the assumptions of Theorem 2.4 it holds that
\[
(3.24)\qquad T_{n,\alpha} \stackrel{\mathcal{D}}{\longrightarrow} \sup_{0<t<1} \frac{|B(t)|}{t^{\alpha}} \quad (n \to \infty),
\]
where {B(t) : 0 ≤ t ≤ 1} is a Brownian bridge. Theorem 2.4 and the moduli of continuity
of the Wiener process (cf. Csörgő and Révész (1981, p. 30)) imply that
\[
(3.25)\qquad \sum_{1\le i\le t} (y_i^2 - Ey_0^2) - \bar\sigma W(t) = o\bigl(t^{5/12+\varepsilon}\bigr) \quad\text{a.s.}
\]
as t → ∞. Hence,
\[
\frac{1}{\sqrt{n}} \sup_{1/n\le t\le 1} \frac{1}{t^{\alpha}} \Bigl| \sum_{1\le i\le nt} (y_i^2 - Ey_0^2) - \bar\sigma W(nt) \Bigr| = \frac{1}{\sqrt{n}} \sup_{1/n\le t\le 1} O_P\Bigl(\frac{(nt)^{5/12+\varepsilon}}{t^{\alpha}}\Bigr) = \begin{cases} O_P\bigl(n^{-1/12+\varepsilon}\bigr) & \text{if } \alpha \le 5/12 + \varepsilon, \\ O_P\bigl(n^{\alpha-1/2}\bigr) & \text{if } 5/12 + \varepsilon < \alpha < 1/2, \end{cases}
\]
giving that
\[
T_{n,\alpha} \stackrel{\mathcal{D}}{=} \sup_{1/n\le t\le 1} \frac{1}{t^{\alpha}} |W(t) - tW(1)| + o_P(1).
\]
By the almost sure continuity of $t^{-\alpha}W(t)$, we obtain
\[
\sup_{1/n\le t\le 1} \frac{1}{t^{\alpha}} |W(t) - tW(1)| \longrightarrow \sup_{0<t\le 1} \frac{1}{t^{\alpha}} |W(t) - tW(1)| \qquad\text{a.s.}
\]
Since $\{W(t) - tW(1) : 0 \le t \le 1\}$ and $\{B(t) : 0 \le t \le 1\}$ have the same distribution, (3.24) holds.
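The weighted statistic T_{n,α} differs from the plain CUSUM only by the factor (n/k)^α inside the maximum, which a short sketch makes explicit; the function name `weighted_cusum` is ours and σ̄ is assumed known.

```python
import math

def weighted_cusum(y, alpha, sigma_bar):
    """Weighted CUSUM statistic T_{n,alpha} of Application 3.3,
    with weight (n/k)^alpha, 0 <= alpha < 1/2."""
    n = len(y)
    sq = [v * v for v in y]
    total = sum(sq)
    run, best = 0.0, 0.0
    for k in range(1, n + 1):
        run += sq[k - 1]
        best = max(best, (n / k) ** alpha * abs(run - k * total / n))
    return best / (math.sqrt(n) * sigma_bar)
```

Setting α = 0 recovers T_n of Application 3.1, while α closer to 1/2 up-weights small k (and, by symmetry of the bridge limit, late changes are handled by the reversed sample).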
Application 3.4 Instead of the weighted sup–norm, weighted $L^2$ functionals can also be used. Let
\[
Z_n = \frac{1}{n\bar\sigma^2} \int_0^1 \frac{1}{t} \Bigl( \sum_{1\le i\le nt} y_i^2 - t \sum_{1\le i\le n} y_i^2 \Bigr)^2 dt.
\]
Under the conditions of Theorem 2.4 we have that
\[
(3.26)\qquad Z_n \stackrel{\mathcal{D}}{\longrightarrow} \int_0^1 \frac{B^2(t)}{t}\, dt \quad (n \to \infty),
\]
where {B(t) : 0 ≤ t ≤ 1} denotes a Brownian bridge. Using (3.25), we conclude
\[
Z_n = \frac{1}{n} \int_0^1 \frac{1}{t} \bigl( W(nt) - tW(n) \bigr)^2 dt + O_P\bigl(n^{-1/12+\varepsilon}\bigr).
\]
So, with B(t) = n−1/2 (W (nt) − tW (n)), we immediately get (3.26).
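Evaluated on the grid t = k/n, the integral defining Z_n reduces to a weighted sum of squared CUSUM deviations; the Riemann-sum sketch below (function name `l2_cusum`, ours) makes this concrete with σ̄ assumed known.

```python
def l2_cusum(y, sigma_bar):
    """L2-type statistic Z_n of Application 3.4, with the integral over t
    approximated by a Riemann sum on the grid t = k/n."""
    n = len(y)
    sq = [v * v for v in y]
    total = sum(sq)
    run, z = 0.0, 0.0
    for k in range(1, n + 1):
        run += sq[k - 1]
        t = k / n
        z += (run - t * total) ** 2 / t * (1.0 / n)   # (1/t)(...)^2 dt on the grid
    return z / (n * sigma_bar ** 2)
```

Compared with the sup-norm statistics, Z_n aggregates the deviation over the whole sample, which can be advantageous when a change manifests itself gradually rather than at a single point.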
For on–line monitoring procedures designed to detect changes in volatility we refer to Aue et al. (2005). All applications contain the parameter σ̄, which is in general unknown and therefore has to be estimated from the sample data y1, . . . , yn. This can be achieved easily by employing the method developed by Lo (1991). For further results concerning
the estimation of the variance of sums of dependent random variables we refer to Giraitis
et al. (2003) and Berkes et al. (2005).
4 Proofs of Theorems 2.1–2.3
Proof of Theorem 2.1. Part (i) is an immediate consequence of Brandt (1986).
The first step in the proof of (ii) is the observation that by (2.6) there is a constant $a > 0$ such that $P\{|g(\varepsilon_0)| > a\} > 0$. Define the events
\[
A_i = \Bigl\{ |g(\varepsilon_{-i})| \ge a, \ \prod_{1\le j\le i-1} |c(\varepsilon_{-j})| \ge 1 \Bigr\}, \qquad i \ge 1.
\]
We introduce the $\sigma$–algebras
\[
(4.1)\qquad \mathcal{F}_0 = \{\emptyset, \Omega\} \quad\text{and}\quad \mathcal{F}_i = \sigma(\{\varepsilon_j : -i \le j \le 0\}), \quad i \ge 1.
\]
Clearly, $A_i \in \mathcal{F}_i$ for all $i \ge 1$. Also, by standard arguments,
\[
\sum_{1\le i<\infty} P\{A_i \mid \mathcal{F}_{i-1}\} = P\{|g(\varepsilon_0)| \ge a\} \sum_{1\le i<\infty} I\Bigl\{ \prod_{1\le j\le i-1} |c(\varepsilon_{-j})| \ge 1 \Bigr\} = P\{|g(\varepsilon_0)| \ge a\} \sum_{1\le i<\infty} I\Bigl\{ \sum_{1\le j\le i-1} \log|c(\varepsilon_{-j})| \ge 0 \Bigr\}.
\]
We claim that
\[
\sum_{1\le i<\infty} I\Bigl\{ \sum_{1\le j\le i-1} \log|c(\varepsilon_{-j})| \ge 0 \Bigr\} = \infty \qquad\text{a.s.}
\]
If E log |c(ε0 )| > 0, then this follows from the strong law of large numbers, while the
Chung–Fuchs law (cf. Chow and Teicher (1988, p. 148)) applies if E log |c(ε0)| = 0. Hence,
Corollary 3.2 of Durrett (1986, p. 240) gives $P\{A_i \text{ infinitely often}\} = 1$, and therefore
\[
P\Bigl\{ \lim_{i\to\infty} g(\varepsilon_{-i}) \prod_{1\le j\le i-1} c(\varepsilon_{-j}) = 0 \Bigr\} = 0.
\]
This contradicts that the sum in (2.3) is finite with probability one. $\Box$
Proof of Theorem 2.2. If $\nu \ge 1$, then by Minkowski's inequality we have
\[
E|X|^\nu \le \Biggl( \sum_{1\le i<\infty} \Bigl[ E\Bigl| g(\varepsilon_{-i}) \prod_{1\le j\le i-1} c(\varepsilon_{-j}) \Bigr|^\nu \Bigr]^{1/\nu} \Biggr)^{\nu} = E|g(\varepsilon_0)|^\nu \Biggl( \sum_{1\le i<\infty} \bigl[ E|c(\varepsilon_0)|^\nu \bigr]^{(i-1)/\nu} \Biggr)^{\nu} < \infty
\]
by assumptions (2.7) and (2.8). If $0 < \nu < 1$, then (cf. Hardy, Littlewood and Pólya (1959))
\[
E|X|^\nu \le \sum_{1\le i<\infty} E\Bigl| g(\varepsilon_{-i}) \prod_{1\le j\le i-1} c(\varepsilon_{-j}) \Bigr|^\nu = E|g(\varepsilon_0)|^\nu \sum_{1\le i<\infty} \bigl[ E|c(\varepsilon_0)|^\nu \bigr]^{i-1} < \infty,
\]
completing the proof of the first part of Theorem 2.2.
If $\nu \ge 1$, then by the lower bound in the Minkowski inequality (cf. Hardy, Littlewood and Pólya (1959)) and (2.10) we get
\[
E|X|^\nu \ge \sum_{1\le i<\infty} E\Bigl[ g(\varepsilon_{-i}) \prod_{1\le j\le i-1} c(\varepsilon_{-j}) \Bigr]^\nu = E[g(\varepsilon_0)]^\nu \sum_{1\le i<\infty} \bigl( E[c(\varepsilon_0)]^\nu \bigr)^{i-1}.
\]
Since by (2.6) and (2.7) we have $0 < E[g(\varepsilon_0)]^\nu < \infty$, we get (2.8).
Similarly, if $0 < \nu < 1$, then
\[
E|X|^\nu \ge \Biggl( \sum_{1\le i<\infty} \Bigl[ E\Bigl( g(\varepsilon_{-i}) \prod_{1\le j\le i-1} c(\varepsilon_{-j}) \Bigr)^\nu \Bigr]^{1/\nu} \Biggr)^{\nu} = E[g(\varepsilon_0)]^\nu \Biggl( \sum_{1\le i<\infty} \bigl( E[c(\varepsilon_0)]^\nu \bigr)^{(i-1)/\nu} \Biggr)^{\nu},
\]
and therefore (2.8) holds. $\Box$
Proof of Theorem 2.3. The first part is an immediate consequence of Brandt (1986).
The second part is based on (2.1) with $k = 0$. Since by (2.10) all terms in (2.1) are nonnegative, we get for all $N \ge 1$ that
\[
(4.2)\qquad \Lambda(\sigma_0^2) \ge \sum_{1\le i\le N} g(\varepsilon_{-i}) \prod_{1\le j\le i-1} c(\varepsilon_{-j}).
\]
If $E\log c(\varepsilon_0) \ge 0$, then the proof of Theorem 2.1(ii) gives that
\[
\limsup_{i\to\infty}\, g(\varepsilon_{-i}) \prod_{1\le j\le i-1} c(\varepsilon_{-j}) \ge a
\]
with some $a > 0$. So by (2.10) we obtain that
\[
\lim_{N\to\infty} \sum_{1\le i\le N} g(\varepsilon_{-i}) \prod_{1\le j\le i-1} c(\varepsilon_{-j}) = \infty \qquad\text{a.s.,}
\]
contradicting (4.2). $\Box$

5 Proof of Theorem 2.4
Since the proof of Theorem 2.4 is divided into a series of lemmas, we will first outline its
structure. The main ideas can be highlighted as follows.
• The first goal of the proof is to show that the partial sums of $\{y_k^2\}$ can be approximated by the partial sums of a sequence of random variables $\{\tilde y_k^2\}$ which are independent at sufficiently large lags. To do so, we establish exponential inequalities for the distance between the volatilities $\{\sigma_k^2\}$ and a suitably constructed sequence $\{\tilde\sigma_k^2\}$. Consequently, letting $\tilde y_k = \tilde\sigma_k \varepsilon_k$, it suffices to find an approximation for the partial sums $\sum_{k=1}^n (\tilde y_k^2 - E\tilde y_k^2)$. Detailed proofs are given in Lemmas 5.1–5.4 below.
• Set $\tilde\xi_k = \tilde y_k^2 - E\tilde y_k^2$. The partial sum $\sum_{k=1}^n \tilde\xi_k$ will be blocked into random variables belonging to two subgroups. The random variables associated with each subgroup form an independent sequence, which allows the application of the following simplified version of Theorem 4.4 in Strassen (1965, p. 334).
Theorem A. Let $\{Z_i\}$ be a sequence of independent centered random variables and let $0 < \kappa < 1$. If
\[
r(N) = \sum_{1\le i\le N} EZ_i^2 \longrightarrow \infty \quad (N \to \infty), \qquad \sum_{1\le N<\infty} r(N)^{-2\kappa} EZ_N^4 < \infty,
\]
then there is a Wiener process $\{W(t) : t \ge 0\}$ such that
\[
\sum_{1\le i\le N} Z_i - W(r(N)) = o\bigl( r(N)^{(\kappa+1)/4} \log r(N) \bigr) \qquad\text{a.s.} \quad (N \to \infty).
\]
Lemmas 5.5–5.9 help to prove that the assumptions of Theorem A are satisfied, while the actual approximations are given in Lemmas 5.10–5.12.
Let $0 < \rho < 1$ and define $\tilde\sigma_k^2$ as the solution of
\[
\Lambda(\tilde\sigma_k^2) = \sum_{1\le i\le k^\rho} g(\varepsilon_{k-i}) \prod_{1\le j\le i-1} c(\varepsilon_{k-j}).
\]
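The role of $\tilde\sigma_k^2$ is to replace the full series (2.2) by its $k^\rho$ most recent terms, so that far-apart blocks become independent. The following numerical sketch, for the GARCH(1,1) case of Example 1.1 with parameters ω, α, β chosen by us for illustration, shows the truncation error becoming negligible for moderate $k$, in the spirit of Lemma 5.1 below.

```python
import random

def truncated_volatility_gap(n, rho, seed=7):
    """Compare Lam(sigma_k^2) from the full series (2.2) with its truncation
    Lam(sigma~_k^2), which keeps only the k^rho most recent terms, for a
    GARCH(1,1) with hypothetical parameters omega, alpha, beta."""
    omega, alpha, beta = 0.1, 0.1, 0.8
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, 1.0) for _ in range(2 * n)]   # innovation history

    def series(k, terms):
        # partial sum of g(eps_{k-i}) * prod_{j<i} c(eps_{k-j}),
        # with g = omega and c(e) = beta + alpha * e^2
        s, prod = 0.0, 1.0
        for i in range(1, terms + 1):
            s += omega * prod
            prod *= beta + alpha * eps[k - i] ** 2
        return s

    k = n
    full = series(k, k)                 # (essentially) the full series (2.2)
    trunc = series(k, int(k ** rho))    # the blockwise surrogate
    return abs(full - trunc)
```

Because the omitted tail carries a product of roughly $k - k^\rho$ factors with negative mean logarithm, the gap shrinks at a super-polynomial rate as $k$ grows, which is exactly what the exponential inequality of Lemma 5.1 formalizes.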
First, we obtain an exponential inequality for Λ(σk2 ) − Λ(σ̃k2 ).
Lemma 5.1 If (1.1)–(1.4), (2.5), (2.7) and (2.8) hold, then there are constants $C_{1,1}$, $C_{1,2}$ and $C_{1,3}$ such that
\[
P\bigl\{ |\Lambda(\sigma_k^2) - \Lambda(\tilde\sigma_k^2)| \ge \exp(-C_{1,1} k^\rho) \bigr\} \le C_{1,2} \exp(-C_{1,3} k^\rho).
\]
Proof. Observe that
\[
\Lambda(\sigma_k^2) - \Lambda(\tilde\sigma_k^2) = \sum_{k^\rho+1\le i<\infty} g(\varepsilon_{k-i}) \prod_{1\le j\le i-1} c(\varepsilon_{k-j}) = \prod_{1\le i\le k^\rho} c(\varepsilon_{k-i})\, \Lambda(\sigma_{k-k^\rho}^2).
\]
Using Theorem 2.7 of Petrov (1995) yields
\[
P\Bigl\{ \prod_{1\le i\le k^\rho} |c(\varepsilon_{k-i})| \ge \exp(-C_{1,4} k^\rho) \Bigr\} = P\Bigl\{ \sum_{1\le i\le k^\rho} \bigl( \log|c(\varepsilon_{k-i})| - a \bigr) \ge -(C_{1,4} + a) k^\rho \Bigr\} \le 2\exp(-C_{1,5} k^\rho),
\]
where $E\log|c(\varepsilon_0)| = a < 0$ and $0 < C_{1,4} < -a$. Theorem 2.2 implies
\[
P\Bigl\{ |\Lambda(\sigma_k^2) - \Lambda(\tilde\sigma_k^2)| \ge \exp\Bigl(-\frac{C_{1,4}}{2} k^\rho\Bigr) \Bigr\} \le P\Bigl\{ \prod_{1\le i\le k^\rho} |c(\varepsilon_{k-i})| \ge \exp(-C_{1,4} k^\rho) \Bigr\} + P\Bigl\{ |\Lambda(\sigma_{k-k^\rho}^2)| \ge \exp\Bigl(\frac{C_{1,4}}{2} k^\rho\Bigr) \Bigr\}
\]
\[
\le 2\exp(-C_{1,5} k^\rho) + \exp\Bigl(-\frac{\nu C_{1,4}}{2} k^\rho\Bigr) E|\Lambda(\sigma_0^2)|^\nu \le C_{1,6} \exp(-C_{1,7} k^\rho).
\]
This completes the proof. $\Box$
Lemma 5.2 If (1.1)–(1.4), (2.5), (2.7), (2.8), (2.11) and (2.12) hold, then there are constants $C_{2,1}$, $C_{2,2}$ and $C_{2,3}$ such that
\[
P\bigl\{ |\sigma_k^2 - \tilde\sigma_k^2| \ge \exp(-C_{2,1} k^\rho) \bigr\} \le C_{2,2} \exp(-C_{2,3} k^\rho).
\]
Proof. By the mean–value theorem,
\[
\sigma_k^2 - \tilde\sigma_k^2 = \Lambda^{-1}(\Lambda(\sigma_k^2)) - \Lambda^{-1}(\Lambda(\tilde\sigma_k^2)) = \frac{1}{\Lambda'(\Lambda^{-1}(\xi_k))} \bigl( \Lambda(\sigma_k^2) - \Lambda(\tilde\sigma_k^2) \bigr),
\]
where $\xi_k$ is between $\Lambda(\sigma_k^2)$ and $\Lambda(\tilde\sigma_k^2)$. By (2.12),
\[
(5.1)\qquad |\sigma_k^2 - \tilde\sigma_k^2| \le C \bigl( |\Lambda(\sigma_k^2)|^\gamma + |\Lambda(\tilde\sigma_k^2)|^\gamma \bigr) \bigl| \Lambda(\sigma_k^2) - \Lambda(\tilde\sigma_k^2) \bigr|.
\]
If $\gamma \le 0$, then Lemma 5.2 follows from (2.11) in combination with Lemma 5.1. If $\gamma > 0$, then Theorem 2.2 and Lemma 5.1 yield
\[
P\Bigl\{ \bigl( |\Lambda(\sigma_k^2)|^\gamma + |\Lambda(\tilde\sigma_k^2)|^\gamma \bigr) \bigl| \Lambda(\sigma_k^2) - \Lambda(\tilde\sigma_k^2) \bigr| \ge \exp\Bigl(-\frac{C_{1,1}}{2} k^\rho\Bigr) \Bigr\}
\le P\bigl\{ |\Lambda(\sigma_k^2) - \Lambda(\tilde\sigma_k^2)| \ge \exp(-C_{1,1} k^\rho) \bigr\} + P\Bigl\{ |\Lambda(\sigma_k^2)|^\gamma + |\Lambda(\tilde\sigma_k^2)|^\gamma \ge \exp\Bigl(\frac{C_{1,1}}{2} k^\rho\Bigr) \Bigr\}
\]
\[
\le 2C_{1,2} \exp(-C_{1,3} k^\rho) + P\Bigl\{ |\Lambda(\sigma_k^2)|^\nu \ge \Bigl( \frac{1}{2} \exp\Bigl(\frac{C_{1,1}}{2} k^\rho\Bigr) \Bigr)^{\nu/\gamma} \Bigr\} + P\Bigl\{ |\Lambda(\tilde\sigma_k^2)|^\nu \ge \Bigl( \frac{1}{2} \exp\Bigl(\frac{C_{1,1}}{2} k^\rho\Bigr) \Bigr)^{\nu/\gamma} \Bigr\} \le C_{2,4} \exp(-C_{2,5} k^\rho),
\]
finishing the proof. $\Box$
Lemma 5.3 If (1.1)–(1.4), (2.5), (2.7), (2.8) hold and $\Lambda(x) = \log x$, then there are constants $C_{3,1}$ and $C_{3,2}$ such that
\[
P\bigl\{ |\sigma_k^2 - \tilde\sigma_k^2| \ge \exp(-C_{3,1} k^\rho) \bigr\} \le C_{3,2}\, k^{-\rho\nu}.
\]
Proof. By the mean–value theorem, $\sigma_k^2 - \tilde\sigma_k^2 = e^{\log\sigma_k^2} - e^{\log\tilde\sigma_k^2} = e^{\xi_k}(\log\sigma_k^2 - \log\tilde\sigma_k^2)$, where $\xi_k$ is between $\log\sigma_k^2$ and $\log\tilde\sigma_k^2$. Hence,
\[
(5.2)\qquad |\sigma_k^2 - \tilde\sigma_k^2| \le \bigl( e^{\log\sigma_k^2} + e^{\log\tilde\sigma_k^2} \bigr) |\log\sigma_k^2 - \log\tilde\sigma_k^2|.
\]
On the set $\{-1 < \log\sigma_k^2 - \log\tilde\sigma_k^2 < 1\}$, we have $\exp(\log\tilde\sigma_k^2) \le 3\exp(\log\sigma_k^2)$, so by Theorem 2.2 and Lemma 5.1 we obtain
\[
P\Bigl\{ |\sigma_k^2 - \tilde\sigma_k^2| \ge \exp\Bigl(-\frac{C_{1,1}}{2} k^\rho\Bigr) \Bigr\} \le P\Bigl\{ 3 e^{\log\sigma_k^2} |\log\sigma_k^2 - \log\tilde\sigma_k^2| \ge \exp\Bigl(-\frac{C_{1,1}}{2} k^\rho\Bigr) \Bigr\} + C_{1,2}\exp(-C_{1,3} k^\rho)
\]
\[
\le P\bigl\{ |\log\sigma_k^2 - \log\tilde\sigma_k^2| \ge \exp(-C_{1,1} k^\rho) \bigr\} + P\Bigl\{ 3 e^{\log\sigma_k^2} \ge \exp\Bigl(\frac{C_{1,1}}{2} k^\rho\Bigr) \Bigr\} + C_{1,2}\exp(-C_{1,3} k^\rho)
\]
\[
\le C_{1,2}\exp(-C_{1,3} k^\rho) + P\Bigl\{ \log\sigma_k^2 \ge \frac{C_{1,1}}{2} k^\rho - \log 3 \Bigr\} + C_{1,2}\exp(-C_{1,3} k^\rho)
\]
\[
\le C_{1,2}\exp(-C_{1,3} k^\rho) + P\Bigl\{ |\log\sigma_k^2|^\nu \ge \Bigl( \frac{C_{1,1}}{2} k^\rho - \log 3 \Bigr)^{\nu} \Bigr\} + C_{1,2}\exp(-C_{1,3} k^\rho) \le C_{3,3}\, k^{-\rho\nu}.
\]
This completes the proof. $\Box$
Let
\[
\tilde y_i = \tilde\sigma_i \varepsilon_i, \qquad -\infty < i < \infty.
\]
It is clear that $\tilde y_i$ and $\tilde y_j$ are independent if $i < j - j^\rho$. The next result shows that it is enough to approximate the sum of the $\tilde y_i^2$.
Lemma 5.4 If the conditions of Theorem 2.4 are satisfied, then
\[
\max_{1\le k<\infty} \Bigl| \sum_{1\le i\le k} y_i^2 - \sum_{1\le i\le k} \tilde y_i^2 \Bigr| = O(1) \qquad\text{a.s.}
\]
Proof. Condition (2.14) implies
\[
|\varepsilon_k| = o\bigl( k^{1/(8+\delta)} \bigr) \qquad\text{a.s.}
\]
as $k \to \infty$. By Lemmas 5.2 and 5.3 we get
\[
|\sigma_k^2 - \tilde\sigma_k^2| = O\bigl( \exp(-C_{4,1} k^\rho) \bigr) \qquad\text{a.s.}
\]
Thus we conclude
\[
\max_{1\le k<\infty} \Bigl| \sum_{1\le i\le k} y_i^2 - \sum_{1\le i\le k} \tilde y_i^2 \Bigr| \le \sum_{1\le i<\infty} \bigl| y_i^2 - \tilde y_i^2 \bigr| = \sum_{1\le i<\infty} \bigl| \sigma_i^2 - \tilde\sigma_i^2 \bigr| \varepsilon_i^2 = O(1) \sum_{1\le i<\infty} i^{2/(8+\delta)} \exp(-C_{4,1} i^\rho) = O(1) \qquad\text{a.s.,}
\]
and the proof is complete. $\Box$
We note that under the conditions of Theorem 2.4 we have
\[
(5.3)\qquad P\bigl\{ |\sigma_k^2 - \tilde\sigma_k^2| \ge \exp(-C_{4,2} k^\rho) \bigr\} \le C_{4,3}\, k^{-\mu},
\]
where $\mu = \rho\nu$ when $\Lambda(x) = \log x$, while $\mu$ can be chosen arbitrarily when (2.11) and (2.12) hold. In both cases $\mu$ can be chosen arbitrarily large.
Lemma 5.5 If the conditions of Theorem 2.4 are satisfied, then
\[
(5.4)\qquad E(\tilde\sigma_k^2)^{4+\delta} = O(1) \ \text{with some}\ \delta > 0
\]
and
\[
E(\sigma_k^2 - \tilde\sigma_k^2)^2 = O\bigl( k^{-\mu/2} \bigr),
\]
where $\mu$ is defined in (5.3).
Proof. We first show that $E|\sigma_k^2 - \tilde\sigma_k^2|^{4+\delta} = O(1)$. Assume first that (2.11) and (2.12) are satisfied. Using (5.1), we get in case of $\gamma > 0$ that
\[
|\sigma_k^2 - \tilde\sigma_k^2|^{4+\delta} \le C \Bigl( |\Lambda(\sigma_k^2)|^{(4+\delta)(1+\gamma)} + |\Lambda(\tilde\sigma_k^2)|^{(4+\delta)(1+\gamma)} + |\Lambda(\sigma_k^2)|^{(4+\delta)\gamma} |\Lambda(\tilde\sigma_k^2)|^{4+\delta} + |\Lambda(\sigma_k^2)|^{4+\delta} |\Lambda(\tilde\sigma_k^2)|^{(4+\delta)\gamma} \Bigr).
\]
It follows from our assumptions that $E|\Lambda(\sigma_k^2)|^{(4+\delta)(1+\gamma)} \le C_{5,1}$. Applying the definition of $\tilde\sigma_k^2$, we obtain from the proof of Theorem 2.2
\[
E|\Lambda(\tilde\sigma_k^2)|^{(4+\delta)(1+\gamma)} \le E\Biggl( \sum_{1\le i\le k^\rho} |g(\varepsilon_{k-i})| \prod_{1\le j\le i-1} |c(\varepsilon_{k-j})| \Biggr)^{(4+\delta)(1+\gamma)} \le E\Biggl( \sum_{1\le i<\infty} |g(\varepsilon_{k-i})| \prod_{1\le j\le i-1} |c(\varepsilon_{k-j})| \Biggr)^{(4+\delta)(1+\gamma)} \le C_{5,2}.
\]
Similarly,
\[
|\Lambda(\sigma_k^2)|^{(4+\delta)\gamma} |\Lambda(\tilde\sigma_k^2)|^{4+\delta} \le \Biggl( \sum_{1\le i<\infty} |g(\varepsilon_{k-i})| \prod_{1\le j\le i-1} |c(\varepsilon_{k-j})| \Biggr)^{(4+\delta)\gamma} \Biggl( \sum_{1\le i\le k^\rho} |g(\varepsilon_{k-i})| \prod_{1\le j\le i-1} |c(\varepsilon_{k-j})| \Biggr)^{4+\delta} \le \Biggl( \sum_{1\le i<\infty} |g(\varepsilon_{k-i})| \prod_{1\le j\le i-1} |c(\varepsilon_{k-j})| \Biggr)^{(4+\delta)(1+\gamma)},
\]
and therefore $E|\Lambda(\sigma_k^2)|^{(4+\delta)\gamma} |\Lambda(\tilde\sigma_k^2)|^{4+\delta} \le C_{5,3}$. The same argument implies in a similar way $E|\Lambda(\sigma_k^2)|^{4+\delta} |\Lambda(\tilde\sigma_k^2)|^{(4+\delta)\gamma} \le C_{5,4}$. If $\gamma < 0$, then (5.1) and (2.11) yield
\[
|\sigma_k^2 - \tilde\sigma_k^2|^{4+\delta} \le C_{5,5} \bigl| \Lambda(\sigma_k^2) - \Lambda(\tilde\sigma_k^2) \bigr|^{4+\delta},
\]
and, similarly as before, $E|\Lambda(\sigma_k^2)|^{4+\delta} \le C_{5,6}$ as well as $E|\Lambda(\tilde\sigma_k^2)|^{4+\delta} \le C_{5,6}$.
If $\Lambda(x) = \log x$, then by (5.2) we have
\[
|\sigma_k^2 - \tilde\sigma_k^2|^{4+\delta} \le C_{5,7} \bigl( e^{(4+\delta)\log\sigma_k^2} + e^{(4+\delta)\log\tilde\sigma_k^2} \bigr) \bigl( |\log\sigma_k^2|^{4+\delta} + |\log\tilde\sigma_k^2|^{4+\delta} \bigr).
\]
So, it is enough to prove that
\[
(5.5)\qquad E e^{\eta\log\sigma_k^2} \le C_{5,8} \quad\text{and}\quad E e^{\eta\log\tilde\sigma_k^2} \le C_{5,8}
\]
with some $\eta > 4$ and $C_{5,8} = C_{5,8}(\eta)$. By assumption (2.16),
\[
|\Lambda(\sigma_k^2)| \le \sum_{1\le i<\infty} \Bigl| g(\varepsilon_{k-i}) \prod_{1\le j\le i-1} c(\varepsilon_{k-j}) \Bigr| \le \sum_{1\le i<\infty} c^{i-1} |g(\varepsilon_{k-i})|.
\]
Now $M(t) = E\exp(t|g(\varepsilon_0)|)$ exists for some $t > 4$ and therefore
\[
E\exp\bigl( t|\Lambda(\sigma_k^2)| \bigr) \le \prod_{1\le i<\infty} M(t c^{i-1}) = \exp\Biggl( \sum_{1\le i<\infty} \log M(t c^{i-1}) \Biggr).
\]
Since $0 < c < 1$, we see that $t c^i \to 0$ as $i \to \infty$. For any $x > 0$, $\log(1+x) \le x$ and $M(x) = 1 + O(x)$ as $x \to 0$, so we get the first part of (5.5). A similar argument applies to the second statement.
For $k \ge 1$, define the events $A_k = \{ |\sigma_k^2 - \tilde\sigma_k^2| \le \exp(-C_{5,9} k^\rho) \}$ and write
\[
(\sigma_k^2 - \tilde\sigma_k^2)^2 = (\sigma_k^2 - \tilde\sigma_k^2)^2 I\{A_k\} + (\sigma_k^2 - \tilde\sigma_k^2)^2 I\{A_k^c\}.
\]
Clearly,
\[
E\bigl[ (\sigma_k^2 - \tilde\sigma_k^2)^2 I\{A_k\} \bigr] = O\bigl( \exp(-C_{5,9} k^\rho) \bigr).
\]
Applying (5.3), (5.4) and the Cauchy–Schwarz inequality, we arrive at
\[
E\bigl[ (\sigma_k^2 - \tilde\sigma_k^2)^2 I\{A_k^c\} \bigr] \le \bigl( E|\sigma_k^2 - \tilde\sigma_k^2|^4 \bigr)^{1/2} P(A_k^c)^{1/2} = O\bigl( k^{-\mu/2} \bigr),
\]
completing the proof of Lemma 5.5. $\Box$
Lemma 5.6 If the conditions of Theorem 2.4 are satisfied, then
\[
\sum_{1\le k<\infty} \operatorname{Cov}(y_0^2, y_k^2)
\]
is absolutely convergent.
Proof. Since $y_0$ and $\tilde y_k$ are independent for all $k > 1$, we obtain
\[
E(y_0^2 - Ey_0^2)(y_k^2 - Ey_k^2) = E(y_0^2 - Ey_0^2)(y_k^2 - \tilde y_k^2 + \tilde y_k^2 - Ey_k^2) = E(y_0^2 - Ey_0^2)(\tilde y_k^2 - Ey_k^2) + E(y_0^2 - Ey_0^2)(y_k^2 - \tilde y_k^2) = E\bigl( y_0^2 (y_k^2 - \tilde y_k^2) \bigr) - Ey_0^2\, E(y_k^2 - \tilde y_k^2).
\]
Since $y_k^2 - \tilde y_k^2 = (\sigma_k^2 - \tilde\sigma_k^2)\varepsilon_k^2$ and the random variables $\sigma_k^2 - \tilde\sigma_k^2$, $\varepsilon_k^2$ are independent, Lemma 5.5 and Cauchy's inequality give $|\operatorname{Cov}(y_0^2, y_k^2)| = O(k^{-\mu/4})$, so Lemma 5.6 is proved on account of $\mu/4 > 1$. $\Box$
Lemma 5.7 If the conditions of Theorem 2.4 are satisfied, then there is a constant $C_{7,1}$ such that
\[
E\Biggl( \sum_{n<k\le n+m} (\tilde y_k^2 - E\tilde y_k^2) \Biggr)^2 \le C_{7,1}\, m
\]
for all $n, m \ge 1$.
Proof. Set $\xi_k = y_k^2 - Ey_k^2$ and $\tilde\xi_k = \tilde y_k^2 - E\tilde y_k^2$. It is easy to see that
\[
E\Biggl( \sum_{n<k\le n+m} \tilde\xi_k \Biggr)^2 = \sum_{n<k,l\le n+m} E\xi_k \xi_l + \sum_{n<k,l\le n+m} E\tilde\xi_k (\tilde\xi_l - \xi_l) + \sum_{n<k,l\le n+m} E\xi_l (\tilde\xi_k - \xi_k).
\]
Lemma 5.5 and Cauchy's inequality imply
\[
\sum_{n<k,l\le n+m} E\tilde\xi_k (\tilde\xi_l - \xi_l) \le \sum_{n<k,l\le n+m} \bigl( E\tilde\xi_k^2\, E(\tilde\xi_l - \xi_l)^2 \bigr)^{1/2} \le \sum_{n<k,l\le n+m} \bigl( E\tilde\sigma_k^4\, E\varepsilon_k^4 \bigr)^{1/2} \bigl( E\varepsilon_l^4\, E(\sigma_l^2 - \tilde\sigma_l^2)^2 \bigr)^{1/2} = O(1) \sum_{n<k,l\le n+m} l^{-\mu/4} = O(1)\, m \sum_{n<l<\infty} l^{-\mu/4} = O(1)\, m.
\]
Similarly,
\[
\sum_{n<k,l\le n+m} E\xi_l (\tilde\xi_k - \xi_k) = O(1)\, m.
\]
By the stationarity of $\{\xi_k : -\infty < k < \infty\}$, we have
\[
\sum_{n<k,l\le n+m} E\xi_k \xi_l = \sum_{n<k,l\le n+m} E\xi_0 \xi_{|k-l|} = O(1)\, m
\]
by an application of Lemma 5.6. $\Box$
Lemma 5.8 If the conditions of Theorem 2.4 are satisfied, then there are constants $n_0$, $m_0$ and $C_{8,1} > 0$ such that
\[
E\Biggl( \sum_{n<k\le n+m} (\tilde y_k^2 - E\tilde y_k^2) \Biggr)^2 \ge C_{8,1}\, m
\]
for all $n \ge n_0$ and $m \ge m_0$.
Proof. Following the proof of Lemma 5.7, we get
\[
\Biggl| E\Biggl( \sum_{n<k\le n+m} \tilde\xi_k \Biggr)^2 - \sum_{n<k,l\le n+m} E\xi_k \xi_l \Biggr| \le C_{8,2}\, m \sum_{n<l<\infty} l^{-\mu/4}.
\]
Since
\[
\frac{1}{m} \sum_{n<k,l\le n+m} E\xi_k \xi_l = \frac{1}{m} \sum_{n<k,l\le n+m} E\xi_0 \xi_{|k-l|} \longrightarrow \operatorname{Var} y_0^2 + 2 \sum_{1\le k<\infty} \operatorname{Cov}(y_0^2, y_k^2) = \bar\sigma^2 > 0,
\]
the proof is complete. $\Box$
Lemma 5.9 If the conditions of Theorem 2.4 are satisfied, then there is a constant $C_{9,1}$ such that
\[
(5.6)\qquad E\Biggl( \sum_{n<k\le n+m} (\tilde y_k^2 - E\tilde y_k^2) \Biggr)^4 \le C_{9,1}\, m^2
\]
for all $n, m \ge 1$.
Proof. We prove (5.6) using induction. If m = 1, then by (5.4) we have E(ỹk2 − E ỹk2 )4 ≤
C9,2 , for all k ≥ 1. We assume that (5.6) holds for m and for all n. We have to show that

E
(5.7)
X
n<k≤n+m+1
4
ξ˜k  ≤ C9,1 (m + 1)2 for all n.
Towards this end, let ⌊·⌋ denote integer part, m± (ρ) = m/2 ± 2mρ and set
n+⌊m− (ρ)⌋
S1 (n, m) =
X
ξ˜k ,
S2 (n, m) =
k=n+1
n+m+1
X
ξ˜k ,
n+⌊m+ (ρ)⌋
S3 (n, m) =
k=n+⌊m+ (ρ)⌋+1
ξ˜k .
X
k=n+⌊m− (ρ)⌋
By the triangle inequality, we have


 
E
X
1≤j≤3
4 1/4

Sj (n, m) 
≤ E [S1 (n, m) + S2 (n, m)]4
20
1/4
+ E [S3 (n, m)]4
1/4
.
Also, by (5.4),
E [S3 (n, m)]4
1/4
≤ C9,3 mρ .
Let Ak = {|σk2 − σ̃k2 | ≤ exp(−C9,4 k ρ )} and write ξ˜k = ξk + (ξ˜k −ξk )I{Ak } + (ξ˜k −ξk )I{Ack }.
Let U = {n + 1, n + 2, . . . , n + ⌊m− (ρ)⌋} ∪ {n + ⌊m+ (ρ)⌋, . . . , n + m + 1}. Then,
E [S1 (n, m) + S2 (n, m)]4



4 1/4  
X


ξk   + E  (ξ˜k
X
≤ E 
1/4
k∈U
4 1/4  
X


− ξk )I{Ak }  + E  (ξ˜k
k∈U
k∈U
It follows from the definition of the Ak that

E
X
k∈U
4 1/4

− ξk )I{Ack }  .
4
(ξ˜k − ξk )I{Ak } ≤ C9,5 .
Moreover, by (5.4), (5.3) and Minowski’s and Hölder’s inequalities we obtain that


 
E
≤
4 1/4
− ξk )I{Ack } 

X
(ξ˜k
X
(E(ξ˜k − ξk )4+δ )1/(4+δ) (P {Ack })δ/(4+δ) ≤ C9,6
k∈U
k∈U
≤
E (ξ˜k − ξk )4 I{Ack }
X
k∈U
1/4
X
1≤k<∞
k −µδ/(4+δ) ≤ C9,7 .
Observe that by the stationarity of the ξk ,

E
X
k∈U
4

ξk  = E 
X
X
ξk +
1≤k≤m− (ρ)
m+ (ρ)≤k≤m+1
Arguing as before, we estimate


 
E
X
X
ξk +
1≤k≤m− (ρ)
m+ (ρ)≤k≤m+1
Since the partial sums
ξ˜k
X
4 1/4
ξk  


≤
E 
X
ξ˜k
m+ (ρ)≤k≤m+1
ξk  .
ξ˜k +
1≤k≤m+ (ρ)
X
and
1≤k≤m− (ρ)

4
X
m+ (ρ)≤k≤m+1
4 1/4
ξk  
 +C9,8 .
are independent, if m ≥ m0
with some m0 and E ξ˜k = 0, we obtain that

E
X
ξ˜k +
1≤k≤m− (ρ)

= E
X
X
m+ (ρ)≤k≤m+1
1≤k≤m− (ρ)
4

ξ˜k  + E 
4
ξ˜k 
X
m+ (ρ)≤k≤m+1
4

ξ˜k  + 6E 
21
X
1≤k≤m− (ρ)
2

ξ˜k  E 
X
m+ (ρ)≤k≤m+1
2
ξ˜k  .
Using the induction assumption we get
$$E\Bigl(\sum_{1\le k\le m_-(\rho)}\tilde\xi_k\Bigr)^4\le C_{9,1}\frac{m^2}{4}\qquad\text{and}\qquad E\Bigl(\sum_{m_+(\rho)\le k\le m+1}\tilde\xi_k\Bigr)^4\le C_{9,1}\frac{m^2}{4}.$$
Lemma 5.7 yields
$$6\,E\Bigl(\sum_{1\le k\le m_-(\rho)}\tilde\xi_k\Bigr)^2 E\Bigl(\sum_{m_+(\rho)\le k\le m+1}\tilde\xi_k\Bigr)^2\le 6C_{7,1}^2\,\frac{m^2}{4}.$$
Since we can assume that $C_{9,1}\ge 6C_{7,1}^2$, we conclude
$$E\Bigl(\sum_{1\le k\le m_-(\rho)}\tilde\xi_k+\sum_{m_+(\rho)\le k\le m+1}\tilde\xi_k\Bigr)^4\le C_{9,1}\,\frac{3m^2}{4},$$
thus completing the proof. □
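The $m^2$ growth of the fourth moment for sums of independent centred variables, which is the mechanism the induction exploits, can be seen in the simplest case by exact enumeration. The Rademacher variables below are an illustrative stand-in, not the $\tilde\xi_k$ of the paper:

```python
from itertools import product

# For i.i.d. Rademacher xi_k (mean 0, xi_k^2 = 1), direct expansion gives
# E(S_m)^4 = m + 3*m*(m-1) <= 3*m^2, the simplest instance of a bound
# of the form E(S_m)^4 <= C*m^2. We verify this by exact enumeration
# over all 2^m sign vectors.
def fourth_moment(m):
    total = sum(sum(signs) ** 4 for signs in product((-1, 1), repeat=m))
    return total / 2 ** m

for m in range(1, 10):
    assert fourth_moment(m) == m + 3 * m * (m - 1)
    assert fourth_moment(m) <= 3 * m ** 2
```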
Lemmas 5.7–5.9 contain upper and lower bounds for the second and fourth moments of the increments of the partial sums of the $\tilde y_i^2 - E\tilde y_i^2$. As we will see, these inequalities together with the independence of $\tilde y_j$ and $\tilde y_i$, where $i < j - j^\rho$, are sufficient to establish the strong approximation in Theorem 2.4.
According to Lemma 5.4 it suffices to approximate $\sum(\tilde y_i^2 - E\tilde y_i^2)$. Let $\eta > 1+\rho$, $\eta\rho < \tau < \eta-1$ and set
$$X_k=\sum_{k^\eta\le i\le (k+1)^\eta-(k+2)^\tau}\tilde\xi_i\qquad\text{and}\qquad Y_k=\sum_{(k+1)^\eta-(k+2)^\tau<i<(k+1)^\eta}\tilde\xi_i,$$
where $\tilde\xi_i=\tilde y_i^2-E\tilde y_i^2$. Then $\{X_k\}$ and $\{Y_k\}$ are sequences each consisting of independent random variables. To show that $X_k$ and $X_{k+1}$ are independent, consider the closest lag, i.e. the index difference of the first $\tilde\xi_i$ in $X_{k+1}$ and of the last $\tilde\xi_i$ in $X_k$. We obtain that $(k+1)^\eta-[(k+1)^\eta-(k+2)^\tau]>(k+1)^\tau>(k+1)^{\eta\rho}$, which implies the assertion. Similar arguments apply to $\{Y_k\}$.
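The index bookkeeping behind this independence argument can be sanity-checked numerically. The sketch below is not part of the proof; the parameter values $\eta=1.2$, $\rho=0.1$, $\tau=0.15$ are illustrative choices satisfying $\eta > 1+\rho$ and $\eta\rho < \tau < \eta-1$:

```python
import math

# Illustrative parameters (our choice, not fixed by the paper):
# they satisfy eta > 1 + rho and eta*rho < tau < eta - 1.
eta, rho, tau = 1.2, 0.1, 0.15

def x_block(k):
    """Integer index range of X_k: k^eta <= i <= (k+1)^eta - (k+2)^tau."""
    lo = math.ceil(k ** eta)
    hi = math.floor((k + 1) ** eta - (k + 2) ** tau)
    return lo, hi

def gap_exceeds_dependence_range(k):
    """The lag between the first term of X_{k+1} (index ~ (k+1)^eta) and
    the last term of X_k is (k+2)^tau, which must exceed the dependence
    range j^rho at j = (k+1)^eta; this holds because tau > eta*rho."""
    gap = (k + 1) ** eta - ((k + 1) ** eta - (k + 2) ** tau)
    return gap > ((k + 1) ** eta) ** rho

print(all(gap_exceeds_dependence_range(k) for k in range(1, 1000)))  # True
```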
Finally, let
(5.8)   $r(N)=E\Bigl(\sum_{1\le k\le N}X_k\Bigr)^2$   and   $r^*(N)=E\Bigl(\sum_{1\le k\le N}Y_k\Bigr)^2.$
Lemma 5.10 If the conditions of Theorem 2.4 are satisfied, then there is a Wiener process $\{W_1(t): t\ge 0\}$ such that
$$\sum_{1\le k\le N}X_k-W_1(r(N))=o\bigl(r(N)^{(\kappa+1)/4}\log r(N)\bigr)\qquad\text{a.s.}$$
for any $(2\eta-1)/(2\eta)<\kappa<1$.
Proof. It follows from the definition of $\tilde\xi_i$ that $X_1, X_2, \ldots$ are independent random variables with mean 0. By Lemmas 5.7 and 5.8, there are $C_{10,1}$ and $C_{10,2}$ such that
$$EX_k^2\le C_{7,1}\bigl((k+1)^\eta-(k+2)^\tau-k^\eta\bigr)\le C_{10,1}k^{\eta-1}$$
and
(5.9)   $EX_k^2\ge C_{8,1}\bigl((k+1)^\eta-(k+2)^\tau-k^\eta\bigr)\ge C_{10,2}k^{\eta-1}.$
Using Lemma 5.9, we can find $C_{10,3}$ such that
(5.10)   $EX_k^4\le C_{9,1}\bigl((k+1)^\eta-(k+2)^\tau-k^\eta\bigr)^2\le C_{10,3}k^{2(\eta-1)}.$
Now, (5.9) and (5.10) imply
$$\sum_{1\le k<\infty}\Bigl(\sum_{1\le i\le k}EX_i^2\Bigr)^{-2\kappa}EX_k^4\le C_{10,4}\sum_{1\le k<\infty}k^{2(\eta-1)-2\kappa\eta}<\infty$$
and the result follows from Theorem A. □
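The convergence of the series in the last display is elementary arithmetic: since the partial variance sums grow like $k^\eta$, the $k$-th term is of order $k^{2(\eta-1)-2\kappa\eta}$, which is summable precisely when $\kappa > (2\eta-1)/(2\eta)$. A quick illustrative check ($\eta = 3/2$ is the value eventually used in the proof of Theorem 2.4):

```python
# Exponent of k in the series bounding the Theorem A condition:
# sum_k k^{2(eta-1) - 2*kappa*eta} converges iff the exponent is < -1,
# i.e. iff kappa > (2*eta - 1)/(2*eta).
def term_exponent(eta, kappa):
    return 2 * (eta - 1) - 2 * kappa * eta

eta = 1.5
threshold = (2 * eta - 1) / (2 * eta)  # = 2/3 for eta = 3/2

assert abs(term_exponent(eta, threshold) + 1) < 1e-12  # borderline exponent -1
assert term_exponent(eta, threshold + 0.01) < -1       # summable series
assert term_exponent(eta, threshold - 0.01) > -1       # divergent series
```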
Lemma 5.11 If the conditions of Theorem 2.4 are satisfied, then there is a Wiener process $\{W_2(t): t\ge 0\}$ such that
$$\sum_{1\le k\le N}Y_k-W_2(r^*(N))=o\bigl(r^*(N)^{(\kappa^*+1)/4}\log r^*(N)\bigr)\qquad\text{a.s.}$$
for any $(2\tau+1)/(2(\tau+1))<\kappa^*<1$.
Proof. Following the proof of Lemma 5.10, we have $EY_k^2\ge C_{11,1}k^\tau$ and $EY_k^4\le C_{11,2}k^{2\tau}$. Hence,
$$\sum_{1\le k<\infty}\Bigl(\sum_{1\le i\le k}EY_i^2\Bigr)^{-2\kappa^*}EY_k^4\le C_{11,3}\sum_{1\le k<\infty}k^{2\tau-2(\tau+1)\kappa^*}<\infty,$$
so the assertion follows from Theorem A. □
The next lemma gives an upper bound for the increments of $\sum_{i\le n}(\tilde y_i^2-E\tilde y_i^2)$.
Lemma 5.12 If the conditions of Theorem 2.4 are satisfied, then
$$\max_{k^\eta\le j\le (k+1)^\eta}\Bigl|\sum_{j\le i\le (k+1)^\eta}\tilde\xi_i\Bigr|=O\bigl(r(k)^{(\kappa+1)/4}\log r(k)\bigr)\qquad\text{a.s.},$$
where $\kappa$ is specified in Lemma 5.10.
Proof. On account of Lemma 5.9 we can apply Theorem 12.2 of Billingsley (1968, p. 94), resulting in
$$P\Bigl\{\max_{k^\eta\le j\le (k+1)^\eta}\Bigl|\sum_{j\le i\le (k+1)^\eta}\tilde\xi_i\Bigr|\ge\lambda\Bigr\}\le\frac{C_{12,1}}{\lambda^4}\bigl((k+1)^\eta-k^\eta\bigr)^2\le\frac{C_{12,2}}{\lambda^4}\,k^{2(\eta-1)}.$$
So,
$$P\Bigl\{\max_{k^\eta\le j\le (k+1)^\eta}\Bigl|\sum_{j\le i\le (k+1)^\eta}\tilde\xi_i\Bigr|\ge k^{\eta(\kappa+1)/4}\log k\Bigr\}\le C_{12,2}\,k^{2(\eta-1)-\eta(\kappa+1)}.$$
Since $(\eta-1)/\eta<(2\eta-1)/(2\eta)<\kappa$, we obtain that
$$\sum_{1\le k<\infty}P\Bigl\{\max_{k^\eta\le j\le (k+1)^\eta}\Bigl|\sum_{j\le i\le (k+1)^\eta}\tilde\xi_i\Bigr|\ge k^{\eta(\kappa+1)/4}\log k\Bigr\}<\infty$$
and therefore using the Borel–Cantelli lemma the proof is complete. □
For any real $n \ge 1$ (not necessarily an integer), we introduce the notations
$$T_n=E\Bigl(\sum_{1\le k\le n}\xi_k\Bigr)^2\qquad\text{and}\qquad\tilde T_n=E\Bigl(\sum_{1\le k\le n}\tilde\xi_k\Bigr)^2.$$
The final lemma contains certain properties of the quantities $T_n$ and $\tilde T_n$.

Lemma 5.13 Let $N_k=(k+1)^\eta$. Then we have, with some constant $C>0$,

(i) $|\tilde T_{N_k}-r(k)|\le C\,r(k)^{1/2}r^*(k)^{1/2}$,

(ii) $|\tilde T_n-\tilde T_{N_k}|\le C\,k^{(2\eta-1)/2}$ for $N_k<n\le N_{k+1}$,

(iii) $|T_n-\tilde T_n|\le C\sqrt{n}$ for $n\ge 1$.
Proof. By Minkowski's inequality, we have $|\tilde T_{N_k}^{1/2}-r(k)^{1/2}|\le r^*(k)^{1/2}$ and thus
$$|\tilde T_{N_k}-r(k)|\le r^*(k)^{1/2}\bigl(\tilde T_{N_k}^{1/2}+r(k)^{1/2}\bigr)\le r^*(k)^{1/2}\bigl(2r(k)^{1/2}+r^*(k)^{1/2}\bigr),$$
proving (i). The proof of (ii) is similar, using Lemma 5.7. Finally, to prove (iii) we apply Minkowski's inequality again to get
$$|T_n^{1/2}-\tilde T_n^{1/2}|\le\sum_{k=1}^{n}\bigl(E[\xi_k-\tilde\xi_k]^2\bigr)^{1/2}$$
and, using the independence of $\varepsilon_k$ and $\sigma_k^2-\tilde\sigma_k^2$ and Lemma 5.5, we obtain
$$\bigl(E[\xi_k-\tilde\xi_k]^2\bigr)^{1/2}\le 2\bigl(E[\sigma_k^2-\tilde\sigma_k^2]^2\,E[\varepsilon_k^4]\bigr)^{1/2}=O\bigl(k^{-\mu/4}\bigr).$$
As $\mu$ can be chosen arbitrarily large, we get $T_n^{1/2}-\tilde T_n^{1/2}=O(1)$, which implies (iii), since $\tilde T_n=O(n)$ by Lemma 5.7. □
Finally, we are in a position to give the proof of Theorem 2.4.
Proof of Theorem 2.4. Putting together Lemmas 5.10, 5.11 and the law of the iterated logarithm for $W_2$ yields
(5.11)   $\sum_{1\le i\le (N+1)^\eta}\tilde\xi_i-W_1(r(N))=O\bigl(r(N)^{(\kappa+1)/4}\log r(N)\bigr)$   a.s.
provided the constants involved are chosen according to the side conditions
(5.12)   $\dfrac{2\eta-1}{2\eta}<\kappa<1$   and   $\eta(\kappa+1)>2(\tau+1),$
and
which assure that the sum of the Yk is of smaller order of magnitude than the remainder
term in Lemma 5.10. Observe that r(N) ≈ N η and r∗ (N) ≈ N τ +1 , where ≈ means that
the ratio of the two sides is between positive constants. Letting now St∗ = ξ˜1 + . . . ξ˜⌊t⌋ ,
t ≥ 0, we get for Nk−1 < n ≤ Nk , using Lemmas 5.12, 5.13, relation (5.11) and standard
estimates for the increments of Wiener processes (see, e.g., Theorem 1.2.1 of Csörgő and
Révész (1981, p. 30)),
$$|S_n^*-W_1(T_n)|\le|S_n^*-S_{N_k}^*|+|S_{N_k}^*-W_1(r(k))|+|W_1(r(k))-W_1(\tilde T_{N_k})|+|W_1(\tilde T_{N_k})-W_1(\tilde T_n)|+|W_1(\tilde T_n)-W_1(T_n)|$$
$$=O\bigl(r(k)^{(\kappa+1)/4}\log r(k)\bigr)+O\bigl(r(k)^{(\kappa+1)/4}\log r(k)\bigr)+O\bigl(r(k)^{1/4}r^*(k)^{1/4}\log k\bigr)+O\bigl(k^{(2\eta-1)/4}\log k\bigr)+O\bigl(n^{1/4}\log n\bigr)$$
(5.13)   $=O\bigl(n^{(\kappa+1)/4}\log n\bigr)+O\bigl(n^{(\eta+\tau+1)/(4\eta)}\log n\bigr)+O\bigl(n^{(2\eta-1)/(4\eta)}\log n\bigr)+O\bigl(n^{1/4}\log n\bigr).$
In the last step we expressed the remainder terms with $n$, using the fact that $k\approx n^{1/\eta}$. Choose now $\eta=3/2$, $\tau$ near 0 and $\kappa$ very near to, but larger than, $2/3$. In this case the relations (5.12) are satisfied and hence (5.13) yields
(5.14)   $S_n^*-W_1(T_n)=O\bigl(n^{5/12+\varepsilon}\bigr)$   a.s.
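The parameter choice can be verified by direct arithmetic; the following sketch (illustrative only, with $\tau = 0.001$ and $\kappa = 2/3 + 0.001$ standing in for "near 0" and "very near $2/3$") checks the side conditions (5.12) and that every exponent of $n$ in (5.13) is at most $5/12$ plus a small $\varepsilon$:

```python
# eta = 3/2, tau near 0, kappa slightly above 2/3, as in the text.
eta, tau, kappa = 1.5, 0.001, 2 / 3 + 0.001

# Side conditions (5.12):
assert (2 * eta - 1) / (2 * eta) < kappa < 1
assert eta * (kappa + 1) > 2 * (tau + 1)

# Exponents of n in the four remainder terms of (5.13):
exponents = [
    (kappa + 1) / 4,              # from O(n^{(kappa+1)/4} log n)
    (eta + tau + 1) / (4 * eta),  # from O(n^{(eta+tau+1)/(4 eta)} log n)
    (2 * eta - 1) / (4 * eta),    # from O(n^{(2 eta-1)/(4 eta)} log n)
    1 / 4,                        # from O(n^{1/4} log n)
]
# Every exponent is at most 5/12 + epsilon, which gives (5.14).
epsilon = 0.001
assert all(e <= 5 / 12 + epsilon for e in exponents)
```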
for any $\varepsilon>0$. Now $\{\xi_k\}$ is a stationary sequence with zero mean and covariances $E(\xi_0\xi_k)=O(k^{-\mu})$ for any $\mu>0$, from which it easily follows that
$$T_n=E\Bigl(\sum_{1\le k\le n}\xi_k\Bigr)^2=n\bar\sigma^2+O(1),$$
where $\bar\sigma^2$ is defined in (2.13). Thus, on applying Theorem 1.2.1 of Csörgő and Révész (1981, p. 30), we arrive at $W_1(T_n)=W_1(n\bar\sigma^2)+O(\log n)$ a.s., so Theorem 2.4 is readily proved on setting $W(t)=\bar\sigma^{-1}W_1(\bar\sigma^2 t)$. □
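The relation $T_n = n\bar\sigma^2 + O(1)$ rests on the standard variance identity for stationary sequences, $\operatorname{Var}(\sum_{k\le n}\xi_k) = n\gamma(0) + 2\sum_{k=1}^{n-1}(n-k)\gamma(k)$, whose correction term stays bounded once the covariances $\gamma(k)$ are summable. A toy numerical illustration, with an exponentially decaying covariance function chosen purely for the example:

```python
# Stationary zero-mean sequence with covariances gamma(k) (illustrative choice).
def gamma(k):
    return 0.5 ** k  # summable, so sigma_bar^2 = gamma(0) + 2*sum_{k>=1} gamma(k) = 3

def var_sn(n):
    """Var(S_n) = n*gamma(0) + 2*sum_{k=1}^{n-1} (n-k)*gamma(k)."""
    return n * gamma(0) + 2 * sum((n - k) * gamma(k) for k in range(1, n))

sigma_bar_sq = gamma(0) + 2 * sum(gamma(k) for k in range(1, 200))  # ~ 3

# The correction Var(S_n) - n*sigma_bar^2 is O(1): here it tends to -4.
diffs = [var_sn(n) - n * sigma_bar_sq for n in (10, 100, 1000)]
assert all(abs(d) < 5 for d in diffs)  # bounded, not growing with n
```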
References
[1] Aue, A., and Horváth, L. (2004). Delay time in sequential detection of change. Statistics and Probability Letters 67, 221–231.
[2] Aue, A., and Horváth, L. (2005). A note on the existence of a stationary augmented
GARCH process. Preprint, University of Utah.
[3] Aue, A., Horváth, L., Hušková, M., and Kokoszka, P. (2005). Change-point monitoring in linear models with conditionally heteroskedastic errors. Preprint, University of Utah.
[4] Berkes, I., Gombay, E., Horváth, L., and Kokoszka, P. (2004). Sequential change–
point detection in GARCH(p, q) models. Econometric Theory 20, 1140–1167.
[5] Berkes, I., and Horváth, L. (2001). Strong approximation for the empirical process
of a GARCH sequence. The Annals of Applied Probability 11, 789–809.
[6] Berkes, I., Horváth, L., Kokoszka, P., and Shao, Q.–M. (2005). On the almost sure
convergence of the Bartlett estimator. Periodica Mathematica Hungarica (to appear).
[7] Billingsley, P. (1968). Convergence of Probability Measures. Wiley, New York.
[8] Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31, 307–327.
[9] Bougerol, P., and Picard, N. (1992). Stationarity of GARCH processes and of some
nonnegative time series. Journal of Econometrics 52, 115–127.
[10] Brandt, A. (1986). The stochastic equation $Y_{n+1}=A_nY_n+B_n$ with stationary coefficients. Advances in Applied Probability 18, 211–220.
[11] Carrasco, M., and Chen, X. (2002). Mixing and moment properties of various
GARCH and stochastic volatility models. Econometric Theory 18, 17–39.
[12] Chow, Y.S., and Teicher, H. (1988). Probability Theory (2nd ed.). Springer, New
York.
[13] Csörgő, M., and Horváth, L. (1997). Limit Theorems in Change–Point Analysis.
Wiley, Chichester.
[14] Csörgő, M., and Révész, P. (1981). Strong Approximations in Probability and Statistics. Academic Press, New York.
[15] Darling, D.A., and Erdős, P. (1956). A limit theorem for the maximum of normalized
sums of independent random variables. Duke Mathematical Journal 23, 143–155.
[16] Ding, Z., Granger, C.W.J., and Engle, R. (1993). A long memory property of stock
market returns and a new model. Journal of Empirical Finance 1, 83–106.
[17] Duan, J.–C. (1997). Augmented GARCH(p,q) process and its diffusion limit. Journal
of Econometrics 79, 97–127.
[18] Durrett, R. (1991). Probability: Theory and Examples. Wadsworth & Brooks/Cole,
Pacific Grove, CA.
[19] Eberlein, E. (1986). On strong invariance principles under dependence assumptions.
The Annals of Probability 14, 260–270.
[20] Engle, R.F. (1982). Autoregressive conditional heteroskedasticity with estimates of
the variance of U.K. inflation. Econometrica 50, 987–1008.
[21] Engle, R., and Ng, V. (1993). Measuring and testing impact of news on volatility.
Journal of Finance 48, 1749–1778.
[22] Geweke, J. (1986). Modeling persistence of conditional variances: a comment. Econometric Reviews 5, 57–61.
[23] Giraitis, L., Kokoszka, P., Leipus, R., and Teyssiére, G. (2003). Rescaled variance
and related tests for long memory in volatility and levels. Journal of Econometrics
112, 265–294.
[24] Glosten, L., Jagannathan, R., and Runkle, D. (1993). Relationship between expected value and the volatility of the nominal excess return on stocks. Journal of Finance 48, 1779–1801.
[25] Hardy, G.H., Littlewood, J.E., and Pólya, G. (1959). Inequalities (2nd ed.). Cambridge University Press, Cambridge, U.K.
[26] Horváth, L., Kokoszka, P., and Zhang, A. (2005). Monitoring constancy of variance
in conditionally heteroskedastic time series. Econometric Theory, to appear.
[27] Horváth, L., and Steinebach, J. (2000). Testing for changes in the mean or variance
of a stochastic process under weak invariance. Journal of Statistical Planning and
Inference 91, 365–376.
[28] Kuelbs, J., and Philipp, W. (1980). Almost sure invariance principles for partial sums
of mixing B–valued random variables. The Annals of Probability 8, 1003–1036.
[29] Lo, A. (1991). Long–run memory in stock market prices. Econometrica 59, 1279–
1313.
[30] Nelson, D.B. (1990). Stationarity and persistence in the GARCH(1,1) model. Econometric Theory 6, 318–334.
[31] Nelson, D.B. (1991). Conditional heteroskedasticity in asset returns: a new approach.
Econometrica 59, 347–370.
[32] Petrov, V.V. (1995). Limit Theorems of Probability Theory. Clarendon Press, Oxford,
U.K.
[33] Révész, P. (1990). Random Walk in Random and Nonrandom Environments. World Scientific Publishing Co., Teaneck, NJ.
[34] Schwert, G.W. (1989). Why does stock market volatility change over time? Journal
of Finance 44, 1115–1153.
[35] Strassen, V. (1965). Almost sure behaviour of sums of independent random variables
and martingales. In: Proceedings of the Fifth Berkeley Symposium in Mathematical
Statistics and Probability 2, 315–344.
[36] Straumann, D., and Mikosch, T. (2005). Quasi–maximum–likelihood estimation in
heteroscedastic time series: a stochastic recurrence equations approach. The Annals
of Statistics, to appear.
[37] Taylor, S. (1986). Modelling Financial Time Series. Wiley, New York.
[38] Zakoian, J.–M. (1994). Threshold heteroskedastic models. Journal of Economic Dynamics and Control 18, 931–955.