
Useful Probability Theorems
Shiu-Tang Li
Finished: March 23, 2013
Last updated: November 2, 2013
1 Convergence in distribution
Theorem 1.1. TFAE:
(i) µn ⇒ µ, where µn, µ are probability measures.
(ii) Fn(x) → F(x) at each continuity point of F.
(iii) E[f(Xn)] → E[f(X)] for all bounded continuous functions f.
(iv) φn(x) → φ(x) uniformly on every finite interval, where φn, φ are the characteristic functions. [Breiman]
(v) φn(x) → φ̃(x) pointwise, where φ̃ is continuous at 0.
Theorem 1.2. Let Xn → c in dist, where c is a constant. Then Xn → c
in pr.
Theorem 1.3. Let Xn → X in dist, Yn → 0 in dist, Wn → a in dist,
and Zn → b in dist, a, b are constants. Then
(1) aXn + b → aX + b in dist. (Consider continuity intervals)
(2) Xn + Yn → X in dist.
(3) Xn Yn → 0 in dist.
(4) Wn Xn + Zn = aXn + b + (Wn − a)Xn + (Zn − b) → aX + b in dist.
Remark. We only need to memorize (4). (Chung)
Theorem 1.4. If Xn → X in dist, and f ∈ C(R), then f (Xn ) → f (X)
in dist. (Chung; consider bounded continuous test functions.)
Theorem 1.5. (Lindeberg-Feller; a generalization of the CLT.) For each n, let Xn,m, 1 ≤ m ≤ n, be independent random variables with EXn,m = 0. Suppose
(1) Σ_{m=1}^{n} E[Xn,m²] → σ² > 0 as n → ∞;
(2) for all ε > 0, lim_{n→∞} Σ_{m=1}^{n} E[Xn,m²; |Xn,m| > ε] = 0.
Then Sn = Xn,1 + · · · + Xn,n converges to N(0, σ²) in distribution. (Durrett)
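As a numerical sanity check (not part of the original notes), the i.i.d. special case Xn,m = Xm/√n of Lindeberg-Feller can be simulated; the choice of Uniform[−1, 1] increments, the sample sizes, and the tolerance below are arbitrary illustrative choices.

```python
import math
import random

# Monte Carlo sketch of the i.i.d. special case of Lindeberg-Feller:
# with X_{n,m} = X_m / sqrt(n), S_n should be approximately N(0, sigma^2).
random.seed(0)
n, trials = 400, 2000
sigma = math.sqrt(1.0 / 3.0)   # Var of Uniform[-1, 1] is 1/3
samples = [
    sum(random.uniform(-1.0, 1.0) for _ in range(n)) / math.sqrt(n)
    for _ in range(trials)
]

# Compare the empirical P(S_n <= 0.5) with the N(0, 1/3) cdf at 0.5.
empirical = sum(1 for s in samples if s <= 0.5) / trials
target = 0.5 * (1.0 + math.erf(0.5 / (sigma * math.sqrt(2.0))))
print(abs(empirical - target) < 0.05)
```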
2 Convergence in probability
Theorem 2.1. (Chebyshev's inequality.) If φ is a strictly positive and increasing function on (0, ∞), φ(u) = φ(−u), and X is an r.v. s.t. E[φ(X)] < ∞, then for each u > 0 we have P(|X| ≥ u) ≤ E[φ(X)]/φ(u). (Chung)
Theorem 2.2. (Useful lower bound) Let X ≥ 0, E[X²] < ∞, and 0 ≤ a < E[X]. We have P(X > a) ≥ (EX − a)²/E[X²]. (Durrett)
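A quick deterministic check of this lower bound (the discrete distribution and the level a below are arbitrary example choices, not from the notes):

```python
# Check P(X > a) >= (EX - a)^2 / E[X^2] for X taking values 0, 1, 2
# with probabilities 0.5, 0.3, 0.2 (an arbitrary example distribution).
values = [0.0, 1.0, 2.0]
probs = [0.5, 0.3, 0.2]
a = 0.5   # must satisfy 0 <= a < EX

mean = sum(v * p for v, p in zip(values, probs))        # EX = 0.7
second = sum(v * v * p for v, p in zip(values, probs))  # E[X^2] = 1.1
tail = sum(p for v, p in zip(values, probs) if v > a)   # P(X > a) = 0.5
bound = (mean - a) ** 2 / second

print(tail >= bound)
```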
Theorem 2.3. Let Xn → X in pr, Yn → Y in pr, f ∈ C(R). Then
f (Xn ) → f (X) in pr, Xn ± Yn → X ± Y in pr, and Xn Yn → XY in pr.
Theorem 2.4. Xn → X in pr. if and only if for every subsequence {Xnj} of {Xn}, we may pick a further subsequence {Xnjk} so that Xnjk → X a.s. (Durrett; Theorem 1.6.2)
3 Convergence a.s.
Theorem 3.1. (Borel-Cantelli Lemma.)
Theorem 3.2. (Kolmogorov's maximal inequality) Suppose Sn = X1 + · · · + Xn, where the Xj's are independent and in L^2(P). Then for all λ > 0 and n ≥ 1, P(max_{1≤k≤n} |Sk − ESk| > λ) ≤ Var(Sn)/λ².
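A Monte Carlo illustration of this inequality (not from the notes) for a symmetric ±1 random walk; the number of steps, trials, and the threshold λ are arbitrary choices.

```python
import random

# For +-1 steps, ES_k = 0 and Var(S_n) = n, so the bound is n / lambda^2.
random.seed(2)
n, trials, lam = 50, 4000, 10.0
exceed = 0
for _ in range(trials):
    s, peak = 0.0, 0.0
    for _ in range(n):
        s += random.choice((-1.0, 1.0))
        peak = max(peak, abs(s))
    if peak > lam:
        exceed += 1

freq = exceed / trials   # empirical P(max_k |S_k| > lambda)
print(freq <= n / lam ** 2)
```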
Theorem 3.3. (Kronecker's Lemma.) Let {xk} be a sequence of real numbers and {ak} a positive sequence increasing to infinity. Then Σ_{n=1}^{∞} xn/an converges ⇒ (1/an) Σ_{j=1}^{n} xj → 0. (Chung)
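A deterministic illustration of Kronecker's lemma (an arbitrary example, not from the notes): take xn = (−1)^{n+1} and an = n, so Σ xn/an is the alternating harmonic series.

```python
import math

# sum x_n / a_n converges (to ln 2), so (1/a_n)(x_1 + ... + x_n) -> 0.
N = 10000
series = 0.0   # partial sum of x_n / a_n
total = 0.0    # x_1 + ... + x_N
for n in range(1, N + 1):
    x = 1.0 if n % 2 == 1 else -1.0
    series += x / n
    total += x

print(abs(series - math.log(2.0)) < 1e-3)   # series near ln 2
print(abs(total / N) < 1e-3)                # average tends to 0
```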
Theorem 3.4. Let {Xn} be a sequence of independent r.v.'s with E[Xn] = 0 for every n, and 0 < an ↑ ∞. If φ ∈ C(R) is chosen s.t. φ is positive and even, φ(x)/|x| ↑ and φ(x)/x² ↓ as |x| increases, and Σ_{n=1}^{∞} E[φ(Xn)]/φ(an) converges, then Σ_{n=1}^{∞} Xn/an converges a.s. (Chung; how to use this?)
Theorem 3.5. Let {Xn} be i.i.d. and p > 0. Then TFAE: (exercise 6.10 of Davar)
(1) E|Xn|^p < ∞.
(2) Xn ∈ o(n^{1/p}) a.s.
(3) max_{1≤j≤n} |Xj| ∈ o(n^{1/p}) a.s.
Theorem 3.6. If {Xn} are i.i.d. and in L^1(P), then lim_{n→∞} (1/ln(n)) Σ_{j=1}^{n} Xj/j = EX1 a.s. (exercise 6.29 of Davar)
Theorem 3.7. Let {Xn} be i.i.d. with EXn = 0, Sn := X1 + · · · + Xn, and assume that E|Xn|^p < ∞. Then: (Durrett)
(1) When p = 1, Sn/n → 0 a.s. (SLLN)
(2) When 1 < p ≤ 2, Sn/n^{1/p} → 0 a.s.
(3) When p = 2, Sn/(n^{1/2}(log n)^{1/2+ε}) → 0 a.s. for all ε > 0.
Remark. When E|Xn|² = 1, we have lim sup_{n→∞} Sn/(2n log(log(n)))^{1/2} = 1 a.s. (Durrett)
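A Monte Carlo sketch of these rates (not from the notes), using centered Exp(1) increments so that EX = 0 and all moments are finite; the sample size and the deliberately loose tolerances are illustrative choices.

```python
import math
import random

# S_n / n should be near 0 (SLLN), and S_n should be far below the
# p = 2 envelope n^{1/2} (log n)^{1/2 + eps} with eps = 1/2.
random.seed(1)
n = 100000
s = sum(random.expovariate(1.0) - 1.0 for _ in range(n))

print(abs(s / n) < 0.05)                         # (1): S_n / n -> 0
print(abs(s) / (n ** 0.5 * math.log(n)) < 1.0)   # (3) with eps = 1/2
```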
Theorem 3.8. Let X1 , X2 , · · · be i.i.d. mean 0, variance σ 2 random
variables, and put Sn := X1 + · · · + Xn for each n ≥ 1. Then we have
lim sup_n Sn = ∞ a.s. (U of Utah spring 2008 qualifying exam; may use the CLT and Kolmogorov's 0-1 law to prove it.)
Theorem 3.9. Let X1, X2, · · · be i.i.d. with E[|Xn|] = ∞. Then lim sup_n |Sn|/n = ∞ a.s. (Chung; hint: prove P(|Xn| > n i.o.) = 1, and then P(|Sn| > n i.o.) = 1.)
4 Random Series
Theorem 4.1. (Kolmogorov's three-series theorem) Let X1, X2, · · · be independent. Let A > 0 and Yn = Xn 1{|Xn|≤A}. Then Σ_{n=1}^{∞} Xn converges a.s. if and only if all the following three conditions hold:
(1) Σ_{n=1}^{∞} P(|Xn| > A) < ∞.
(2) Σ_{n=1}^{∞} E[Yn] converges.
(3) Σ_{n=1}^{∞} Var(Yn) < ∞.
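A deterministic check of the three series for a standard example (not from the notes): Xn = εn/n with P(εn = ±1) = 1/2 and A = 1, so Yn = Xn for every n and the theorem predicts a.s. convergence of Σ Xn.

```python
import math

# Partial sums of the three series up to an arbitrary cutoff N.
N = 10000
A = 1.0
series1 = sum(1.0 for n in range(1, N + 1) if 1.0 / n > A)   # P(|X_n| > A)
series2 = sum(0.0 for n in range(1, N + 1))                  # E[Y_n] = 0
series3 = sum(1.0 / n ** 2 for n in range(1, N + 1))         # Var(Y_n)

print(series1 == 0.0)              # series (1) vanishes: |X_n| <= 1
print(series2 == 0.0)              # series (2) converges (to 0)
print(series3 < math.pi ** 2 / 6)  # series (3) bounded by pi^2 / 6
```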
Theorem 4.2. (Lévy's theorem) If X1, X2, · · · , Xn, · · · is a sequence of independent random variables and Sn := X1 + X2 + · · · + Xn, then TFAE: (Varadhan)
(1) Sn converges weakly to some r.v. S.
(2) Sn converges in pr. to some r.v. S.
(3) Sn converges a.s. to some r.v. S.
Remark. For (2) ⇒ (3), we may refer to Theorem 5.3.4 of Chung.
5 Uniform Integrability
Theorem 5.1. The family {Xα } is uniformly integrable if and only if
the following two conditions are satisfied: (Chung)
(1) supα E[|Xα |] < ∞.
(2) For every ε > 0, there exists some δ(ε) > 0 s.t. for any E with P(E) < δ(ε), we have ∫_E |Xα| dP < ε for every α.
Theorem 5.2. Let Xn → X in dist., and suppose for some p > 0 we have sup_n E|Xn|^p < ∞. We then have E|Xn|^r → E|X|^r for every r < p. (Chung; it provides us with a criterion to check uniform integrability; see Theorem 5.4.)
Theorem 5.3. If sup_n E|Xn|^p < ∞ for some p > 1, then {Xn} is u.i. (Consider E[|Xn|; |Xn| > A] ≤ (E|Xn|^p)^{1/p} (E|Xn|^p / A^p)^{1/q}.) (Chung)
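A deterministic check of this Hölder-type bound (the discrete distribution and the cutoff A are arbitrary example choices, not from the notes), with p = q = 2:

```python
# Verify E[|X|; |X| > A] <= (E|X|^p)^{1/p} (E|X|^p / A^p)^{1/q}
# for a small discrete r.v.
values = [0.5, 2.0, 4.0]
probs = [0.6, 0.3, 0.1]
A, p, q = 1.0, 2.0, 2.0

lhs = sum(v * pr for v, pr in zip(values, probs) if v > A)   # E[|X|; |X| > A]
moment_p = sum(v ** p * pr for v, pr in zip(values, probs))  # E|X|^p
rhs = moment_p ** (1.0 / p) * (moment_p / A ** p) ** (1.0 / q)

print(lhs <= rhs)
```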
Theorem 5.4. If {Xn} is dominated by some Y ∈ L^p and converges in dist. to X, then E|Xn|^p → E|X|^p. (Chung; to prove it, first show that E|X|^p < ∞, otherwise we may pick A large so that E[|X|^p ∧ A] > E|Y|^p; also use the convergence E[|Xn|^p ∧ A] → E[|X|^p ∧ A] for every finite A > 0. This theorem can also be used to check uniform integrability; see Theorem 5.6.)
Theorem 5.5. Let sup_n |Xn| ∈ L^p and Xn → X a.s. Then X ∈ L^p (by Fatou) and Xn → X in L^p (by the DCT). (Chung)
Theorem 5.6. Let 0 < p < ∞, Xn ∈ L^p, and Xn → X in pr. Then TFAE: (Chung)
(1) {|Xn|^p} is u.i.
(2) Xn → X in L^p.
(3) E|Xn|^p → E|X|^p < ∞.
Theorem 5.7. Given (Ω, F, P ), {E[X|G] : G is a σ-algebra, G ⊂ F } is
u.i. (Durrett; prove it with Theorem 5.1)
Theorem 5.8. (Dunford-Pettis) Let (X, Σ, µ) be any measure space and A a subset of L^1 = L^1(µ). Then A is uniformly integrable iff it is relatively compact in L^1 for the weak topology of L^1. (Measure Theory II by D. H. Fremlin)
6 Martingales

6.1 General submartingales
Theorem 6.1. (Producing submartingales with submartingales) If {Xn} is a martingale and φ is convex, then {φ(Xn)} is a submartingale, provided that φ(Xn) ∈ L^1(P) for all n. If {Xn} is a submartingale and φ is a nondecreasing convex function, then {φ(Xn)} is a submartingale, provided that φ(Xn) ∈ L^1(P) for all n. (Davar)
Theorem 6.2. (Doob's Decomposition) Any submartingale {Xn} can be written as Xn = Yn + Zn, where {Yn} is a martingale and {Zn} is a nonnegative previsible a.s. increasing process with Zn ∈ L^1(P) for all n. (Davar)
Remark. By the inequalities EZn ≤ E|Xn| − EY1 and E|Yn| ≤ E|Xn| + EZn, we have that {Xn} is L^1-bounded if and only if both {Yn} and {Zn} are L^1-bounded; and since E[limn Zn] < ∞ in that case, {Xn} is u.i. if and only if both {Yn} and {Zn} are u.i. (Chung)
Theorem 6.3. (Doob's maximal inequality) If {Xn} is a submartingale, then for all λ > 0 and n ≥ 1 we have (Davar)
(1) λP(max_{1≤j≤n} Xj ≥ λ) ≤ E[Xn; max_{1≤j≤n} Xj ≥ λ] ≤ E[Xn^+].
(2) λP(min_{1≤j≤n} Xj ≤ −λ) ≤ E[Xn^+] − EX1.
(3) Combining (1) and (2), we have λP(max_{1≤j≤n} |Xj| ≥ λ) ≤ 2E|Xn| − EX1.
Remarks. (1) If {Xn} is a martingale, then P(max_{1≤j≤n} |Xj| ≥ λ) ≤ E|Xn|^p/λ^p for p ≥ 1 and λ > 0. This is because |Xn|^p is a submartingale. (2) We may take p = 2 and replace Xn with Sn − ESn in (1) to obtain Kolmogorov's maximal inequality.
6.2 Good submartingales (a.s. convergence)
Theorem 6.4. (The Martingale Convergence Theorem) Let {Xn} be an L^1-bounded submartingale. Then {Xn} converges a.s. to a finite limit. (Chung)
Remarks. (1) As a corollary, every nonnegative supermartingale and every nonpositive submartingale converges a.s. to a finite limit. (2) It suffices to assume sup_n EXn^+ < ∞ (⇔ sup_n E|Xn| < ∞). This is due to E|Xn| = 2EXn^+ − EXn ≤ 2EXn^+ − EX1. (3) The converse does not hold. For example, we may keep tossing a fair coin and bet 10^n dollars on the n-th round, stopping when our money becomes negative.
Theorem 6.5. (Krickeberg's Decomposition) Suppose {Xn} is a submartingale which is bounded in L^1(P). Then we can write Xn = Yn − Zn, where {Yn} is a martingale and {Zn} is a nonnegative supermartingale.
Remarks. (1) By the inequalities E|Zn| = |EZn| ≤ E|Xn| + |EY1| and E|Yn| ≤ E|Xn| + E|Zn| = E|Xn| + EZn, we have that both {Yn} and {Zn} are L^1-bounded. (Davar) (2) Unlike Theorem 6.2, uniform integrability of {Xn} does not imply that both {Yn} and {Zn} are u.i. To see this, we may simply take {Xn} to be a nonpositive submartingale which is not u.i.
6.3 Better submartingales (L^1 convergence or L^p convergence, p > 1)
Theorem 6.6. For a submartingale, TFAE: (Durrett)
(1) It is u.i.
(2) It converges a.s. and in L^1.
(3) It converges in L^1.
Remark. Say it converges to X∞; X∞ can be regarded as "the last term" of the original submartingale.
Theorem 6.7. For a martingale, TFAE: (Durrett)
(1) It is u.i.
(2) It converges a.s. and in L^1.
(3) It converges in L^1.
(4) There exists X with E|X| < ∞ so that Xn = E[X|Fn].
Theorem 6.8. If Fn ↑ F, then E[X|Fn] → E[X|F] a.s. and in L^1. (Durrett)
Remark. For a generalization, see Chung Theorem 9.4.8.
Theorem 6.9. (L^p maximal inequality) If {Xn} is a nonnegative submartingale s.t. sup_n E[Xn^p] < ∞, then for 1 < p < ∞, E[(max_{1≤j≤n} Xj)^p] ≤ (p/(p−1))^p E[Xn^p]. (Davar; Chung)
Remark. That is, if a nonnegative submartingale is L^p-bounded, then it is dominated by an L^p function. This inequality also holds if we replace Xn with |Xn| and submartingales with martingales.
Theorem 6.10. (L^1 maximal inequality) If {Xn} is a submartingale, then E[max_{1≤j≤n} Xj^+] ≤ (1 − 1/e)^{−1} (1 + E[Xn^+ max(log(Xn^+), 0)]). (Durrett)
Remark. By Theorem 5.5 and this theorem, we have another sufficient condition for L^1 convergence of martingales.
Theorem 6.11. (L^p convergence theorem for martingales or nonnegative submartingales) If {Xn} is a nonnegative submartingale or a martingale with sup_n E|Xn|^p < ∞, p > 1, then Xn → X a.s. and in L^p. (Durrett)
Remark. For a proof, see Durrett Theorem 4.4.5 or Varadhan Theorem 5.6; we may also use (a) Theorems 5.5 + 6.9 or (b) Theorems 5.4 + 5.6 + 6.9 to prove it.
Theorem 6.12. (L^p convergence theorem for submartingales) If {Xn} is a submartingale with sup_n E|Xn|^p < ∞, p > 1, then Xn → X a.s. and in L^r for every 1 ≤ r < p. (A combination of various theorems.)
Remarks. We may use Theorems 5.2 + 5.6 + 6.4 (the martingale convergence theorem) to prove it. (Here we don't use Theorem 6.9.)
7 Stopping times
Theorem 7.1. ("generalized Optional Stopping Theorem") If L ≤ M are stopping times and {YM∧n} is a uniformly integrable submartingale, then EYL ≤ EYM and YL ≤ E[YM|FL]. (Durrett)
Remarks. (1) If L ≤ M are two bounded stopping times (say, L ≤ M ≤ k), then |YM∧n| ≤ |Y1| + · · · + |Yk|, which means {YM∧n} is u.i. This case is usually called the Optional Stopping Theorem. (2) There are two sufficient conditions for {YM∧n} to be u.i. (Durrett): (a) {Yn} is a u.i. submartingale. (b) E|YM| < ∞ and {Yn 1{M>n}} is u.i.
8 Truncation
Theorem 8.1. (L^1-truncation) Let X1, X2, · · · be i.i.d. with E|X1| < ∞, and let Yn = Xn 1{|Xn|≤n}. Then P(Xn ≠ Yn i.o.) = 0.
Theorem 8.2. (L^p-truncation, p > 1) Let X1, X2, · · · be i.i.d. with E[|X1|^p] < ∞, and let Yn = Xn 1{|Xn|≤n^{1/p}}. Then P(Xn ≠ Yn i.o.) = 0.
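A numerical companion to Theorem 8.1 (an example, not from the notes): for i.i.d. Exp(1) variables, P(|Xn| > n) = e^{−n}, and the sum of these tail probabilities is a convergent geometric series, so Borel-Cantelli gives P(Xn ≠ Yn i.o.) = 0.

```python
import math

# Partial sum of e^{-n}; the full series equals 1 / (e - 1) < infinity.
N = 50
tail_sum = sum(math.exp(-n) for n in range(1, N + 1))
print(abs(tail_sum - 1.0 / (math.e - 1.0)) < 1e-12)
```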
9 Examples of Martingales
Example 9.1 Let X1, X2, · · · be independent and EXn = 0 for all n ≥ 1. Then Sn := X1 + · · · + Xn is a martingale.
Example 9.2 Let X1 , X2 , · · · be independent, EXn = 1 for all n ≥ 1,
and Yn := X1 × · · · × Xn . Then {Yn } is a martingale.
Example 9.3 (Related to densities: likelihood ratios) Suppose f and g are two strictly positive probability density functions on R. If {Xi}_{i=1}^{∞} are i.i.d. random variables with probability density f, then ∏_{j=1}^{n} [g(Xj)/f(Xj)] defines a mean-one martingale. (Davar)
Example 9.4 Let X1, X2, · · · be independent with E|Xn|² < ∞ and EXn = 0 for all n ≥ 1, and Sn := X1 + · · · + Xn. Then {Sn² − Σ_{j=1}^{n} E[Xj²]} is a martingale.
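A Monte Carlo sketch of this example (not from the notes) with Xj = ±1, so E[Xj²] = 1 and Mn = Sn² − n should have mean 0 for every n; the sizes and tolerance below are illustrative choices.

```python
import random

# Average of S_n^2 - n over many trials should be close to E[M_n] = 0.
random.seed(3)
n, trials = 30, 20000
total = 0.0
for _ in range(trials):
    s = sum(random.choice((-1.0, 1.0)) for _ in range(n))
    total += s * s - n

print(abs(total / trials) < 1.5)
```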
Example 9.5 If Xn and Yn are submartingales w.r.t. Fn then Xn ∨ Yn
also is. (Durrett)
Example 9.6 Suppose E|Xn| < ∞ for all n ≥ 1, and E[Xn+1|X1, · · · , Xn] = (X1 + · · · + Xn)/n. Then {(X1 + · · · + Xn)/n, σ(X1, · · · , Xn)} is a martingale. (Chung)
Example 9.7 (Martingales induced from Markov chains.) Let {Xn} be a Markov chain with transition operator P, and let f = P f (that is, f is P-harmonic). Then {f(Xn)} is a martingale. Note that if {Xn} is recurrent then f is trivial. (Karlin)
Example 9.8 (Martingales induced from Brownian Motions (I)) exp(θBt − θ²t/2) is a martingale for all θ ∈ R. (Durrett)
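A Monte Carlo check of this exponential martingale at a single time t (not from the notes): since Bt ~ N(0, t), E[exp(θBt − θ²t/2)] = 1. The values of θ, t, and the sample size are illustrative choices.

```python
import math
import random

# Sample B_t ~ N(0, t) and average the exponential martingale; the
# average should be close to 1.
random.seed(4)
theta, t, trials = 0.5, 2.0, 50000
acc = 0.0
for _ in range(trials):
    b = random.gauss(0.0, math.sqrt(t))
    acc += math.exp(theta * b - theta ** 2 * t / 2.0)

print(abs(acc / trials - 1.0) < 0.05)
```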
Example 9.9 (Martingales induced from Brownian Motions (II)) Bt, Bt² − t, Bt³ − 3tBt, Bt⁴ − 6tBt² + 3t², · · · are all martingales. (Durrett)
Remark. When we’re given a 1-parameter family of martingales, we
may differentiate with respect to that parameter and see what happens.
Example 9.10 Let Y1, Y2, · · · be independent random variables. Define X0 = 0 and Xn = Σ_{j=1}^{n} (Yj − E[Yj |Y1, · · · , Yj−1]) for n ≥ 1. Then {Xn} is a martingale. (A Probability Path; Resnick)
10 References
Main reference: Chung, Durrett, Davar.
Other references: Varadhan, Shiryaev, Resnick, Karatzas, Karlin, Breiman.