# Probability: Theory and Examples

Adam Bowditch

March 29, 2014

## Contents

1. **Brownian Motion**
   1.1 Introduction
   1.2 Path Regularity of Brownian Motion
   1.3 Donsker's Theorem
   1.4 Martingales
   1.5 Strong Markov Property and the Reflection Principle
2. **Levy Processes**
   2.1 Infinitely Divisible Distributions
3. **Markov Processes**
   3.1 Random Conductance Model
   3.2 Heat Kernel Estimates
   3.3 Green Densities
## 1 Brownian Motion

### 1.1 Introduction
**Definition 1.1.** Let $(M, \mathcal{M}, \mu)$ be a measure space where $\mu$ is a $\sigma$-finite, non-atomic measure. A *white noise* is a collection of random variables $\{\langle \eta, \varphi \rangle\}_{\varphi \in L^2(M,\mathcal{M},\mu)}$ such that for all $\varphi \in L^2(M,\mathcal{M},\mu)$ the variable $\langle \eta, \varphi \rangle$ is a centred Gaussian random variable, and for all $\varphi_1, \varphi_2 \in L^2(M,\mathcal{M},\mu)$
$$\mathbb{E}[\langle \eta, \varphi_1 \rangle \langle \eta, \varphi_2 \rangle] = \int \varphi_1(x)\varphi_2(x)\,\mu(dx).$$

Intuitively, we want $\eta$ to be some random Gaussian function such that $\eta(x) \perp \eta(y)$ for $x \neq y$. We can construct this using an approximation. For $N \geq 1$ let $I_j := [j/N, (j+1)/N)$, take $\xi_j^N \overset{\text{i.i.d.}}{\sim} N(0, \sigma_N^2)$, and set
$$\eta_N(x) = \sum_{j=0}^{N-1} \xi_j^N \chi_{I_j}(x).$$
For smooth $\varphi_1, \varphi_2 : [0,1] \to \mathbb{R}$ we have
$$\langle \eta_N, \varphi_1 \rangle = \int_0^1 \varphi_1(x)\eta_N(x)\,dx = \sum_{j=0}^{N-1} \xi_j^N \int_{j/N}^{(j+1)/N} \varphi_1(x)\,dx$$
and hence
$$\begin{aligned}
\mathbb{E}[\langle \eta_N, \varphi_1 \rangle \langle \eta_N, \varphi_2 \rangle] &= \sum_{j=0}^{N-1}\sum_{k=0}^{N-1} \int_{j/N}^{(j+1)/N} \varphi_1(x)\,dx \int_{k/N}^{(k+1)/N} \varphi_2(x)\,dx\ \mathbb{E}[\xi_j^N \xi_k^N]\\
&= \sigma_N^2 \sum_{j=0}^{N-1} \int_{j/N}^{(j+1)/N} \varphi_1(x)\,dx \int_{j/N}^{(j+1)/N} \varphi_2(x)\,dx,
\end{aligned}$$
which converges to $\int_0^1 \varphi_1(x)\varphi_2(x)\,dx$ if $\sigma_N^2 = N$.
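As a numerical sanity check (an illustration, not part of the notes): with $\sigma_N^2 = N$ the covariance equals $N\sum_j\big(\int_{I_j}\varphi_1\big)\big(\int_{I_j}\varphi_2\big)$, which can be computed by quadrature and compared against $\int_0^1\varphi_1\varphi_2\,dx$. The test functions and grid sizes below are arbitrary choices.

```python
import numpy as np

def white_noise_covariance(phi1, phi2, N, grid=10000):
    """Covariance E[<eta_N,phi1><eta_N,phi2>] = N * sum_j (int_{I_j} phi1)(int_{I_j} phi2),
    with the cell integrals over I_j = [j/N,(j+1)/N) computed by left-endpoint quadrature."""
    x = np.linspace(0.0, 1.0, grid, endpoint=False)     # left endpoints of quadrature panels
    cells = np.minimum((x * N).astype(int), N - 1)      # which cell I_j each panel belongs to
    w1 = phi1(x) / grid                                 # quadrature contributions of phi1
    w2 = phi2(x) / grid
    int1 = np.bincount(cells, weights=w1, minlength=N)  # cell integrals of phi1
    int2 = np.bincount(cells, weights=w2, minlength=N)  # cell integrals of phi2
    return N * float(np.dot(int1, int2))

cov = white_noise_covariance(np.sin, np.cos, N=200)
target = (1 - np.cos(2.0)) / 4   # int_0^1 sin(x)cos(x) dx
```

As $N$ grows the cell averages localise and the covariance approaches the $L^2$ inner product, matching the computation above.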
**Remark 1.1.**

1. It is possible to view white noise as a random distribution, in which case $\eta$ is a random distribution in $H^{-\frac{n}{2}-\tau}$ for $\tau > 0$.
2. If $\eta$ were a random function with $\mathbb{E}[\eta(x)\eta(y)] = C(x,y)$ then
$$\mathbb{E}[\langle \eta, \varphi_1 \rangle \langle \eta, \varphi_2 \rangle] = \int\int \varphi_1(x)\varphi_2(y)C(x,y)\,dx\,dy,$$
hence heuristically $\mathbb{E}[\eta(x)\eta(y)] = \delta_{x,y}$, a Dirac delta at $x = y$.
3. For approximation $N$ the joint density of the random function is proportional to
$$\exp\Big(-\frac{1}{2}\sum_{k=0}^{N-1}\frac{\eta(k/N)^2}{\sigma_N^2}\Big)\prod_{k=0}^{N-1}d\eta\big(\tfrac{k}{N}\big).$$
**Definition 1.2.** A *Brownian motion* on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ is a random function $(B_t)_{t\in\mathbb{R}_+}$ such that

1. $B_t$ is a centred Gaussian random variable;
2. $\mathbb{E}[B_t B_s] = t \wedge s$ for any $s,t\in\mathbb{R}_+$;
3. $t \mapsto B_t$ is continuous.

**Lemma 1.1.** Given that $B_t$ is a centred Gaussian random variable, the second condition of the definition is equivalent to $B_t$ having independent increments with $\mathbb{E}[(B_t-B_s)^2] = t-s$.

**Corollary 1.1.** Let $B_t$ be a Brownian motion; then so are

1. $\lambda B_{\lambda^{-2}t}$ for $\lambda \neq 0$;
2. $B_{t+s}-B_s$ for $s > 0$;
3. $tB_{1/t}$ (extended by $0$ at $t=0$).
**Definition 1.3.**
$$H^n_k(t) := 2^{n/2}\chi_{\left(\frac{k-1}{2^n},\frac{k}{2^n}\right]}(t) - 2^{n/2}\chi_{\left(\frac{k}{2^n},\frac{k+1}{2^n}\right]}(t)$$
is called a *Haar wavelet*.

**Lemma 1.2.** The Haar wavelets form an orthonormal basis of $L^2[0,1]$.

**Definition 1.4.**
$$S^n_k(t) := \int_0^t H^n_k(s)\,ds$$
is called a *Schauder function*.

**Lemma 1.3.** Let $X^N$ be Gaussian vectors in $\mathbb{R}^n$ converging in distribution to $X$; then $X$ is Gaussian.
**Theorem 1.1.** A Brownian motion exists (and is $C^\alpha$ for every $\alpha < 1/2$).

*Proof.* Let $D_n := \{k/2^n : k\in[0,2^n]\cap\mathbb{N}\}$, denote $I_n := D_n\setminus D_{n-1}$, and define $\{\xi_j^{(n)}\}_{j\in I_n}\overset{\text{i.i.d.}}{\sim}N(0,1)$ independent of $\{\xi_k^{(m)}\}_{k\in I_m}$ for $m<n$. Define the Haar wavelets
$$H^{(n)}_{k/2^n}(t) = 2^{n/2}\chi_{\left(\frac{k-1}{2^n},\frac{k}{2^n}\right]}(t) - 2^{n/2}\chi_{\left(\frac{k}{2^n},\frac{k+1}{2^n}\right]}(t)$$
and the Schauder functions
$$S^{(n)}_{k/2^n}(t) = \int_0^t H^{(n)}_{k/2^n}(s)\,ds.$$
Let $b_n = \sup_{j\in I_n}|\xi_j^{(n)}|$; then for all $n\in\mathbb{N}$, $j\in I_n$ and $x>0$ we have
$$\mathbb{P}(\xi_j^{(n)}>x) = \int_x^\infty\frac{e^{-y^2/2}}{\sqrt{2\pi}}\,dy \leq \frac{1}{x\sqrt{2\pi}}\int_x^\infty ye^{-y^2/2}\,dy = \frac{1}{x\sqrt{2\pi}}\Big[-e^{-y^2/2}\Big]_x^\infty = \frac{e^{-x^2/2}}{x\sqrt{2\pi}},$$
so that
$$\mathbb{P}(|\xi_j^{(n)}|>x) = 2\,\mathbb{P}(\xi_j^{(n)}>x) \leq \sqrt{\frac{2}{\pi}}\,\frac{e^{-x^2/2}}{x}$$
and
$$\mathbb{P}\Big(\sup_{j\in I_n}|\xi_j^{(n)}|>x\Big) \leq \sum_{j\in I_n}\mathbb{P}(|\xi_j^{(n)}|>x) \leq 2^{n-1}\sqrt{\frac{2}{\pi}}\,\frac{e^{-x^2/2}}{x}.$$
Choosing $x=n$ gives
$$\mathbb{P}\Big(\sup_{j\in I_n}|\xi_j^{(n)}|>n\Big) \leq \frac{2^n e^{-n^2/2}}{n\sqrt{2\pi}}.$$
Since
$$\sum_{n=1}^\infty\frac{2^n e^{-n^2/2}}{n\sqrt{2\pi}} < \infty,$$
Borel–Cantelli gives us that
$$\mathbb{P}\Big(\bigcap_{n=1}^\infty\bigcup_{m=n}^\infty\Big\{\sup_{j\in I_m}|\xi_j^{(m)}|>m\Big\}\Big) = 0,$$
i.e. almost surely only finitely many such exceedances occur, so for almost every $\omega\in\Omega$ there exists $n(\omega)$ such that $b_m\leq m$ for all $m>n(\omega)$.
We define the approximation sequence
$$B_t^{(n)} := \sum_{m=0}^n\sum_{j\in I_m}\xi_j^{(m)}S_j^{(m)}(t).$$
Fix $\alpha\in(0,1/2)$; we want to show that $B_t^{(n)}$ is uniformly $C^\alpha$, i.e. that there exists $C>0$ (independent of $n$) such that almost surely
$$|B_{t+s}^{(n)}-B_t^{(n)}| = \Big|\sum_{m=0}^n\sum_{j\in I_m}\xi_j^{(m)}\big(S_j^{(m)}(t+s)-S_j^{(m)}(t)\big)\Big| \leq C|s|^\alpha.$$
We split into cases.

1. For $|s|<2^{-m}$: the maximum gradient of the Schauder function $S_j^{(m)}$ is $2^{m/2}$, hence
$$|S_j^{(m)}(t+s)-S_j^{(m)}(t)| \leq |s|\,2^{m/2} = |s|^\alpha|s|^{1-\alpha}2^{m/2} \leq |s|^\alpha 2^{-m(1-\alpha)}2^{m/2} = |s|^\alpha 2^{-m(1/2-\alpha)}.$$
2. For $|s|\geq2^{-m}$: since $0\leq S_j^{(m)}(t)\leq2^{-m/2}$,
$$|S_j^{(m)}(t+s)-S_j^{(m)}(t)| \leq 2^{-m/2} \leq |s|^\alpha 2^{\alpha m}2^{-m/2} = |s|^\alpha 2^{-m(1/2-\alpha)}.$$

In either case there are at most two $j\in I_m$ with $|S_j^{(m)}(t+s)-S_j^{(m)}(t)|>0$, since the level-$m$ Schauder functions are null on the intersections of their supports, hence
$$\sum_{j\in I_m}|S_j^{(m)}(t+s)-S_j^{(m)}(t)| \leq 2|s|^\alpha 2^{-m(1/2-\alpha)}.$$
So we have that
$$|B_{t+s}^{(n)}-B_t^{(n)}| = \Big|\sum_{m=0}^n\sum_{j\in I_m}\xi_j^{(m)}\big(S_j^{(m)}(t+s)-S_j^{(m)}(t)\big)\Big| \leq \sum_{m=0}^n b_m\sum_{j\in I_m}|S_j^{(m)}(t+s)-S_j^{(m)}(t)| \leq 2|s|^\alpha\sum_{m=0}^n b_m 2^{-m(1/2-\alpha)} \leq C|s|^\alpha,$$
where $C = 2\sum_{m=0}^\infty b_m 2^{-m(1/2-\alpha)}$ is almost surely finite since $\alpha<1/2$ and, for almost every $\omega\in\Omega$, there exists $n(\omega)$ such that $b_m\leq m$ for all $m>n(\omega)$.
This gives us that $B_t^{(n)}$ is uniformly $C^\alpha$, hence any uniform limit is also $C^\alpha$. We now want to show that $\{B_t^{(n)}\}_{n=1}^\infty$ is a Cauchy sequence in the supremum norm. For almost every $\omega\in\Omega$ there exists $n(\omega)$ such that $b_m\leq m$ for all $m>n(\omega)$. This gives us that
$$\sum_{m=n(\omega)}^\infty\sum_{j\in I_m}|\xi_j^{(m)}|S_j^{(m)}(t) \leq \sum_{m=n(\omega)}^\infty m\sum_{j\in I_m}S_j^{(m)}(t) \leq \sum_{m=n(\omega)}^\infty m\,2^{-m/2} < \infty,$$
using that at each level the Schauder functions have essentially disjoint supports and are bounded by $2^{-m/2}$. Hence $B_t^{(n)}$ is Cauchy in the supremum norm, since for $k>m>n(\omega)$ we have
$$|B_t^{(k)}-B_t^{(m)}| = \Big|\sum_{l=m+1}^k\sum_{j\in I_l}\xi_j^{(l)}S_j^{(l)}(t)\Big| \leq \sum_{l=m}^k\sum_{j\in I_l}|\xi_j^{(l)}|S_j^{(l)}(t) \leq \sum_{l=m}^\infty\sum_{j\in I_l}|\xi_j^{(l)}|S_j^{(l)}(t) \leq \sum_{l=m}^\infty l\,2^{-l/2},$$
which can be made arbitrarily small by choosing $m$ sufficiently large.
So we have that $B_t^{(n)}$ converges to some $B_t$ ($\alpha$-Hölder continuous) almost surely; it remains to show that we have the desired limit. Since $B_t^{(n)}$ is a converging sequence of Gaussian processes the limit must also be Gaussian, and hence it suffices to show that $\mathbb{E}[B_t]=0$ and $\mathbb{E}[B_tB_s]=t\wedge s$. First,
$$\mathbb{E}[B_t] = \mathbb{E}\Big[\sum_{m=0}^\infty\sum_{j\in I_m}\xi_j^{(m)}S_j^{(m)}(t)\Big] = \sum_{m=0}^\infty\sum_{j\in I_m}S_j^{(m)}(t)\,\mathbb{E}[\xi_j^{(m)}] = 0.$$
Next, using $\xi_j^{(m)}\overset{\text{i.i.d.}}{\sim}N(0,1)$,
$$\begin{aligned}
\mathbb{E}[B_tB_s] &= \mathbb{E}\Big[\sum_{m=0}^\infty\sum_{j\in I_m}\xi_j^{(m)}S_j^{(m)}(t)\sum_{n=0}^\infty\sum_{k\in I_n}\xi_k^{(n)}S_k^{(n)}(s)\Big]\\
&= \sum_{m=0}^\infty\sum_{n=0}^\infty\sum_{j\in I_m}\sum_{k\in I_n}S_j^{(m)}(t)S_k^{(n)}(s)\,\mathbb{E}[\xi_j^{(m)}\xi_k^{(n)}]\\
&= \sum_{m=0}^\infty\sum_{j\in I_m}S_j^{(m)}(t)S_j^{(m)}(s)\\
&= \sum_{m=0}^\infty\sum_{j\in I_m}\big\langle H_j^{(m)},\chi_{(0,t)}\big\rangle\big\langle H_j^{(m)},\chi_{(0,s)}\big\rangle\\
&= \big\langle\chi_{(0,t)},\chi_{(0,s)}\big\rangle = t\wedge s,
\end{aligned}$$
where the last line uses that the Haar wavelets form an orthonormal basis (Lemma 1.2), and the sums and expectations commute because convergence in probability implies convergence in $L^p$ for all $p\in[1,\infty)$ for sequences of Gaussian random variables. ∎

**Remark 1.2.** We have constructed a continuous random function whose distribution is a probability measure on $C[0,1]$, called the Wiener measure. A slightly simpler proof can be given if one only wants to show that $B_t$ exists.
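The Haar–Schauder construction in the proof of Theorem 1.1 can be sketched numerically. This is an illustration only, not the notes' exact indexing: the Haar system is normalised here to be orthonormal in $L^2[0,1]$ (so the level-$n$ tent has peak height $2^{-(n+1)/2}$), the series is truncated at a finite level, and the level-0 term is the linear function $\xi_0 t$ coming from the constant Haar function. The empirical covariance of the sampled paths is then compared with $t\wedge s$.

```python
import numpy as np

rng = np.random.default_rng(42)

def schauder(k, n, t):
    """Integral of the orthonormal level-n Haar function centred at k/2^n:
    a tent with support ((k-1)/2^n, (k+1)/2^n) and peak height 2^(-(n+1)/2)."""
    return 2.0 ** ((n - 1) / 2) * np.maximum(0.0, 2.0 ** (-n) - np.abs(t - k * 2.0 ** (-n)))

def sample_bm(t, levels=10, samples=4000):
    """Approximate Brownian paths at times t via the truncated Schauder expansion
    B_t = xi_0 * t + sum_{n=1}^{levels} sum_{k odd} xi_{n,k} S^(n)_{k/2^n}(t)."""
    B = rng.standard_normal(samples)[:, None] * t[None, :]   # constant Haar function term
    for n in range(1, levels + 1):
        ks = np.arange(1, 2 ** n, 2)                          # odd k: the new dyadic points
        S = np.stack([schauder(k, n, t) for k in ks])         # shape (len(ks), len(t))
        B += rng.standard_normal((samples, len(ks))) @ S
    return B

t = np.array([0.3, 0.5, 0.7])
B = sample_bm(t)
emp_cov = B.T @ B / B.shape[0]   # empirical E[B_s B_t]; should be close to min(s, t)
```

Truncating at level 10 loses only about $2^{-11}$ of variance per time point, so the empirical covariance matrix should match $t\wedge s$ up to Monte Carlo noise.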
### 1.2 Path Regularity of Brownian Motion

**Definition 1.5.** For $(X_t)_{t\in[0,1]^n}$ we say that $X_t$ is $\alpha$-Hölder continuous if
$$[X]_{C^\alpha} := \sup_{t\neq s}\frac{|X_t-X_s|}{|t-s|^\alpha} < \infty.$$

**Theorem 1.2** (Kolmogorov's Continuity Theorem). Let $(X_t)_{t\in[0,1]^n}$ be a continuous random function such that there exist $p,\beta,C>0$ with
$$\mathbb{E}[|X_t-X_s|^p] \leq C|t-s|^{n+\beta}$$
for all $t,s$. Then $\mathbb{E}\big[[X]_{C^\alpha}\big]<\infty$ for all $\alpha<\beta/p$.

*Proof.* Let $D_m = 2^{-m}\mathbb{Z}^n\cap[0,1]^n$ and let $D=\bigcup_m D_m$ be the dyadic points. Then by continuity
$$[X]_{C^\alpha} = \sup_{t\neq s}\frac{|X_t-X_s|}{|t-s|^\alpha} = \sup_{t\neq s\in D}\frac{|X_t-X_s|}{|t-s|^\alpha}.$$
Let $\Delta_m = \{(t,s)\in D_m^2 : |t-s|_\infty = 2^{-m}\}$ and let $\Delta=\bigcup_m\Delta_m$ be the collection of neighbouring pairs. Then $|\Delta_m|\leq 2^{mn}3^n$.
Write $R_m := \sup_{(t,s)\in\Delta_m}\frac{|X_t-X_s|}{|t-s|^\alpha}$; then
$$\mathbb{E}[R_m^p] \leq \mathbb{E}\Big[\sum_{(t,s)\in\Delta_m}\frac{|X_t-X_s|^p}{|t-s|^{\alpha p}}\Big] \leq \sum_{(t,s)\in\Delta_m}C\,\frac{|t-s|^{n+\beta}}{|t-s|^{\alpha p}} \leq C2^{mn}3^n2^{-m(n+\beta-\alpha p)} = C_n2^{m(\alpha p-\beta)},$$
hence if $\alpha<\beta/p$ we have $\mathbb{E}[\sup_m R_m^p]\leq\mathbb{E}[\sum_m R_m^p]<\infty$ and therefore
$$\mathbb{E}\Big[\sup_{(t,s)\in\Delta}\frac{|X_t-X_s|^p}{|t-s|^{\alpha p}}\Big] < \infty.$$
It remains to show that
$$\sup_{(t,s)\in D}\frac{|X_t-X_s|}{|t-s|^\alpha} \leq C\sup_{(t,s)\in\Delta}\frac{|X_t-X_s|}{|t-s|^\alpha}.$$
For fixed $t,s\in D$ there exists $m\in\mathbb{N}$ such that $2^{-(m+1)}\leq|t-s|<2^{-m}$. We can choose $t_m,s_m\in D_m$ such that $(t_m,s_m)\in\Delta_m$ (or $s_m=t_m$) and $|t_m-t|,|s_m-s|<2^{-m}$, and then construct sequences $\{t_n\}_{n\geq m}\to t$ and $\{s_n\}_{n\geq m}\to s$ with $(t_n,t_{n+1}),(s_n,s_{n+1})\in\Delta_n$. Writing $K = \sup_{(u,v)\in\Delta}\frac{|X_u-X_v|}{|u-v|^\alpha}$,
$$|X(t)-X(s)| \leq |X(t_m)-X(s_m)| + \sum_{n=m}^\infty\big(|X(t_n)-X(t_{n+1})|+|X(s_n)-X(s_{n+1})|\big) \leq K\Big(2^{-m\alpha}+2\sum_{n=m}^\infty 2^{-n\alpha}\Big) \leq CK\,2^{-m\alpha} \leq CK|t-s|^\alpha. \qquad\square$$
**Theorem 1.3.** Let $B_t$ be a Brownian motion on $[0,1]$; then for all $p>1$ and $0<\alpha<1/2$ we have $\mathbb{E}\big[[B]_{C^\alpha}\big]<\infty$.

*Proof.* Since $(B_t-B_s)/|t-s|^{1/2}\sim N(0,1)$,
$$\mathbb{E}[|B_t-B_s|^p] = \mathbb{E}\bigg[\bigg|\frac{B_t-B_s}{|t-s|^{1/2}}\bigg|^p\bigg]\,|t-s|^{p/2} = C_p|t-s|^{p/2},$$
hence $\mathbb{E}\big[[B]_{C^\alpha}\big]<\infty$ by Kolmogorov's continuity theorem (with $\beta=p/2-1$, taking $p$ large enough that $\alpha<1/2-1/p$). ∎
### 1.3 Donsker's Theorem

**Definition 1.6.** The sequence of random variables $X_n$ converges weakly to $X$ if $\mathbb{E}[f(X_n)]\to\mathbb{E}[f(X)]$ for all continuous bounded $f:C[0,1]\to\mathbb{R}$.

**Remark 1.3.** If the random variables $X_n$ take values in $(S,d)$ and converge in distribution to $X$, then for any continuous $f:S\to S$ we have that $f(X_n)$ converges weakly to $f(X)$.

**Example 1.1.** $X\mapsto\sup_t|X_t|$ is continuous.

**Definition 1.7.** If $X_t^{(n)}$ is a sequence of stochastic processes then we say that the finite dimensional distributions converge if for any $\{t_j\}_{j=1}^k$ the vector $(X_{t_1}^{(n)},\dots,X_{t_k}^{(n)})$ converges in distribution to $(X_{t_1},\dots,X_{t_k})$.

**Example 1.2.** Consider $\xi_n\overset{\text{i.i.d.}}{\sim}U[0,1]$ random variables and let $X_t^{(n)}$ be hat functions on $(\xi_n-1/n,\xi_n+1/n)$. The finite dimensional distributions converge (to those of the zero process) but $\sup_t X_t^{(n)}=1$ a.s., hence $X_t^{(n)}$ does not converge in distribution in $C[0,1]$.
Proving convergence of a process to a limiting process usually involves two steps:

1. prove compactness of the sequence;
2. identify the limit.

**Definition 1.8.** Let $\Pi$ be a family of probability measures on $(S,d)$. We say that $\Pi$ is relatively compact if every sequence $\{\mu_n\}\subseteq\Pi$ has a subsequence which converges weakly to a limit $\mu$.

**Definition 1.9.** A family of probability measures $\Pi$ is called tight if for any $\varepsilon>0$ there exists a compact $K_\varepsilon\subseteq S$ such that for all $\mu\in\Pi$ we have $\mu(K_\varepsilon^c)<\varepsilon$.

**Example 1.3.** The family $(\delta_n)_{n\in\mathbb{N}}$ is not tight.

**Corollary 1.2.** If $\mu_n$ converges weakly to $\mu$ then $\{\mu_n\}$ is tight.

**Definition 1.10.** A family of stochastic processes is called tight if the family of their distributions is tight.

**Theorem 1.4** (Prokhorov). Let $(S,d)$ be a complete separable metric space. Then a family of probability measures $\Pi$ on $S$ is relatively compact if and only if it is tight.
**Lemma 1.4** (Kolmogorov's Tightness Criterion). If $X_t^{(n)}$ is a family of continuous stochastic processes for $t\in[0,1]^d$ such that there exists $C_{p,\beta}$ with
$$\mathbb{E}[|X_t^{(n)}-X_s^{(n)}|^p] \leq C|t-s|^{d+\beta}$$
for all $s,t,n$, then $X^{(n)}$ is tight on $C^\alpha$ for all $\alpha<\beta/p$.

*Proof.* $C^{\alpha'}$ embeds compactly into $C^\alpha$ for $\alpha'>\alpha$. Fix $\alpha<\alpha'<\beta/p$; then by Kolmogorov's continuity criterion there exists $C$ independent of $n$ such that $\sup_n\mathbb{E}[\|X^{(n)}\|_{C^{\alpha'}}^p]<C$, hence
$$\mathbb{P}(\|X^{(n)}\|_{C^{\alpha'}}\geq N) \leq \frac{\mathbb{E}[\|X^{(n)}\|_{C^{\alpha'}}^p]}{N^p} \leq \frac{C}{N^p}.$$
Furthermore $\{X : \|X\|_{C^{\alpha'}}\leq N\}$ is compact in $C^\alpha$, so indeed the collection is tight. ∎

**Remark 1.4.** $C^\alpha$ is not separable, but the processes take values in $\bigcap_{\beta>\alpha}C^\beta$, which is separable.
**Definition 1.11.** Let $X:[0,1]^d\to\mathbb{R}$ be a function; we define the modulus of continuity $\omega(X,\delta) = \sup_{|t-s|<\delta}|X_t-X_s|$.

**Theorem 1.5** (Arzelà–Ascoli). $A\subseteq C([0,1]^d)$ is relatively compact if and only if

1. $\sup_{X\in A}|X(0)|<\infty$;
2. $\lim_{\delta\to0}\sup_{X\in A}\omega(X,\delta)=0$.

**Lemma 1.5.** Let $\Pi$ be a family of probability measures on $C([0,1]^d)$. Then $\Pi$ is tight if and only if

1. $\lim_{\lambda\to\infty}\sup_{\mu\in\Pi}\mu(X:|X(0)|>\lambda)=0$;
2. $\lim_{\delta\to0}\sup_{\mu\in\Pi}\mu(X:\omega(X,\delta)>\varepsilon)=0$ for all $\varepsilon>0$.

*Proof.* Suppose $\Pi$ is tight. Fix $\eta>0$; we want to show that there exist $\lambda,\delta$ such that

1. $\mu(X:|X(0)|>\lambda)\leq\eta$ for all $\mu\in\Pi$;
2. $\mu(X:\omega(X,\delta)>\varepsilon)\leq\eta$ for all $\mu\in\Pi$ and $\varepsilon>0$.

We can find $K$ compact such that $\mu(K^c)\leq\eta$ for all $\mu\in\Pi$. Choosing $\lambda=\sup_{X\in K}|X(0)|$, for all $\mu\in\Pi$ we have $\mu(|X(0)|>\lambda)\leq\mu(K^c)\leq\eta$. By Arzelà–Ascoli there exists $\delta>0$ such that $\omega(X,\delta)\leq\varepsilon$ for all $X\in K$, so $\mu(X:\omega(X,\delta)>\varepsilon)\leq\mu(K^c)\leq\eta$. ∎
**Lemma 1.6.** Let $\{X_n\}_{n\in\mathbb{N}}$ be a sequence of random variables. Then the following are equivalent:

1. $X_n$ converges weakly to $X$;
2. for all $A$ open, $\liminf_{n\to\infty}\mathbb{P}(X_n\in A)\geq\mathbb{P}(X\in A)$;
3. for all $A$ closed, $\limsup_{n\to\infty}\mathbb{P}(X_n\in A)\leq\mathbb{P}(X\in A)$.
**Theorem 1.6** (Donsker). Let $(\xi_k)_{k=1}^\infty$ be an i.i.d. sequence of random variables with zero mean and unit variance, define $S_n := \sum_{k=1}^n\xi_k$ and
$$X_t^{(n)} = \frac{1}{\sqrt{n}}\big(S_{\lfloor nt\rfloor}+(nt-\lfloor nt\rfloor)\xi_{\lfloor nt\rfloor+1}\big).$$
Then $X_t^{(n)}$ converges weakly to $B_t$ with respect to the $C[0,1]$ topology.

*Proof.* We want to show that $X_t^{(n)}$ is tight, i.e. that for any $\varepsilon>0$
$$\lim_{\delta\to0^+}\sup_n\mathbb{P}\big(\omega(X^{(n)},\delta)>\varepsilon\big) = 0.$$
However it suffices to show that
$$\lim_{\delta\to0^+}\limsup_{n\to\infty}\mathbb{P}\big(\omega(X^{(n)},\delta)>\varepsilon\big) = 0,$$
since this implies that for all $\gamma>0$ there exist $\tau>0$ and $N\in\mathbb{N}$ such that for all $n\geq N$ and $\delta<\tau$ we have $\mathbb{P}(\omega(X^{(n)},\delta)>\varepsilon)<\gamma$. There are only finitely many $n<N$, and for each such $n$ there exists $\delta_n>0$ with $\mathbb{P}(\omega(X^{(n)},\delta_n)>\varepsilon)<\gamma$ by continuity. Taking $\delta<\min\{\tau,\delta_1,\dots,\delta_N\}$ gives the desired result.
By rearranging it suffices to show that
$$\lim_{\delta\to0^+}\limsup_{n\to\infty}\mathbb{P}\Big(\sup_{0\leq k<n}\ \sup_{0\leq j\leq\lfloor n\delta\rfloor}|S_{k+j}-S_k| \geq \varepsilon\sqrt{n}\Big) = 0.$$
Let $M = \lceil n/\lfloor n\delta\rfloor\rceil \leq \frac{2}{\delta}$; then notice that by the triangle inequality we have
$$\Big\{\sup_{0\leq k<n}\ \sup_{0\leq j\leq\lfloor n\delta\rfloor}|S_{k+j}-S_k| \geq \varepsilon\sqrt{n}\Big\} \subseteq \Big\{\sup_{0\leq k<M}\ \sup_{0\leq j\leq\lfloor n\delta\rfloor}|S_{k\lfloor n\delta\rfloor+j}-S_{k\lfloor n\delta\rfloor}| \geq \varepsilon\sqrt{n}/3\Big\}.$$
So we have
$$\mathbb{P}\Big(\sup_{0\leq k<M}\ \sup_{0\leq j\leq\lfloor n\delta\rfloor}|S_{k\lfloor n\delta\rfloor+j}-S_{k\lfloor n\delta\rfloor}| \geq \varepsilon\sqrt{n}/3\Big) \leq \sum_{k=0}^{M}\mathbb{P}\Big(\sup_{0\leq j\leq\lfloor n\delta\rfloor}|S_j| \geq \varepsilon\sqrt{n}/3\Big) \leq \frac{2}{\delta}\,\mathbb{P}\Big(\sup_{0\leq j\leq\lfloor n\delta\rfloor}|S_j| \geq \varepsilon\sqrt{n}/3\Big) \leq \frac{2}{\delta}\,\mathbb{P}\big(|S_{\lfloor n\delta\rfloor}| \geq 2\varepsilon\sqrt{n}/3\big)$$
by the reflection principle. We need to show that this converges to zero as $n\to\infty$ and then $\delta\to0^+$. By the central limit theorem $S_{\lfloor n\delta\rfloor}/\sqrt{n}$ converges weakly to a centred Gaussian with variance $\delta$; hence, letting $Y\sim N(0,1)$ and $X=\sqrt{\delta}Y\sim N(0,\delta)$, we have by the previous lemma and the Gaussian tail bound
$$\lim_{\delta\to0^+}\frac{2}{\delta}\limsup_{n\to\infty}\mathbb{P}\big(|S_{\lfloor n\delta\rfloor}| \geq 2\varepsilon\sqrt{n}/3\big) = \lim_{\delta\to0^+}\frac{2}{\delta}\,\mathbb{P}\Big(|Y| \geq \frac{2\varepsilon}{3}\delta^{-1/2}\Big) \leq \limsup_{\delta\to0^+}\frac{C}{\delta^{1/2}}\exp\bigg(-\frac{1}{2}\Big(\frac{2\varepsilon}{3}\delta^{-1/2}\Big)^2\bigg) = 0. \qquad\square$$
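Donsker's theorem can be illustrated numerically (an illustration, not part of the notes): by the continuous mapping theorem applied to $X\mapsto\sup_t X_t$, the running maximum of the rescaled walk should converge in law to $\sup_{t\leq1}B_t$, whose distribution is given by the reflection principle as $\mathbb{P}(\sup_{t\leq1}B_t\geq b)=\mathbb{P}(|B_1|\geq b)$. The walk length and sample count below are arbitrary.

```python
import math
import numpy as np

rng = np.random.default_rng(7)

def sup_of_rescaled_walk(n, samples):
    """sup_{0<=t<=1} X^(n)_t for the linearly interpolated walk of Donsker's
    theorem with +-1 steps; the sup is attained at a lattice point."""
    xi = rng.choice([-1.0, 1.0], size=(samples, n))   # mean 0, variance 1 steps
    S = np.cumsum(xi, axis=1) / math.sqrt(n)
    return np.maximum(S.max(axis=1), 0.0)             # include the starting value S_0 = 0

sups = sup_of_rescaled_walk(n=1000, samples=2000)
p_emp = float((sups >= 1.0).mean())                   # estimate of P(sup >= 1)
# limit law: P(sup_{t<=1} B_t >= 1) = P(|B_1| >= 1) = 2(1 - Phi(1))
p_bm = 2 * (1 - 0.5 * (1 + math.erf(1 / math.sqrt(2))))
```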
**Lemma 1.7** (Gaussian Tails). If $Z$ is a standard Gaussian random variable then for $x>0$ we have
$$\frac{x}{(x^2+1)\sqrt{2\pi}}e^{-x^2/2} \leq \mathbb{P}(Z\geq x) \leq \frac{1}{x\sqrt{2\pi}}e^{-x^2/2}.$$

**Lemma 1.8** (Borel–Cantelli). Let $\{A_n\}_{n=1}^\infty$ be a sequence of events. Then:

1. If $\sum_{n=1}^\infty\mathbb{P}(A_n)<\infty$ then $\mathbb{P}\big(\bigcap_{n=1}^\infty\bigcup_{m\geq n}A_m\big)=0$.
2. If $\{A_n\}_{n=1}^\infty$ are independent and $\sum_{n=1}^\infty\mathbb{P}(A_n)=\infty$ then $\mathbb{P}\big(\bigcap_{n=1}^\infty\bigcup_{m\geq n}A_m\big)=1$.

**Theorem 1.7** (Law of The Iterated Logarithm). Let $\psi(t):=\sqrt{2t\log\log t}$ for $t>1$ and let $B_t$ be a standard Brownian motion; then
$$\limsup_{t\to\infty}\frac{B_t}{\psi(t)} = 1 \quad\text{a.s.}$$

*Proof.* Use the Gaussian tails estimate and the Borel–Cantelli lemma.
### 1.4 Martingales

For the start of this section we shall consider results for discrete time martingales on the state space $\mathbb{R}$; however, many of the results extend to continuous time.

**Definition 1.12.** A filtration $\{\mathcal{F}_n\}_{n=0}^\infty$ of $(\Omega,\mathcal{F},\mathbb{P})$ is an increasing sequence of sub-$\sigma$-algebras of $\mathcal{F}$.

**Definition 1.13.** We say that the stochastic process $X_n$ is adapted to the filtration $\mathcal{F}_n$ if for all $n$ we have that $X_n$ is $\mathcal{F}_n$-measurable.

**Definition 1.14.** If $X_n$ is adapted to $\mathcal{F}_n$ with $\mathbb{E}[|X_n|]<\infty$ for all $n$ then we call $X_n$

1. a martingale if $\mathbb{E}[X_{n+1}|\mathcal{F}_n]=X_n$;
2. a sub-martingale if $\mathbb{E}[X_{n+1}|\mathcal{F}_n]\geq X_n$;
3. a super-martingale if $\mathbb{E}[X_{n+1}|\mathcal{F}_n]\leq X_n$.

**Definition 1.15.** A process $A_n$ is called predictable if for all $n$ we have that $A_n$ is measurable with respect to $\mathcal{F}_{n-1}$.

**Definition 1.16.** For processes $X,A$ we define the martingale transform $(A\cdot X)_n = \sum_{i=1}^n A_i(X_i-X_{i-1})$.

**Theorem 1.8.** If $X_n$ is a (sub/super) martingale and $A_n$ a bounded, predictable process (with $A\geq0$ in the sub/super case) then $A\cdot X$ is also a (sub/super) martingale.
*Proof.*
$$\mathbb{E}[(A\cdot X)_{n+1}|\mathcal{F}_n]-(A\cdot X)_n = \mathbb{E}[(A\cdot X)_n+A_{n+1}(X_{n+1}-X_n)|\mathcal{F}_n]-(A\cdot X)_n = A_{n+1}\big(\mathbb{E}[X_{n+1}|\mathcal{F}_n]-X_n\big) = 0. \qquad\square$$

**Definition 1.17.** A random variable $T$ taking values in $\mathbb{N}$ is called a stopping time if for all $n\in\mathbb{N}$ we have $\{T\leq n\}\in\mathcal{F}_n$.

**Theorem 1.9** (Optional Stopping Theorem). Let $X_n$ be a (sub/super) martingale and $T$ a stopping time; then the stopped process $X_n^T=X_{n\wedge T}$ is also a (sub/super) martingale. In particular, if $T$ is bounded a.s. then $\mathbb{E}[X_T]=\mathbb{E}[X_0]$ (with $\geq$/$\leq$ in the sub/super-martingale cases).

*Proof.* Choose $A_n := \chi_{\{T\geq n\}}\in\mathcal{F}_{n-1}$; then $X_n^T=X_0+(A\cdot X)_n$ and is therefore a martingale. Choosing $n$ greater than the a.s. bound on $T$ gives $\mathbb{E}[X_T]=\mathbb{E}[X_{n\wedge T}]=\mathbb{E}[X_0]$. ∎
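A standard numerical illustration of optional stopping (not from the notes; the gambler's-ruin setting and all parameters are my own choices): for a simple symmetric random walk stopped on exiting $(-a,b)$, the stopping time is a.s. finite and the stopped walk is bounded, so $\mathbb{E}[X_T]=\mathbb{E}[X_0]=0$ forces $\mathbb{P}(\text{hit }b\text{ before }-a) = a/(a+b)$.

```python
import numpy as np

rng = np.random.default_rng(3)

def hit_prob(a, b, samples=20000):
    """Empirical probability that a simple symmetric random walk started at 0
    hits b before -a. Optional stopping gives 0 = E[X_T] = -a(1-p) + b*p,
    hence p = a / (a + b)."""
    hits = 0
    for _ in range(samples):
        x = 0
        while -a < x < b:
            x += 1 if rng.random() < 0.5 else -1
        hits += (x == b)
    return hits / samples

p = hit_prob(2, 3)   # theory: 2 / (2 + 3) = 0.4
```

Strictly speaking $T$ here is unbounded, so the bounded-$T$ statement above does not apply verbatim; the conclusion follows from the stopped martingale together with dominated convergence.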
**Definition 1.18.** For $a<b\in\mathbb{R}$ and a sequence $X_n$ we define $S_1:=\inf\{n:X_n\leq a\}$, $S_k:=\inf\{n>T_{k-1}:X_n\leq a\}$ and $T_k:=\inf\{n>S_k:X_n\geq b\}$. Then the number of up-crossings of $[a,b]$ by time $n$ is $N_n([a,b],X):=\sup\{k:T_k\leq n\}$.

**Theorem 1.10** (Doob's Up-crossing Inequality). If $X_n$ is a sub-martingale then for all $a<b$ and $n\in\mathbb{N}$ we have
$$(b-a)\,\mathbb{E}[N_n([a,b],X)] \leq \mathbb{E}[(X_n-a)^+-(X_0-a)^+].$$

*Proof.* Write $Y_n=(X_n-a)^+$, which is also a sub-martingale by Jensen's inequality, and define
$$A_n = \sum_{k=1}^\infty\chi_{\{S_k<n\leq T_k\}},$$
the indicator that the process is on an up-crossing at time $n$. By a telescoping sum we have
$$(A\cdot Y)_n = \sum_{i=1}^{N_n}(Y_{T_i}-Y_{S_i}) + (Y_n-Y_{S_{N_n+1}})\chi_{\{S_{N_n+1}<n\}},$$
hence $(A\cdot Y)_n\geq(b-a)N_n$ and so $\mathbb{E}[(A\cdot Y)_n]\geq(b-a)\,\mathbb{E}[N_n]$. Write $K=1-A$, the indicator that the process is on a down-crossing; then $K\cdot Y$ is also a sub-martingale, and $Y_n-Y_0=(K\cdot Y)_n+(A\cdot Y)_n$, so
$$(b-a)\,\mathbb{E}[N_n] \leq \mathbb{E}[(A\cdot Y)_n] = \mathbb{E}[Y_n-Y_0]-\mathbb{E}[(K\cdot Y)_n] \leq \mathbb{E}[Y_n-Y_0]. \qquad\square$$

**Corollary 1.3.** Every sub-martingale with $\sup_n\mathbb{E}[X_n^+]<\infty$ converges almost surely.

*Proof.* If $X_n$ did not converge almost surely then there would exist an interval with rational endpoints $[a,b]$ which is crossed infinitely often, contradicting Doob's up-crossing inequality. ∎
**Theorem 1.11** (Doob's Maximal Inequality). Let $X_n$ be a sub-martingale and $a>0$, and define $X_n^*:=\max_{0\leq i\leq n}X_i$; then
$$a\,\mathbb{P}(X_n^*\geq a) \leq \mathbb{E}[X_n\chi_{\{X_n^*\geq a\}}] \leq \mathbb{E}[X_n^+].$$

*Proof.* Let $T=\inf\{n:X_n\geq a\}$; then $\{X_n^*\geq a\}=\{T\leq n\}$, so by the optional stopping theorem we have
$$\mathbb{E}[X_n] \geq \mathbb{E}[X_{T\wedge n}] = \mathbb{E}[X_T\chi_{\{T\leq n\}}]+\mathbb{E}[X_n\chi_{\{T>n\}}] \geq a\,\mathbb{P}(X_n^*\geq a)+\mathbb{E}[X_n\chi_{\{T>n\}}].$$
So indeed
$$a\,\mathbb{P}(X_n^*\geq a) \leq \mathbb{E}[X_n\chi_{\{X_n^*\geq a\}}]. \qquad\square$$
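Doob's maximal inequality is easy to probe by simulation (an illustration with arbitrary parameters): $|S_k|$ for a simple random walk $S$ is a positive sub-martingale by Jensen's inequality, and empirical versions of both sides of the inequality can be compared directly.

```python
import numpy as np

rng = np.random.default_rng(11)

n, a, samples = 200, 10.0, 5000
S = np.cumsum(rng.choice([-1.0, 1.0], size=(samples, n)), axis=1)
X = np.abs(S)                                  # positive sub-martingale
Xstar = X.max(axis=1)                          # X_n^* = max_{i<=n} X_i
lhs = a * float((Xstar >= a).mean())           # a P(X_n^* >= a)
mid = float((X[:, -1] * (Xstar >= a)).mean())  # E[X_n 1_{X_n^* >= a}]
rhs = float(X[:, -1].mean())                   # E[X_n^+] (here X_n >= 0)
```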
**Theorem 1.12.** Let $X_n$ be a positive sub-martingale; then for $p>1$ we have
$$\mathbb{E}[(X_n^*)^p]^{1/p} \leq \frac{p}{p-1}\,\mathbb{E}[X_n^p]^{1/p}.$$

*Proof.* From Doob's maximal inequality we have $a\,\mathbb{P}(X_n^*\geq a)\leq\mathbb{E}[X_n\chi_{\{X_n^*\geq a\}}]$; multiplying by $a^{p-2}$ and integrating,
$$\int_0^\infty a^{p-1}\,\mathbb{P}(X_n^*\geq a)\,da \leq \int_0^\infty a^{p-2}\,\mathbb{E}[X_n\chi_{\{X_n^*\geq a\}}]\,da.$$
Since $\frac{1}{p}\mathbb{E}[(X_n^*)^p]=\int_0^\infty a^{p-1}\mathbb{P}(X_n^*\geq a)\,da$, Fubini gives
$$\frac{1}{p}\,\mathbb{E}[(X_n^*)^p] \leq \int_0^\infty a^{p-2}\,\mathbb{E}[X_n\chi_{\{X_n^*\geq a\}}]\,da = \mathbb{E}\Big[X_n\int_0^{X_n^*}a^{p-2}\,da\Big] = \frac{1}{p-1}\,\mathbb{E}[X_n(X_n^*)^{p-1}] \leq \frac{1}{p-1}\,\mathbb{E}[X_n^p]^{1/p}\,\mathbb{E}[(X_n^*)^p]^{(p-1)/p}$$
by Hölder's inequality. Dividing through by $\mathbb{E}[(X_n^*)^p]^{(p-1)/p}$ gives the desired result. ∎
**Definition 1.19.** For a continuous time process $X_t$ and filtration $\mathcal{F}_t=\sigma(\{X_s\}_{s\leq t})$ we say that $T$ is a stopping time if $\{T\leq t\}\in\mathcal{F}_t$ for all $t$, and we say that $T$ is an optional time if $\{T<t\}\in\mathcal{F}_t$ for all $t$.

**Remark 1.5.** Any stopping time is an optional time, but there are optional times which are not stopping times.

**Definition 1.20.** For a continuous process $X_t$ we define the (right-continuous) natural filtration $\mathcal{F}_t := \bigcap_{s>t}\sigma(\{X_r\}_{r\leq s})$.

**Remark 1.6.** For any right-continuous filtration we have that any optional time is a stopping time.

**Theorem 1.13.** For every $s>0$ we have that $X_t:=(B_{t+s}-B_s)_t$ is a Brownian motion independent of $\mathcal{F}_s$.

*Proof.* $X_t$ is clearly a Brownian motion by definition. Fix $\varepsilon>0$ and consider $Y_\varepsilon:=(B_{t_i+s+\varepsilon}-B_{s+\varepsilon})_{i=1}^n$. This is clearly independent of $\{B_{r_j}\}_{j=1}^m$ for $r_j<s+\varepsilon$ and is therefore independent of $\sigma(\{B_r\}_{r<s+\varepsilon})$. From this, $Y_\varepsilon$ is independent of $\mathcal{F}_s=\bigcap_{\varepsilon>0}\sigma(\{B_r\}_{r<s+\varepsilon})$, so by path continuity the almost sure limit as $\varepsilon\to0$ is also independent of $\mathcal{F}_s$. ∎

**Corollary 1.4** (Blumenthal's 0–1 Law). If $A\in\mathcal{F}_0:=\bigcap_{\varepsilon>0}\sigma(\{B_s\}_{s<\varepsilon})$ then $\mathbb{P}(A)\in\{0,1\}$.
12
Proof. Bt = Bt − B0 a.s. so the σ-algebra generated by this process is independent of F0 and hence F0
is trivial.
Corollary 1.5. Let A be the event that in any interval [0, ε) we have that Bt attains both positive and
negative values. The P(A) = 1.
T
Proof. Let An := {∃ε &lt; 1/n : Bε &gt; 0} then n≥1 An is the event that for any such interval Bt attains
a positive value. An ⊂ An−1 so A0 := limn→∞ An exists and belongs to F0 .
For any N we have that


\
P
An  ≥ 1/2
0≤i≤N
since
nT
o
0≤i≤N
An ⊃ {B1/N &gt; 0} which has probability 1/2 by symmetry of a Brownian motion.
It therefore follows that P(A0 ) ≥ 1/2 but since A0 ∈ F0 this means we must have that P(A0 ) = 1 and
by symmetry we have P(A) = 1.
**Proposition 1.1.** Let $B_t$ be a Brownian motion; then the following are martingales:

1. $B_t$;
2. $B_t^2-t$;
3. $e^{\lambda B_t-\frac{\lambda^2t}{2}}$, where $\lambda>0$.

*Proof.* In each case adaptedness holds by definition and integrability holds by properties of Gaussian random variables.

1. $\mathbb{E}[B_t|\mathcal{F}_s]=\mathbb{E}[B_t-B_s|\mathcal{F}_s]+B_s=B_s$.

2. $$\begin{aligned}\mathbb{E}[B_t^2-t|\mathcal{F}_s] &= \mathbb{E}[B_t^2-B_s^2|\mathcal{F}_s]+B_s^2-t\\&= \mathbb{E}[(B_t-B_s)^2+2B_s(B_t-B_s)|\mathcal{F}_s]+B_s^2-t\\&= t-s+B_s^2-t = B_s^2-s.\end{aligned}$$

3. $$\mathbb{E}\big[e^{\lambda B_t-\frac{\lambda^2t}{2}}\big|\mathcal{F}_s\big] = e^{\lambda B_s-\frac{\lambda^2s}{2}}\,\mathbb{E}\big[e^{\lambda(B_t-B_s)-\frac{\lambda^2(t-s)}{2}}\big|\mathcal{F}_s\big] = e^{\lambda B_s-\frac{\lambda^2s}{2}}\,\mathbb{E}\big[e^{\lambda(B_t-B_s)-\frac{\lambda^2(t-s)}{2}}\big] = e^{\lambda B_s-\frac{\lambda^2s}{2}}$$
by properties of moment generating functions. ∎
**Theorem 1.14.** Let $f:\mathbb{R}^d\to\mathbb{R}$ be $C^2$ with bounded derivatives. Then
$$f(B_t)-\frac{1}{2}\int_0^t\Delta f(B_s)\,ds$$
is a martingale.

*Proof.* Adaptedness and integrability are immediate since $f$ is $C^2$ with bounded derivatives. Let
$$p_t(x) = \frac{e^{-|x|^2/2t}}{(2\pi t)^{d/2}}$$
be the Gaussian density. Then
$$\mathbb{E}[f(B_t)|\mathcal{F}_s] = \mathbb{E}[f(B_t-B_s+B_s)|\mathcal{F}_s] = \int_{\mathbb{R}^d}f(y)\,p_{t-s}(y-B_s)\,dy$$
and
$$\mathbb{E}\Big[\frac{1}{2}\int_0^t\Delta f(B_r)\,dr\,\Big|\,\mathcal{F}_s\Big] = \frac{1}{2}\int_0^s\Delta f(B_r)\,dr + \frac{1}{2}\int_s^t\mathbb{E}[\Delta f(B_r)|\mathcal{F}_s]\,dr.$$
Integrating by parts twice and using the heat equation $\partial_rp_r=\frac{1}{2}\Delta p_r$,
$$\begin{aligned}
\frac{1}{2}\int_s^t\mathbb{E}[\Delta f(B_r)|\mathcal{F}_s]\,dr &= \frac{1}{2}\int_s^t\int_{\mathbb{R}^d}\Delta f(y)\,p_{r-s}(y-B_s)\,dy\,dr\\
&= \frac{1}{2}\int_s^t\int_{\mathbb{R}^d}f(y)\,\Delta p_{r-s}(y-B_s)\,dy\,dr\\
&= \int_s^t\int_{\mathbb{R}^d}f(y)\,\partial_rp_{r-s}(y-B_s)\,dy\,dr\\
&= \int_{\mathbb{R}^d}f(y)\,p_{t-s}(y-B_s)\,dy - f(B_s)\\
&= \mathbb{E}[f(B_t)|\mathcal{F}_s]-f(B_s),
\end{aligned}$$
since $p_{r-s}(\cdot-B_s)\to\delta_{B_s}$ as $r\downarrow s$. Hence
$$\mathbb{E}\Big[f(B_t)-\frac{1}{2}\int_0^t\Delta f(B_r)\,dr\,\Big|\,\mathcal{F}_s\Big] = \mathbb{E}[f(B_t)|\mathcal{F}_s]-\frac{1}{2}\int_0^s\Delta f(B_r)\,dr-\mathbb{E}[f(B_t)|\mathcal{F}_s]+f(B_s) = f(B_s)-\frac{1}{2}\int_0^s\Delta f(B_r)\,dr. \qquad\square$$
### 1.5 Strong Markov Property and the Reflection Principle

**Definition 1.21.** Let $T$ be a stopping time with respect to the filtration $\mathcal{F}_t$. Then we can define the $\sigma$-algebra
$$\mathcal{F}_T := \{A\in\mathcal{F}_\infty : A\cap\{T\leq t\}\in\mathcal{F}_t\ \forall t\geq0\}.$$

**Theorem 1.15** (Strong Markov Property). Let $B_t$ be a Brownian motion and $T$ a stopping time with $T<\infty$ a.s. Then $B_t^*:=B_{T+t}-B_T$ is a Brownian motion independent of $\mathcal{F}_T$.

*Proof.* For each $n$ we can define $T_n:=\inf\{k2^{-n} : k2^{-n}>T\}\geq T$, a sequence of stopping times decreasing almost surely to $T$: if $t\in[k2^{-n},(k+1)2^{-n})$ then
$$\{T_n\leq t\} = \{T<k2^{-n}\}\in\mathcal{F}_{k2^{-n}}\subseteq\mathcal{F}_t.$$
Write $B_t^{(n)}:=B_{T_n+t}-B_{T_n}$; then for $A\subseteq C([0,\infty))$ measurable and $B\in\mathcal{F}_T$ we have
$$\begin{aligned}
\mathbb{E}[\chi_A(B^{(n)})\chi_B] &= \mathbb{E}\Big[\sum_{k\geq0}\chi_A(B^{(n)})\chi_B\chi_{\{T_n=k2^{-n}\}}\Big]\\
&= \sum_{k\geq0}\mathbb{E}\big[\chi_A\big((B_{t+k2^{-n}}-B_{k2^{-n}})_t\big)\chi_{B\cap\{T_n=k2^{-n}\}}\big]\\
&= \sum_{k\geq0}\mathbb{E}\big[\chi_A\big((B_{t+k2^{-n}}-B_{k2^{-n}})_t\big)\big]\,\mathbb{P}\big(B\cap\{T_n=k2^{-n}\}\big) \qquad\text{by the Markov property}\\
&= \mathbb{E}[\chi_A(B)]\sum_{k\geq0}\mathbb{P}\big(B\cap\{T_n=k2^{-n}\}\big) = \mathbb{E}[\chi_A(B)]\,\mathbb{P}(B),
\end{aligned}$$
hence we have the strong Markov property along the sequence of stopping times converging a.s. to $T$ (taking $B=\Omega$ shows that $B^{(n)}$ is a Brownian motion). So we need to pass the result to the limit: for any $t_1,\dots,t_k$, the vector $(B_{t_1}^*,\dots,B_{t_k}^*)$ is the almost sure limit of $(B_{t_1}^{(n)},\dots,B_{t_k}^{(n)})$ by path regularity, and hence the limit is independent of $\mathcal{F}_T$, so indeed the theorem holds. ∎

**Corollary 1.6.** $\inf\{t\geq0 : B_t=\sup_{0\leq s\leq T}B_s\}$ is not a stopping time (by the strong Markov property and Blumenthal's 0–1 law).
**Theorem 1.16** (Reflection Principle). If $B$ is a Brownian motion, $b\in\mathbb{R}$ and $T:=\inf\{t\geq0 : B_t=b\}$, then
$$\tilde B_t = \begin{cases}B_t & t\leq T\\2b-B_t & t>T\end{cases}$$
is also a Brownian motion.

**Corollary 1.7.** For $b>0$,
$$\mathbb{P}\Big(\sup_{0\leq t\leq S}B_t\geq b\Big) = 2\,\mathbb{P}(B_S\geq b) = \mathbb{P}(|B_S|\geq b).$$

*Proof.* Let $\tau:=\inf\{t : B_t=b\}$; then
$$\Big\{\sup_{t\leq S}B_t\geq b\Big\} = \{B_S\geq b\}\cup\big(\{B_S<b\}\cap\{\tau<S\}\big) = \{B_S\geq b\}\cup\big(\{\tilde B_S>b\}\cap\{\tau<S\}\big) = \{B_S\geq b\}\cup\{\tilde B_S>b\},$$
where all the unions used are disjoint. Since $\tilde B$ is a Brownian motion, we indeed have the desired result. ∎
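Corollary 1.7 can be checked by Monte Carlo (an illustration; the discretisation slightly undershoots the true supremum, and all parameters are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)

n, samples, b = 1000, 4000, 1.0
dB = rng.standard_normal((samples, n)) / np.sqrt(n)   # increments of B on [0, 1]
B = np.cumsum(dB, axis=1)
p_sup = float((B.max(axis=1) >= b).mean())            # estimate of P(sup_{t<=1} B_t >= b)
p_abs = float((np.abs(B[:, -1]) >= b).mean())         # estimate of P(|B_1| >= b)
```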
## 2 Levy Processes

**Definition 2.1.** A stochastic process $X_t$ is called a cadlag process if it is a.s. right continuous with left limits.

**Definition 2.2.** A stochastic process $X_t$ is called stochastically continuous if for all $t$ and $\varepsilon>0$ we have $\lim_{s\to t}\mathbb{P}(|X_t-X_s|>\varepsilon)=0$.

**Definition 2.3.** The $\mathbb{R}^d$-valued process $(X_t)_{t\geq0}$ is called a Levy process if

1. $X_t$ has independent increments;
2. $X_0=0$ almost surely;
3. $X_t$ has stationary increments;
4. $X_t$ is stochastically continuous;
5. $X_t$ is a cadlag process.

**Definition 2.4.** $(X_t)_{t\geq0}$ is a Poisson process of intensity $\lambda>0$ (written $PP(\lambda)$) if it is a Levy process and for all $t\geq0$ we have $X_t\sim Po(\lambda t)$, i.e. $\mathbb{P}(X_t=k)=\frac{(\lambda t)^k}{k!}e^{-\lambda t}$.

**Definition 2.5.** The Gamma distribution with intensity $\lambda$ and shape $c$ (written $\Gamma(c,\lambda)$) is the distribution with density $x^{c-1}e^{-\lambda x}\lambda^c/\Gamma(c)$ on $(0,\infty)$.
**Lemma 2.1.** The $\Gamma(c,\lambda)$ distribution has characteristic function
$$\varphi(z) = \Big(1-\frac{iz}{\lambda}\Big)^{-c}.$$

**Corollary 2.1.** The $\Gamma(1,\lambda)$ distribution is the $\exp(\lambda)$ distribution, and $\Gamma(n,\lambda)$ is the law of the sum of $n$ i.i.d. $\exp(\lambda)$ random variables.
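Lemma 2.1 can be verified by Monte Carlo (an illustration; note that `numpy` parametrises the Gamma distribution by shape and scale $=1/\lambda$):

```python
import numpy as np

rng = np.random.default_rng(9)

def gamma_cf(z, c, lam):
    """Characteristic function of Gamma(c, lam) from Lemma 2.1."""
    return (1 - 1j * z / lam) ** (-c)

c, lam, z = 3.0, 2.0, 1.5
X = rng.gamma(shape=c, scale=1.0 / lam, size=200_000)
emp = complex(np.exp(1j * z * X).mean())   # Monte Carlo estimate of E[e^{izX}]
```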
**Remark 2.1.** We can construct the Poisson process as follows. Let $\{T_i\}_{i=1}^\infty\overset{\text{i.i.d.}}{\sim}\exp(\lambda)$, so by the memoryless property we have
$$\mathbb{P}(T_i\geq t+s\,|\,T_i\geq t) = \frac{e^{-\lambda(t+s)}}{e^{-\lambda t}} = e^{-\lambda s} = \mathbb{P}(T_i\geq s).$$
We then define $W_n:=\sum_{i=1}^nT_i$, the waiting time for the $n$th event, and set $X_t=k$ where $W_k\leq t<W_{k+1}$. Notice that $W_n\sim\Gamma(n,\lambda)$ and is independent of $T_{n+1}$, so we have
$$\begin{aligned}
\mathbb{P}(X_t=n) &= \mathbb{P}(W_n\leq t,\ T_{n+1}>t-W_n)\\
&= \int_0^t\frac{\lambda^n}{(n-1)!}w^{n-1}e^{-\lambda w}\int_{t-w}^\infty\lambda e^{-\lambda s}\,ds\,dw\\
&= \frac{\lambda^n}{(n-1)!}\int_0^tw^{n-1}e^{-\lambda w}e^{-\lambda(t-w)}\,dw\\
&= \frac{\lambda^n}{(n-1)!}e^{-\lambda t}\int_0^tw^{n-1}\,dw = \frac{\lambda^n}{n!}e^{-\lambda t}t^n,
\end{aligned}$$
which is the p.m.f. of $Po(\lambda t)$.
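The construction in Remark 2.1 translates directly into code (an illustration; the intensity, horizon and sample count are arbitrary): summing i.i.d. $\exp(\lambda)$ waiting times until they exceed $t$ yields a $Po(\lambda t)$ count.

```python
import math
import numpy as np

rng = np.random.default_rng(2)

def poisson_count(lam, t, samples=20000):
    """X_t = #{n : W_n <= t} with W_n = T_1 + ... + T_n, T_i ~ exp(lam) i.i.d."""
    counts = np.empty(samples, dtype=np.int64)
    for i in range(samples):
        total, k = 0.0, 0
        while True:
            total += rng.exponential(1.0 / lam)   # numpy parametrises by the mean 1/lam
            if total > t:
                break
            k += 1
        counts[i] = k
    return counts

counts = poisson_count(lam=2.0, t=3.0)
pmf6 = math.exp(-6.0) * 6.0 ** 6 / math.factorial(6)   # P(Po(6) = 6)
```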
**Lemma 2.2.** Let $X_t\sim PP(\lambda)$ and fix $T>0$, $0=t_0<t_1<\dots<t_n=T$ and $\{k_i\}_{i=1}^n\subseteq\mathbb{N}_0$, and set $K=\sum_{i=1}^nk_i$. Then
$$\mathbb{P}\Big(\bigcap_{i=1}^n\{X_{t_i}-X_{t_{i-1}}=k_i\}\,\Big|\,X_T=K\Big) = \frac{K!}{\prod_{i=1}^nk_i!}\prod_{i=1}^n\Big(\frac{t_i-t_{i-1}}{T}\Big)^{k_i}.$$
**Definition 2.6.** An $\mathbb{R}^d$-valued process $X_t$ is called a compound Poisson process of intensity $\lambda>0$ if it is a Levy process and there exists a probability measure $\sigma$ with $\sigma(\{0\})=0$ such that $X_t$ has the characteristic function
$$\varphi_{X_t}(z) = \exp\Big(t\lambda\int_{\mathbb{R}^d}\big(e^{i\langle z,x\rangle}-1\big)\,\sigma(dx)\Big).$$

**Lemma 2.3.** We can construct the compound Poisson process by taking $\{\xi_i\}_{i=1}^\infty$ i.i.d. distributed according to $\sigma$, defining $S_n:=\sum_{i=1}^n\xi_i$, letting $N_t\sim PP(\lambda)$ independently of $\xi$, and then setting $X_t:=S_{N_t}$.
*Proof.* For the independent and stationary increments,
$$\begin{aligned}
\mathbb{P}(X_{t_0}\in B_0,\,X_{t_1}-X_{t_0}\in B_1) &= \sum_{n_0,n_1\in\mathbb{N}}\mathbb{P}(N_{t_0}=n_0,\,S_{n_0}\in B_0,\,N_{t_1}-N_{t_0}=n_1,\,S_{n_0+n_1}-S_{n_0}\in B_1)\\
&= \sum_{n_0,n_1\in\mathbb{N}}\mathbb{P}(N_{t_0}=n_0,\,S_{n_0}\in B_0)\,\mathbb{P}(N_{t_1}-N_{t_0}=n_1,\,S_{n_0+n_1}-S_{n_0}\in B_1)\\
&= \mathbb{P}(X_{t_0}\in B_0)\,\mathbb{P}(X_{t_1}-X_{t_0}\in B_1) = \mathbb{P}(X_{t_0}\in B_0)\,\mathbb{P}(X_{t_1-t_0}\in B_1).
\end{aligned}$$
For the characteristic function,
$$\begin{aligned}
\varphi_{X_t}(z) &= \sum_{n\in\mathbb{N}}\mathbb{E}\big[\exp(i\langle z,X_t\rangle)\chi_{\{N_t=n\}}\big]\\
&= \sum_{n\in\mathbb{N}}\mathbb{E}\Big[\exp\Big(i\Big\langle z,\sum_{i=1}^n\xi_i\Big\rangle\Big)\Big]\,\mathbb{P}(N_t=n)\\
&= \sum_{n\in\mathbb{N}}\hat\sigma(z)^n\,\frac{(\lambda t)^ne^{-\lambda t}}{n!}\\
&= e^{-\lambda t}e^{\lambda t\hat\sigma(z)} = \exp\big(\lambda t(\hat\sigma(z)-1)\big). \qquad\square
\end{aligned}$$
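Lemma 2.3 and the characteristic-function identity can be checked together by simulation (an illustration; I take standard Gaussian jumps, for which $\hat\sigma(z)=e^{-z^2/2}$, and all parameters are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)

lam, t, z, samples = 1.5, 2.0, 0.8, 200_000
N = rng.poisson(lam * t, size=samples)          # N_t ~ Po(lam * t)
X = np.sqrt(N) * rng.standard_normal(samples)   # S_N | N ~ N(0, N) for N(0,1) jumps
emp = complex(np.exp(1j * z * X).mean())        # Monte Carlo estimate of phi_{X_t}(z)
theory = np.exp(lam * t * (np.exp(-z ** 2 / 2) - 1))   # exp(lam t (sigma_hat(z) - 1))
```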
### 2.1 Infinitely Divisible Distributions

**Definition 2.7.** The probability measure $\mu$ on $\mathbb{R}^d$ is called infinitely divisible if for all $n\geq1$ there exists a probability measure $\mu_n$ such that $\mu=\mu_n^{*(n)}$, where $\mu_n^{*(n)}$ denotes the $n$-fold convolution of $\mu_n$.

**Lemma 2.4.** If $X_t$ is a Levy process then for all $t$ the law of $X_t$ is infinitely divisible.

*Proof.* $X_t=\sum_{k=1}^n\big(X_{kt/n}-X_{(k-1)t/n}\big)$ by a telescoping sum. This is an i.i.d. collection, hence indeed the distribution of $X_t$ can be given as the $n$-fold convolution of the distribution of $X_{t/n}$. ∎
**Lemma 2.5.** If $\mu_1,\mu_2$ are infinitely divisible then so is $\mu_1*\mu_2$.

**Corollary 2.2.** Every distribution with a characteristic function of the form
$$\varphi_X(z) = \exp\Big(-\frac{1}{2}\langle z,\Sigma z\rangle + i\langle\gamma,z\rangle + \int_{\mathbb{R}^d}\big(e^{i\langle z,x\rangle}-1\big)\,\nu(dx)\Big)$$
for $\nu$ a positive finite measure, $\Sigma$ a covariance matrix and $\gamma\in\mathbb{R}^d$, is infinitely divisible.
**Proposition 2.1.** If $\varphi:\mathbb{R}^d\to\mathbb{C}$ is the characteristic function of a probability measure $\nu$ then

1. $|\varphi(z)|\leq1$;
2. $\varphi(0)=1$;
3. $\varphi$ is uniformly continuous;
4. $\varphi$ is positive definite, i.e. for all $z_1,\dots,z_n\in\mathbb{R}^d$ and $\xi\in\mathbb{C}^n$,
$$\sum_{k=1}^n\sum_{j=1}^n\varphi(z_k-z_j)\xi_k\bar\xi_j \geq 0.$$

*Proof.* The first two points hold trivially and the third holds by the dominated convergence theorem, so it remains to show the final statement:
$$\sum_{k=1}^n\sum_{j=1}^n\int\exp\big(i\langle x,z_k-z_j\rangle\big)\xi_k\bar\xi_j\,\nu(dx) = \int\Big|\sum_{k=1}^n\exp(i\langle x,z_k\rangle)\xi_k\Big|^2\,\nu(dx) \geq 0. \qquad\square$$
**Theorem 2.1** (Bochner). The properties

1. $|\varphi(z)|\leq1$;
2. $\varphi(0)=1$;
3. $\varphi$ is uniformly continuous;
4. $\varphi$ is positive definite

characterise the characteristic functions of probability measures: a function with these properties is the characteristic function of a (unique) probability measure.

**Lemma 2.6.** If $\mu,\{\mu_n\}_{n=1}^\infty$ are probability measures and $\varphi,\{\varphi_n\}_{n=1}^\infty$ are the corresponding characteristic functions, then the following are equivalent:

1. $\varphi_n\to\varphi$ pointwise;
2. $\mu_n\rightharpoonup\mu$.

**Lemma 2.7.** If $\mu_n$ are probability measures with corresponding characteristic functions $\varphi_n$, and $\varphi_n$ converges pointwise to some $\varphi$ continuous at $0$, then $\varphi$ is the characteristic function of some probability measure $\mu$ to which the $\mu_n$ converge weakly.

**Remark 2.2.** In both of the previous lemmas the convergence of $\varphi_n$ is in fact locally uniform.
**Lemma 2.8.** If $\varphi$ is the characteristic function of a probability measure $\mu$ then $|\varphi|^2$ is also the characteristic function of some probability measure $\nu$.

**Lemma 2.9.** If $\mu$ is infinitely divisible and $\varphi$ is the corresponding characteristic function then $\varphi$ does not attain the value $0$.

*Proof.* For any $n$ let $\mu_n$ be the $n$th convolution root of $\mu$ and $\varphi_n$ the corresponding characteristic function. Then $\varphi_n^n=\varphi$, so $|\varphi_n|^2=|\varphi|^{2/n}$ is a characteristic function. Since $|\varphi|\leq1$ we have
$$\lim_{n\to\infty}|\varphi(x)|^{2/n} = \begin{cases}0 & \varphi(x)=0\\1 & \text{otherwise.}\end{cases}$$
In particular, by continuity at the origin, $\lim_{n\to\infty}|\varphi(x)|^{2/n}=1$ on a neighbourhood of the origin. The limit is the characteristic function of a probability measure (by Lemma 2.7), hence continuous everywhere, so indeed the limit is $1$ everywhere and $\varphi$ is nowhere $0$. ∎
**Lemma 2.10.** If $\varphi:\mathbb{R}^d\to\mathbb{C}\setminus\{0\}$ is continuous with $\varphi(0)=1$ then there exist unique continuous $\psi:\mathbb{R}^d\to\mathbb{C}$ and $\{\lambda_n\}_{n=1}^\infty:\mathbb{R}^d\to\mathbb{C}\setminus\{0\}$ such that

1. $\psi(0)=0$ and $\lambda_n(0)=1$;
2. $\varphi(z)=e^{\psi(z)}=\lambda_n(z)^n$.

Furthermore, if $\varphi_\varepsilon$ converges locally uniformly to $\varphi$ in $\mathbb{C}\setminus\{0\}$ then $\psi_\varepsilon,\lambda_{n,\varepsilon}$ converge locally uniformly to $\psi,\lambda_n$.

**Lemma 2.11.** Let $\mu_\varepsilon$ be infinitely divisible and converge weakly to $\mu$. Then $\mu$ is infinitely divisible.
**Theorem 2.2.** Let $\mu_n$ be infinitely divisible with characteristic functions $\varphi_n$ of the form
$$\varphi_n(z) = \exp\Big(-\frac{1}{2}\langle z,A_nz\rangle + i\langle z,\gamma_n\rangle + \int_{\mathbb{R}^d}\big(e^{i\langle z,x\rangle}-1-i\langle x,z\rangle c(x)\big)\,\nu_n(dx)\Big),$$
where $c$ is continuous, with $c(x)\approx1$ near $0$ and $c(x)=o(|x|^{-1})$ at $\infty$. Let $\mu$ be another probability measure. Then $\mu_n\rightharpoonup\mu$ if and only if

1. $\mu$ is infinitely divisible with the characteristic function
$$\varphi(z) = \exp\Big(-\frac{1}{2}\langle z,Az\rangle + i\langle z,\gamma\rangle + \int_{\mathbb{R}^d}\big(e^{i\langle z,x\rangle}-1-i\langle x,z\rangle c(x)\big)\,\nu(dx)\Big);$$
2. $\gamma_n\to\gamma$ in $\mathbb{R}^d$;
3. for all continuous $f$ vanishing at the origin, $\int f(x)\,\nu_n(dx)\to\int f(x)\,\nu(dx)$;
4. for $\varepsilon>0$, $\langle z,A_{n,\varepsilon}z\rangle := \langle z,A_nz\rangle + \int_{|x|<\varepsilon}\langle x,z\rangle^2\,\nu_n(dx)$ satisfies
$$\lim_{\varepsilon\to0^+}\limsup_{n\to\infty}\big|\langle z,A_{n,\varepsilon}z\rangle-\langle z,Az\rangle\big| = 0.$$
*Proof (sketch).* We only show that $\mu_n\rightharpoonup\mu$ implies 1–4. Define $\rho_n(dx):=(|x|^2\wedge1)\,\nu_n(dx)$; then we can show that this collection of measures is tight by showing
$$\sup_n\int_{|x|\geq1/h}\rho_n(dx) \leq \sup_n\int_{\mathbb{R}^d}\Big(1-\prod_{i=1}^d\frac{\sin(hx_i)}{hx_i}\Big)\nu_n(dx) \to 0 \quad\text{as }h\to0,$$
and that the total masses are uniformly bounded,
$$\sup_n\int\rho_n(dx) = \sup_n\int(|x|^2\wedge1)\,\nu_n(dx) \leq -C\sup_n\int_{|z|\leq1}\log|\varphi_n(z)|\,dz < \infty.$$
Then by Prokhorov there is a weakly convergent subsequence $\rho_n\rightharpoonup\rho$; define $\nu:=(|x|^2\wedge1)^{-1}\rho$ on $\mathbb{R}^d\setminus\{0\}$. Then for any continuous $f$ vanishing at zero we have
$$\int f\,\nu_n(dx) = \int(|x|^2\wedge1)^{-1}f\,\rho_n(dx) \to \int(|x|^2\wedge1)^{-1}f\,\rho(dx) = \int f\,\nu(dx).$$
We then have that
$$\log(\varphi_n(z)) = -\frac{1}{2}\langle z,A_nz\rangle + i\langle\gamma_n,z\rangle + \int\big(e^{i\langle z,x\rangle}-1-i\langle x,z\rangle c(x)\big)\,\nu_n(dx) = -\frac{1}{2}\langle z,A_{n,\varepsilon}z\rangle + i\langle\gamma_n,z\rangle + I_{n,\varepsilon} + J_{n,\varepsilon},$$
where
$$I_{n,\varepsilon} = \int_{|x|\leq\varepsilon}\Big(e^{i\langle z,x\rangle}-1-i\langle x,z\rangle c(x)+\frac{1}{2}\langle x,z\rangle^2\Big)\nu_n(dx),\qquad J_{n,\varepsilon} = \int_{|x|>\varepsilon}\big(e^{i\langle z,x\rangle}-1-i\langle x,z\rangle c(x)\big)\,\nu_n(dx).$$
By Taylor's theorem $\lim_{\varepsilon\to0^+}\limsup_{n\to\infty}I_{n,\varepsilon}=0$. For almost every $\varepsilon$
$$J_{n,\varepsilon} \to \int_{|x|>\varepsilon}\big(e^{i\langle z,x\rangle}-1-i\langle x,z\rangle c(x)\big)\,\nu(dx),$$
hence $\lim_{\varepsilon\to0^+}\limsup_{n\to\infty}J_{n,\varepsilon} = \int_{|x|>0}\big(e^{i\langle z,x\rangle}-1-i\langle x,z\rangle c(x)\big)\,\nu(dx)$. We know that $\log(\varphi_n(z))\to\log(\varphi(z))$; hence it follows that $\langle\gamma_n,z\rangle\to\langle\gamma,z\rangle$ for some $\gamma$ and $\lim_{\varepsilon\to0}\lim_{n\to\infty}\langle z,A_{n,\varepsilon}z\rangle=\langle z,Az\rangle$. ∎
Theorem 2.3 (Lévy–Khintchine formula). If $\mu$ is infinitely divisible then there exist a non-negative, symmetric matrix $\Sigma$, $\gamma \in \mathbb{R}^d$ and a σ-finite measure $\nu$ satisfying $\int_{\mathbb{R}^d}(|x|^2 \wedge 1)\,\nu(dx) < \infty$ such that
\[
\varphi_X(z) = \exp\Big(-\frac{1}{2}\langle z, \Sigma z\rangle + i\langle \gamma, z\rangle + \int_{\mathbb{R}^d}\big(e^{i\langle z,x\rangle} - 1 - i\langle z,x\rangle\chi_D(x)\big)\,\nu(dx)\Big)
\]
where $D = \{|x| \le 1\}$.
Proof. Let $\mu$ be an arbitrary infinitely divisible distribution. We want to approximate $\mu$ by a compound Poisson random variable.
Let $\mu_n$ be the $n$th convolution root of $\mu$ and take $\{X_k^{(n)}\}_{k=1}^{N(n)}$ i.i.d. distributed according to $\mu_n$, where $N \sim PP(1)$ is a rate-one Poisson process, so $N(n) \sim \mathrm{Poi}(n)$.
Define $X^{(n)} := \sum_{k=1}^{N(n)} X_k^{(n)}$.
If we let $\beta^{(n)}$ denote the law of $X^{(n)}$ then the characteristic functions of $\beta^{(n)}$ have the correct form and hence so does the limit, so we want to show that $\beta^{(n)}$ converges weakly to $\mu$.
Denote by $\varphi$ the characteristic function of $\mu$; then $\varphi^{1/n}$ is the characteristic function of $\mu_n$, and
\[
\varphi_{\beta^{(n)}}(z) = \exp\big(n(\varphi^{1/n}(z) - 1)\big) = \exp\Big(n\Big(\exp\Big(\frac{1}{n}\log(\varphi(z))\Big) - 1\Big)\Big),
\]
with
\[
n\Big(\exp\Big(\frac{1}{n}\log(\varphi(z))\Big) - 1\Big) \to \log(\varphi(z)).
\]
So $\varphi_{\beta^{(n)}}$ converges pointwise to $\varphi$, and hence $\beta^{(n)} \Rightarrow \mu$.
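As a quick numerical illustration of this approximation (a minimal sketch with illustrative parameters, not part of the notes): take the target $\mu = N(0,1)$, whose $n$th convolution root is $N(0,1/n)$, and check that the compound Poisson samples already have mean $\approx 0$ and variance $\approx 1$ for moderate $n$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Approximate mu = N(0, 1) by the compound Poisson law beta^(n): a sum of
# N(n) ~ Poi(n) i.i.d. draws from the n-th convolution root mu_n = N(0, 1/n).
n, samples = 50, 20_000
N = rng.poisson(n, size=samples)                 # number of summands per sample
jumps = rng.normal(0.0, np.sqrt(1.0 / n), size=(samples, N.max()))
mask = np.arange(N.max()) < N[:, None]           # keep only the first N_k jumps
approx = (jumps * mask).sum(axis=1)              # samples from beta^(n)

print(approx.mean(), approx.var())               # close to 0 and 1
```

The variance works out because a centred compound Poisson sum has variance $\mathbb{E}[N]\cdot\mathrm{Var}(\xi) = n \cdot \frac1n = 1$.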
Remark 2.3.
1. $I := \int_{\mathbb{R}^d}\big(e^{i\langle z,x\rangle} - 1 - i\langle z,x\rangle\chi_D(x)\big)\,\nu(dx)$ converges absolutely since
\[
\big|e^{i\langle z,x\rangle} - 1 - i\langle z,x\rangle\chi_D(x)\big| \le C_z(|x|^2 \wedge 1).
\]
2. $\Sigma, \gamma, \nu$ are unique and are called the characteristic triple of $\mu$.
3. There is a choice in the renormalisation term $-i\langle z,x\rangle\chi_D(x)$. In particular we may use $-i\langle z,x\rangle c(x)$ where $c \approx 1$ around $0$ and $c(x) = o(|x|^{-1})$ at $\infty$. Changes in $c$ affect the drift term $\gamma$ only.
4. If a Lévy measure satisfies $\int (|x| \wedge 1)\,\nu(dx) < \infty$ then we can choose $c = 0$; then $I$ corresponds to processes with paths of finite variation.
Corollary 2.3. Every infinitely divisible distribution is a weak limit of compound Poisson distributions.
Theorem 2.4. Let $A$ be a symmetric, non-negative matrix, $\gamma \in \mathbb{R}^d$ and $\nu$ a Lévy measure. Then there exists a Lévy process $X$ such that $X$ has an infinitely divisible distribution with characteristic triple $(A, \gamma, \nu)$.
Proof. Let $B_t$ be a $d$-dimensional Brownian motion.
Define $X_t^{(1)} := A^{1/2}B_t + \gamma t$, a scaled Brownian motion with drift, and $\nu^{(1)} := \nu|_{B(0,1)^c}$, which is a finite measure since $\nu$ is finite on $B(0,1)^c$.
Define $X_t^{(2)}$ to be the compound Poisson process with Lévy measure $\nu^{(1)}$.
Let $A_k := \{x : 2^{-(k+1)} < |x| \le 2^{-k}\}$ and let $\tilde X_t^{3,k}$ be independent compound Poisson processes with jump measure $\nu|_{A_k}$.
We can then define $X_t^{3,k} := \tilde X_t^{3,k} - t\int_{A_k} x\,\nu(dx)$, a compensated compound Poisson process. Notice that $X_t^{3,k}$ is a martingale since $\mathbb{E}[\tilde X_t^{3,k}] = t\int_{A_k} x\,\nu(dx)$.
For an increasing sequence $\{k_i\}_{i=1}^\infty$ we have that $\sum_{i=k_1}^{k_l} X_t^{3,i}$ is a martingale and by independence
\[
\mathbb{E}\Big[\Big(\sum_{i=k_1}^{k_l} X_t^{3,i}\Big)^2\Big] = \sum_{i=k_1}^{k_l}\mathbb{E}\big[(X_t^{3,i})^2\big] = t\int_{\bigcup_{i=k_1}^{k_l} A_i} x^2\,\nu(dx).
\]
So by Doob's maximal inequality
\[
\mathbb{E}\Big[\sup_{s\le t}\Big(\sum_{i=k_1}^{k_l} X_s^{3,i}\Big)^2\Big] \le 4t\int_{\bigcup_{i=k_1}^{k_l} A_i} x^2\,\nu(dx),
\]
and so the sequence is Cauchy and hence converges. Thus $X_t := X_t^{(1)} + X_t^{(2)} + \sum_{k=1}^\infty X_t^{3,k}$ is a Lévy process with characteristic triple $(A, \gamma, \nu)$.
Example 2.1. Consider $\nu(dx) = \theta_\alpha \frac{1}{|x|^{1+\alpha}}\,dx$ on $\mathbb{R}\setminus\{0\}$, which is a Lévy measure for $\alpha \in (0, 2)$.
This has characteristic exponent
\[
-\psi(z) = \int_{\mathbb{R}}\big(e^{izx} - 1 - izx\chi_{\{|x|\le1\}}\big)\,\theta_\alpha\frac{1}{|x|^{1+\alpha}}\,dx
= -\theta_\alpha\int_{\mathbb{R}}(1 - \cos(zx))\frac{1}{|x|^{1+\alpha}}\,dx
= -\theta_\alpha|z|^\alpha\int_{\mathbb{R}}(1 - \cos(x))\frac{1}{|x|^{1+\alpha}}\,dx,
\]
using the symmetry of $\nu$ and the substitution $x \mapsto x/|z|$. Choosing $\theta_\alpha$ such that $\int(1 - \cos(x))\frac{1}{|x|^{1+\alpha}}\,dx = 1$ gives us that the characteristic function of the infinitely divisible distribution with this Lévy measure is $e^{-|z|^\alpha}$.
This implies that if $X_i \overset{\text{i.i.d.}}{\sim} \mu_\alpha$ and $S_n = \sum_{i=1}^n X_i$, then $S_n \overset{d}{=} n^{1/\alpha}X_1$ since
\[
\varphi_{S_n}(z) = e^{-n|z|^\alpha} = e^{-|n^{1/\alpha}z|^\alpha} = \varphi_{X_1}(n^{1/\alpha}z).
\]
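The scaling $\psi(z) = \psi(1)|z|^\alpha$ can be checked numerically. The sketch below (an illustrative quadrature on a truncated grid, not part of the notes) evaluates the characteristic exponent directly from the Lévy measure for $\alpha = 1.5$ and verifies $\psi(2)/\psi(1) \approx 2^\alpha$:

```python
import numpy as np

alpha = 1.5
x = np.logspace(-8, 3, 200_001)   # truncated grid for the singular Lévy measure

def psi(z):
    """psi(z) = 2 * integral_0^inf (1 - cos(z x)) x^(-1-alpha) dx, truncated."""
    y = (1.0 - np.cos(z * x)) * x ** (-1.0 - alpha)
    return 2.0 * np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))  # trapezoid rule

ratio = psi(2.0) / psi(1.0)
print(ratio, 2.0 ** alpha)        # the two values should nearly agree
```

The log-spaced grid handles the integrable singularity at $0$ and the heavy tail at $\infty$; the truncation errors are small for $\alpha$ away from the endpoints of $(0,2)$.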
Definition 2.8. A probability measure $\mu$ on $\mathbb{R}^d$ is called strictly $\alpha$-stable if for all $n$
\[
\varphi_\mu(z)^n = \varphi_\mu(n^{1/\alpha}z)
\]
and $\alpha$-stable if
\[
\varphi_\mu(z)^n = \varphi_\mu(n^{1/\alpha}z)e^{i\langle c_n, z\rangle}.
\]
Theorem 2.5. The only non-trivial $2$-stable distributions are Gaussian, and the only non-trivial $\alpha$-stable distributions for $\alpha \in (0, 2)$ are infinitely divisible with Lévy measure (in polar coordinates)
\[
\nu(d\theta, dr) = \pi(d\theta)\frac{1}{r^{1+\alpha}}\,dr.
\]
Definition 2.9. A subordinator is a Lévy process on $\mathbb{R}$ which only takes non-negative values.
Remark 2.4. A subordinator has no Gaussian part, non-negative drift, and its Lévy measure integrates $|x| \wedge 1$.
Definition 2.10. The Laplace transform of a subordinator is
\[
\mathbb{E}[e^{-\lambda X_t}] = e^{-t\phi(\lambda)}
\]
where the cumulant $\phi$ takes the form $\phi(\lambda) = \lambda\gamma + \int (1 - e^{-\lambda x})\,\nu(dx)$.
Lemma 2.12. If $X_t$ is a Lévy process with characteristic exponent $\psi$ and $T_t$ an independent subordinator, then $Y_t := X_{T_t}$ is a Lévy process with
\[
\varphi_{Y_1}(z) = \mathbb{E}_{T_1}\big[\mathbb{E}[e^{i\langle X_{T_1}, z\rangle} \mid T_1]\big] = e^{-\phi(\psi(z))}.
\]
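A classical instance of subordination: running Brownian motion at an independent $1/2$-stable subordinator (the first-passage times of a second Brownian motion) yields a Cauchy process. A minimal Monte Carlo sketch, using the fact that $T_1 \overset{d}{=} 1/Z^2$ for $Z$ standard normal (parameters illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Y_1 = B_{T_1}: Brownian motion evaluated at an independent 1/2-stable
# subordinator time.  T_1 has the law of the first-passage time of an
# auxiliary Brownian motion to level 1, i.e. T_1 = 1/Z^2 in distribution.
n = 200_000
Z = rng.standard_normal(n)
T = 1.0 / Z ** 2                          # subordinator at time 1
Y = np.sqrt(T) * rng.standard_normal(n)   # given T_1, B_{T_1} is N(0, T_1)

# Y should be standard Cauchy, for which P(|Y| <= 1) = 1/2.
print(np.mean(np.abs(Y) <= 1.0))
```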
3 Markov Processes
Definition 3.1. A Markov kernel $N$ on a measure space $(E, \mathcal{E})$ is a mapping $N : E \times \mathcal{E} \to [0, 1]$ such that
1. for fixed $x \in E$, $A \mapsto N(x, A)$ is a probability measure;
2. for fixed $A \in \mathcal{E}$, $x \mapsto N(x, A)$ is measurable.
Proposition 3.1. Let $N$ be a Markov kernel. Then
1. $N$ acts on non-negative functions $f$ by $Nf(x) = \int N(x, dy)f(y)$;
2. $N$ acts on probability measures $\mu$ by $\mu N(A) = \int N(x, A)\,\mu(dx)$;
3. if $M$ is another Markov kernel then $NM(x, A) = \int N(x, dy)M(y, A)$.
Definition 3.2. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space with filtration $\mathcal{F}_t$ and let $N_{s,t}$ be a family of Markov kernels. A Markov process with transition kernels $N_{s,t}$ is an adapted process $X_t$ such that for all $t > s$ and all non-negative, measurable $f$,
\[
\mathbb{E}[f(X_t) \mid \mathcal{F}_s] = N_{s,t}f(X_s).
\]
Lemma 3.1 (Chapman–Kolmogorov). For $r < s < t$ we have $N_{r,t} = N_{r,s}N_{s,t}$.
Proof. Let $f$ be non-negative and measurable. Then
\[
N_{r,t}f(X_r) = \mathbb{E}[f(X_t) \mid \mathcal{F}_r] = \mathbb{E}\big[\mathbb{E}[f(X_t) \mid \mathcal{F}_s] \mid \mathcal{F}_r\big] = \mathbb{E}[N_{s,t}f(X_s) \mid \mathcal{F}_r] = N_{r,s}N_{s,t}f(X_r).
\]
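In the homogeneous discrete-time case the kernels are powers of one transition matrix, and Chapman–Kolmogorov is just $P^{s-r}P^{t-s} = P^{t-r}$. A sketch on a toy 3-state chain (the matrix is illustrative):

```python
import numpy as np

# Homogeneous case: N_{s,t} = P^(t-s) for a single transition matrix P.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])

N_rs = np.linalg.matrix_power(P, 2)    # r -> s in 2 steps
N_st = np.linalg.matrix_power(P, 3)    # s -> t in 3 steps
N_rt = np.linalg.matrix_power(P, 5)    # r -> t in 5 steps

print(np.allclose(N_rs @ N_st, N_rt))  # Chapman-Kolmogorov: True
```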
Definition 3.3. A family of Markov kernels which satisfies the Chapman–Kolmogorov equations is called a transition function.
Definition 3.4. A transition function $N_{s,t}$ is called homogeneous if $N_{s,t} = N_{s+h,t+h}$ for all $h \ge 0$.
Theorem 3.1. Let $x \in E$ and $N_{s,t}$ a transition function. Then there exists a Markov process $X_t$ for $N_{s,t}$ with $X_0 = x$ a.s.
Definition 3.5. Let $B$ be a Banach space. A family $T_t$ of bounded linear operators on $B$ is called a $C_0$ semi-group if
1. $T_0 = \mathrm{Id}$;
2. $T_tT_s = T_{t+s}$ for $s, t \ge 0$;
3. $\lim_{t\to0}\|T_tx - x\| = 0$ for all $x \in B$.
Remark 3.1.
1. If $B = \mathbb{R}^n$ then semi-groups are of the form $T_t = e^{tA}$ for some $A \in \mathbb{R}^{n\times n}$.
2. Intuitively a homogeneous transition family should give a semi-group on a space of measurable functions by setting $T_tf = N_{0,t}f$.
3. We denote by $C_0(E)$ the space of continuous functions on $E$ which converge to $0$ as $x \to \infty$, i.e. for all $\varepsilon > 0$ there exists $K_\varepsilon$ compact such that $|f(x)| \le \varepsilon$ for $x \in K_\varepsilon^c$.
Definition 3.6. Let $E$ be a locally compact metric space. A Feller semi-group is a $C_0$ semi-group on $C_0(E)$ satisfying $0 \le f \le 1 \implies 0 \le T_tf \le 1$.
Remark 3.2. It suffices that $T_tf \in C_0$ for $f \in C_0$ and that $T_tf$ converges pointwise to $f$ as $t \to 0$.
Corollary 3.1. Every Lévy process is a Feller process.
Proof. For $X_t$ Lévy we have $T_tf(x) = \mathbb{E}[f(X_t + x)]$.
If $x_n \to x$ then $f(X_t + x_n) \to f(X_t + x)$ almost surely, hence by the dominated convergence theorem $T_tf$ is continuous.
Similarly, if $x_n \to \infty$ then $f(X_t + x_n) \to 0$ almost surely, so $T_tf \in C_0$. Finally, as $t \to 0^+$ the cadlag paths give $f(X_t + x) \to f(x)$ almost surely, so $T_tf \to f$ pointwise.
Theorem 3.2. Let $T_t$ be a Feller semi-group and $X_t$ the associated process. Then
1. $X_t$ has a cadlag modification;
2. $X_t$ satisfies Blumenthal's 0–1 law;
3. $X_t$ satisfies the strong Markov property.
Definition 3.7. The detailed balance equations for $\mu, K$ are
\[
\mu(x)K(x, y) = \mu(y)K(y, x) \qquad \forall x, y \in E.
\]
Definition 3.8. Let $X_n$ be a Markov chain with transition kernel $K(\cdot,\cdot)$ on a discrete state space $E$ and let $\mu$ be a measure on $E$. Then $X_n$ is reversible with respect to $\mu$ if the detailed balance equations hold.
Furthermore, if $\mu$ is a probability measure then $\mu$ is invariant for $X$.
We then say that $\mu$ is symmetric if for all $t > 0$
\[
\int f(x)T_tg(x)\,\mu(dx) = \int g(x)T_tf(x)\,\mu(dx).
\]
Lemma 3.2. For discrete state spaces the detailed balance equations hold if and only if the measure is symmetric.
Lemma 3.3. A Lévy process is reversible with respect to the Lebesgue measure if and only if $X_t$ has the same distribution as $-X_t$.
Proof. Let $f, g$ be bounded with compact support. Then
\[
\int P_tf(x)g(x)\,dx = \int \mathbb{E}[f(x + X_t)]g(x)\,dx = \mathbb{E}\Big[\int f(x + X_t)g(x)\,dx\Big] = \mathbb{E}\Big[\int f(x)g(x - X_t)\,dx\Big] = \int f(x)\mathbb{E}[g(x - X_t)]\,dx = \int f(x)\tilde P_tg(x)\,dx
\]
where $\tilde P_t$ is the transition semi-group of $-X_t$.
Definition 3.9. Let $P_t$ be a Feller semi-group on $C_0(E)$. The generator $(A, D_A)$ of $P_t$ is the operator
\[
D_A := \Big\{f \in C_0 : \lim_{t\to0^+}\frac{1}{t}(P_tf - f) \text{ exists in } C_0\Big\}, \qquad Af := \lim_{t\to0^+}\frac{1}{t}(P_tf - f).
\]
Example 3.1. Let $X_t$ be a compound Poisson process with jump distribution $\nu$ and intensity $\lambda$. Then $D_A = C_0(E)$ and $Af = \lambda(f * \nu - f)$.
Proposition 3.2. If $f \in D_A$ then
1. $P_tf \in D_A$ for all $t \ge 0$;
2. $\frac{\partial}{\partial t}P_tf = AP_tf = P_tAf$;
3. $P_tf - f = \int_0^t P_sAf\,ds = \int_0^t AP_sf\,ds$.
Definition 3.10. We say that an operator $A$ is closed if whenever $f_n \in D_A$ with $f_n \to f$ and $g_n := Af_n \to g$, we have $f \in D_A$ and $g = Af$.
Proposition 3.3. For the generator $(D_A, A)$ we have that
1. $D_A$ is dense in $C_0(E)$;
2. $A$ is a closed operator.
Proof. Let $f \in C_0(E)$ and define
\[
A_hf = \frac{1}{h}(P_hf - f), \qquad B_sf = \frac{1}{s}\int_0^s P_tf\,dt.
\]
Then
\[
A_hB_sf = \frac{1}{sh}\Big(\int_0^s P_{t+h}f\,dt - \int_0^s P_tf\,dt\Big) = \frac{1}{sh}\Big(\int_s^{s+h} P_tf\,dt - \int_0^h P_tf\,dt\Big),
\]
which converges to $\frac{1}{s}(P_sf - f)$ as $h \to 0$, so $B_sf \in D_A$ for any $s$; and $B_sf \to f$ as $s \to 0$, so indeed $D_A$ is densely defined.
It remains to show that $A$ is closed. Take $f_n \in D_A$ with $f_n \to f$ and $g_n := Af_n \to g$. Then
\[
P_tf - f = \lim_{n\to\infty}(P_tf_n - f_n) = \lim_{n\to\infty}\int_0^t P_sAf_n\,ds = \lim_{n\to\infty}\int_0^t P_sg_n\,ds = \int_0^t P_sg\,ds,
\]
hence
\[
\Big\|\frac{1}{t}(P_tf - f) - g\Big\|_\infty = \Big\|\frac{1}{t}\int_0^t (P_sg - g)\,ds\Big\|_\infty \le \frac{1}{t}\int_0^t \|P_sg - g\|_\infty\,ds,
\]
which converges to $0$ as $t \to 0^+$ by strong continuity, so $f \in D_A$ and $Af = g$.
Theorem 3.3. Let $X_t$ be a Feller process with semi-group $P_t$ and generator $A$. Then for any $f \in D_A$
\[
M_t^f := f(X_t) - f(X_0) - \int_0^t Af(X_s)\,ds
\]
is a martingale with respect to the natural filtration.
Proof. $M_t^f$ is bounded and adapted by definition. Moreover,
\[
\mathbb{E}[M_t^f \mid \mathcal{F}_s] = M_s^f + \mathbb{E}\Big[f(X_t) - f(X_s) - \int_s^t Af(X_r)\,dr \,\Big|\, \mathcal{F}_s\Big] = M_s^f + P_{t-s}f(X_s) - f(X_s) - \int_s^t P_{r-s}Af(X_s)\,dr = M_s^f.
\]
Remark 3.3. Conversely, if for a given function $f \in C_0$ there exists $g \in C_0$ such that $f(X_t) - f(X_0) - \int_0^t g(X_r)\,dr$ is a martingale, then $f \in D_A$ and $g = Af$.
Theorem 3.4 (Positive maximum principle). If $f \in D_A$ and $z_0 \in E$ are such that $f(z_0) = \sup_{z\in E} f(z) \ge 0$, then $Af(z_0) \le 0$.
Example 3.2. For a Lévy process with semi-group $P_t$ acting on Schwartz functions we have $\mathcal{F}(Af)(\xi) = \psi(-\xi)\mathcal{F}f(\xi)$, and in general
\[
Af(x) = \sum_{i,j=1}^n A_{i,j}\,\partial_i\partial_j f(x) - \sum_{i=1}^n b_i\,\partial_i f(x) + \int_{\mathbb{R}^n}\big(f(x + y) - f(x) - \langle y, \nabla f(x)\rangle\chi_D(y)\big)\,\nu(dy).
\]
Remark 3.4. If $P_t$ is symmetric then the generator is self-adjoint on $L^2(\nu)$.
We will often replace $A$ by its quadratic form
\[
\mathcal{E}(f) := -(f, Af)_{L^2(\nu)} = -\lim_{t\to0^+}\frac{1}{t}\big((P_tf, f)_{L^2(\nu)} - (f, f)_{L^2(\nu)}\big).
\]
Lemma 3.4. On a discrete space,
\[
\mathcal{E}(f) = \lim_{t\to0^+}\frac{1}{2t}\sum_{x,y\in X}(f(y) - f(x))^2 P_t(x, y)\nu(x).
\]
Proof.
\[
\mathcal{E}(f) = -\lim_{t\to0^+}\frac{1}{t}\Big(\sum_{x,y\in X} f(x)P_t(x,y)f(y)\nu(x) - \sum_{x\in X} f(x)^2\nu(x)\Big)
= -\lim_{t\to0^+}\frac{1}{2t}\Big(\sum_{x,y\in X} f(x)P_t(x,y)(f(y) - f(x))\nu(x) + \sum_{x,y\in X} f(x)P_t(x,y)(f(y) - f(x))\nu(x)\Big).
\]
Swapping the roles of $x$ and $y$ in the second sum and then using reversibility, $P_t(y,x)\nu(y) = P_t(x,y)\nu(x)$, gives
\[
\mathcal{E}(f) = -\lim_{t\to0^+}\frac{1}{2t}\Big(\sum_{x,y\in X} f(x)P_t(x,y)(f(y) - f(x))\nu(x) - \sum_{x,y\in X} f(y)P_t(x,y)(f(y) - f(x))\nu(x)\Big)
= \lim_{t\to0^+}\frac{1}{2t}\sum_{x,y\in X}(f(y) - f(x))^2 P_t(x,y)\nu(x).
\]
Definition 3.11. A closed, densely defined form $\mathcal{E}$ on $L^2(\nu)$, for $\nu$ σ-finite, is called Markovian if for all $f \in D_{\mathcal{E}}$ we have
1. $g = (f \vee 0) \wedge 1 \in D_{\mathcal{E}}$;
2. $\mathcal{E}(g) \le \mathcal{E}(f)$.
A quadratic form with these properties is called a Dirichlet form.
3.1 Random Conductance Model
Definition 3.12. Consider a locally finite, connected graph $(X, E)$ where for every $(x, y) \in E$ we have a conductance $\mu_{x,y} = \mu_{y,x} \ge 0$ representing the flow between $x$ and $y$; then $\mu_x := \sum_{y\in X}\mu_{x,y}$ is the flow out of $x$.
We write $x \sim y$ if $(x, y) \in E$ and define the Dirichlet form as
\[
\mathcal{E}(f) = \frac{1}{2}\sum_{x\sim y}(f(x) - f(y))^2\mu_{x,y}, \qquad \mathcal{E}(f, g) = \frac{1}{2}\sum_{x\sim y}(f(x) - f(y))(g(x) - g(y))\mu_{x,y},
\]
and the Laplacian as
\[
Lf(x) = \sum_{y\sim x}\frac{\mu_{x,y}}{\mu_x}(f(y) - f(x)).
\]
Lemma 3.5. Fix $x_0 \in X$. Then $\|f\|_{H^1}^2 := \mathcal{E}(f) + f(x_0)^2$ defines a norm on the Hilbert space
\[
H^1 := \{f : \mathcal{E}(f) < \infty\}.
\]
Lemma 3.6. $\mathcal{E}(f) \le 2\sum_x f(x)^2\mu_x$.
Proof.
\[
\mathcal{E}(f) = \frac{1}{2}\sum_{x\sim y}(f(x) - f(y))^2\mu_{x,y} \le \sum_{x\sim y}(f(x)^2 + f(y)^2)\mu_{x,y} \le 2\sum_x f(x)^2\mu_x.
\]
Lemma 3.7. For $f, g \in L^2(\mu)$ we have
\[
-(Lf, g)_{L^2(\mu)} = \mathcal{E}(f, g) = (f, -Lg)_{L^2(\mu)}.
\]
Proof.
\[
-\sum_{x\in X}\sum_{y\sim x}\frac{\mu_{x,y}}{\mu_x}(f(y) - f(x))g(x)\mu_x = \frac{1}{2}\sum_{x\sim y}\mu_{x,y}(f(y) - f(x))(g(y) - g(x)).
\]
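This integration-by-parts identity is easy to verify numerically on a small network (a sketch with random illustrative conductances):

```python
import numpy as np

rng = np.random.default_rng(3)

# Check -(Lf, g)_{L^2(mu)} = (1/2) sum_{x,y} mu_{x,y} (f(x)-f(y))(g(x)-g(y))
# on the complete graph with 4 vertices and random conductances.
n = 4
C = rng.uniform(0.5, 2.0, size=(n, n))
w = np.triu(C, 1) + np.triu(C, 1).T          # symmetric conductances, zero diagonal
mu = w.sum(axis=1)                           # mu_x = sum_y mu_{x,y}
f, g = rng.standard_normal(n), rng.standard_normal(n)

Lf = (w @ f - mu * f) / mu                   # Lf(x) = sum_y (mu_{x,y}/mu_x)(f(y)-f(x))
lhs = -np.sum(Lf * g * mu)                   # -(Lf, g)_{L^2(mu)}
rhs = 0.5 * np.sum(w * (f[:, None] - f[None, :]) * (g[:, None] - g[None, :]))
print(np.isclose(lhs, rhs))                  # True
```

Here the double sum runs over ordered pairs, which is why the factor $\frac12$ appears.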
Remark 3.5. The natural choice of the Laplacian is given by
\[
\tilde Lf(x) = \sum_{y\sim x}\mu_{x,y}(f(y) - f(x)),
\]
which is self-adjoint with respect to the counting measure.
Definition 3.13. We can define the following three random walks using this set-up.
1. Discrete time random walk: $\mathbb{P}(Y_{n+1} = y \mid Y_n = x) = \frac{\mu_{x,y}}{\mu_x}$.
2. Constant speed random walk: let $Y_n$ be the discrete time random walk and $N_t \sim PP(1)$; define $X_t = Y_{N_t}$.
3. Variable speed random walk: let $Y_n$ be the discrete time random walk and define $X_t$ to have departure rate $\mu_x$ from $x$.
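For the discrete time walk, detailed balance holds with $\mu$ because both sides equal $\mu_{x,y}$, and $\mu$ is then invariant. A quick check on a small random conductance graph (sizes and conductances are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

# Random conductances on the complete graph with 5 vertices.
n = 5
C = rng.uniform(0.1, 1.0, size=(n, n))
mu = np.triu(C, 1) + np.triu(C, 1).T         # mu_{x,y} = mu_{y,x} >= 0
mu_x = mu.sum(axis=1)                        # mu_x = sum_y mu_{x,y}
P = mu / mu_x[:, None]                       # P(x, y) = mu_{x,y} / mu_x

# Detailed balance: mu_x P(x,y) = mu_y P(y,x) (both sides equal mu_{x,y}),
# hence mu is invariant: mu P = mu.
print(np.allclose(mu_x[:, None] * P, (mu_x[:, None] * P).T))
print(np.allclose(mu_x @ P, mu_x))
```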
Proposition 3.4. For $A \subset X$ and $f : A \to \mathbb{R}$ bounded, let $\sigma_A = \inf\{n \ge 0 : Y_n \in A\}$. Then $\varphi(x) := \mathbb{E}_x[f(Y_{\sigma_A}) \mid \sigma_A < \infty]$ is a solution to
\[
\begin{cases} Lv = 0 & x \in A^c \\ v|_A = f, \end{cases}
\]
and if $\sigma_A < \infty$ a.s. then it is the unique bounded solution.
Proof. By the Markov property, for $x \in A^c$ we have
\[
\mathbb{E}_x[f(Y_{\sigma_A}) \mid \sigma_A < \infty] = \sum_{y\sim x}\frac{\mu_{x,y}}{\mu_x}\mathbb{E}_y[f(Y_{\sigma_A}) \mid \sigma_A < \infty] = \sum_{y\sim x}\frac{\mu_{x,y}}{\mu_x}\varphi(y).
\]
So indeed $L\varphi = 0$ on $A^c$.
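On a finite graph the proposition turns hitting probabilities into a linear system. A sketch for the path $0, \ldots, n$ with unit conductances, $A = \{0, n\}$, $f(0) = 0$, $f(n) = 1$, where the harmonic solution is the classical gambler's-ruin probability $v(k) = k/n$:

```python
import numpy as np

# Discrete Dirichlet problem on the path 0,...,n with unit conductances:
# Lv = 0 on the interior, v(0) = 0, v(n) = 1.  Known solution: v(k) = k/n.
n = 10
L = np.zeros((n + 1, n + 1))
for k in range(1, n):
    L[k, k - 1] = L[k, k + 1] = 0.5   # mu_{k,k+-1}/mu_k = 1/2 in the interior
    L[k, k] = -1.0
L[0, 0] = L[n, n] = 1.0               # boundary rows enforce v = f
b = np.zeros(n + 1)
b[n] = 1.0
v = np.linalg.solve(L, b)

print(np.allclose(v, np.arange(n + 1) / n))  # True
```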
Definition 3.14. For $A, B \subset X$ with $A \cap B = \emptyset$ we define the effective resistance by
\[
R_{\mathrm{eff}}(A, B)^{-1} = \inf\{\mathcal{E}(f, f) : f \in H^1,\ f|_A = 1,\ f|_B = 0\}.
\]
Proposition 3.5. For the effective resistance $R_{\mathrm{eff}}$ and disjoint $A, B \subset X$ we have:
1. $R_{\mathrm{eff}}$ is symmetric;
2. $R_{\mathrm{eff}}$ is monotonic, i.e. for $A \subset A'$, $B \subset B'$ with $A' \cap B' = \emptyset$ we have $R_{\mathrm{eff}}(A', B') \le R_{\mathrm{eff}}(A, B)$;
3. cutting bonds ($\mu_{x,y} \to 0$) increases $R_{\mathrm{eff}}$;
4. shorting bonds ($\mu_{x,y} \to \infty$) decreases $R_{\mathrm{eff}}$.
Proposition 3.6. If $R_{\mathrm{eff}} \ne 0$ then the infimum is attained by a unique minimiser $\varphi$ solving $L\varphi(x) = 0$ for all $x \in X \setminus (A \cup B)$ with $\varphi|_A = 1$, $\varphi|_B = 0$.
Proof. Take $x_0 \in B$; then $H^1$ with the norm $(\mathcal{E}(f, f) + f(x_0)^2)^{1/2}$ is a Hilbert space and $V = \{f \in H^1 : f|_A = 1, f|_B = 0\}$ is convex and closed, so there exists a unique $\varphi \in V$ of minimum norm.
Let $f$ satisfy $f|_A = 0 = f|_B$. Then $\varphi + \lambda f \in V$, so
\[
\mathcal{E}(\varphi + \lambda f) = \mathcal{E}(\varphi) + 2\lambda\mathcal{E}(f, \varphi) + \lambda^2\mathcal{E}(f) \ge \mathcal{E}(\varphi) \qquad \forall \lambda \in \mathbb{R}.
\]
We therefore have $\mathcal{E}(f, \varphi) = 0$, i.e. $-(f, L\varphi)_{L^2(\mu)} = 0$; since $f$ is arbitrary, $L\varphi = 0$ on $X \setminus (A \cup B)$.
Theorem 3.5. If $A^c$ is finite then for $x_0 \in A^c$ we have $R_{\mathrm{eff}}(x_0, A)^{-1} = \mu_{x_0}\mathbb{P}_{x_0}(\sigma_A < \sigma_{x_0}^+)$, where $\sigma_{x_0}^+ = \inf\{n \ge 1 : Y_n = x_0\}$.
Proof. Let $v(x) = \mathbb{P}_x(\sigma_A < \sigma_{x_0})$, which is the unique solution to the Dirichlet problem with $v(x_0) = 0$, $v|_A = 1$, $Lv(x) = 0$ for all $x \in A^c \setminus \{x_0\}$. Hence, using Lemma 3.7 and that $(1 - v)|_A = 0$,
\[
R_{\mathrm{eff}}(x_0, A)^{-1} = \mathcal{E}(v, v) = \frac{1}{2}\sum_{x\sim y}(v(y) - v(x))\big((1 - v(x)) - (1 - v(y))\big)\mu_{x,y} = \mathcal{E}(-v, 1 - v) = (Lv, 1 - v)_{L^2(\mu)} = \sum_{x\in A^c}(1 - v(x))Lv(x)\mu_x = \mu_{x_0}Lv(x_0),
\]
since $Lv = 0$ on $A^c \setminus \{x_0\}$ and $v(x_0) = 0$. Finally,
\[
\mu_{x_0}Lv(x_0) = \mu_{x_0}\sum_{y\sim x_0}\frac{\mu_{x_0,y}}{\mu_{x_0}}\mathbb{P}_y(\sigma_A < \sigma_{x_0}) = \mu_{x_0}\mathbb{P}_{x_0}(\sigma_A < \sigma_{x_0}^+).
\]
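For a path graph the theorem can be checked against the series formula for resistors. A sketch with illustrative edge conductances $c_k$, computing the escape probability by solving the discrete Dirichlet problem:

```python
import numpy as np

# Path 0,...,n with conductance c[k] on edge (k, k+1).  Resistors in series:
# R_eff(0, {n}) = sum_k 1/c[k].  Compare with Theorem 3.5:
# R_eff(0, A)^{-1} = mu_0 * P_0(sigma_A < sigma_0^+).
c = np.array([2.0, 0.5, 1.0, 4.0])            # illustrative edge conductances
n = len(c)

# v(k) = P_k(sigma_n < sigma_0): weighted harmonic on the interior.
A = np.zeros((n + 1, n + 1))
b = np.zeros(n + 1)
A[0, 0] = A[n, n] = 1.0
b[n] = 1.0
for k in range(1, n):
    mu_k = c[k - 1] + c[k]
    A[k, k - 1], A[k, k + 1], A[k, k] = c[k - 1] / mu_k, c[k] / mu_k, -1.0
v = np.linalg.solve(A, b)

escape = v[1]   # from 0, one step to vertex 1 (its only neighbour), then v(1)
print(1.0 / (c[0] * escape), (1.0 / c).sum())  # both equal R_eff(0, {n})
```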
Remark 3.6. If $A_n$ is an increasing sequence of finite sets converging to $X$ then we define
\[
R_{\mathrm{eff}}(x_0) = \lim_{n\to\infty} R_{\mathrm{eff}}(\{x_0\}, A_n^c),
\]
which exists and is independent of the choice of $A_n$.
Theorem 3.6. For all $x$ we have that
\[
\mathbb{P}_x(\sigma_x^+ = \infty) = (\mu_x R_{\mathrm{eff}}(x))^{-1}.
\]
Proof.
\[
\mathbb{P}_x(\sigma_x^+ = \infty) = \lim_{n\to\infty}\mathbb{P}_x(\sigma_{A_n^c} < \sigma_x^+) = \lim_{n\to\infty}\frac{R_{\mathrm{eff}}(\{x\}, A_n^c)^{-1}}{\mu_x} = (\mu_x R_{\mathrm{eff}}(x))^{-1}.
\]
Definition 3.15. For an infinite connected graph $X$, a Markov chain is recurrent if for all $x$ we have $\mathbb{P}_x(\sigma_x^+ < \infty) = 1$, and transient otherwise.
Corollary 3.2. It suffices that $\mathbb{P}_x(\sigma_x^+ < \infty) = 1$ for some $x$, and transience is equivalent to $R_{\mathrm{eff}}(x) < \infty$.
3.2 Heat Kernel Estimates
Definition 3.16. For $x, y \in X$ and $n \ge 1$ define
\[
p_n(x, y) = \mathbb{P}_x(Y_n = y)/\mu_y
\]
and for $t \ge 0$
\[
p_t(x, y) = \sum_{n=0}^\infty e^{-t}\frac{t^n}{n!}\,p_n(x, y).
\]
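The continuous-time kernel can be evaluated by truncating the Poisson series, and reversibility then makes $p_t$ symmetric in $x, y$. A sketch on a triangle with illustrative conductances:

```python
import numpy as np

# Conductances mu_01 = 1, mu_02 = 2, mu_12 = 3 on a triangle (illustrative).
mu_e = np.array([[0.0, 1.0, 2.0],
                 [1.0, 0.0, 3.0],
                 [2.0, 3.0, 0.0]])
mu = mu_e.sum(axis=1)
P = mu_e / mu[:, None]                  # discrete-time transition matrix

# p_t via the (truncated) Poisson series sum_n e^{-t} t^n/n! P^n.
t, terms = 2.0, 60
Pt = np.zeros_like(P)
Pn, w = np.eye(3), np.exp(-t)           # P^0 and weight e^{-t} t^0/0!
for n in range(terms):
    Pt += w * Pn
    Pn = Pn @ P
    w *= t / (n + 1)

p = Pt / mu[None, :]                    # p_t(x, y) = P_x(X_t = y)/mu_y
print(np.allclose(p, p.T))              # reversibility: p_t is symmetric
```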
Remark 3.7. For the Gaussian distribution on $\mathbb{R}^d$ we have
\[
p_t(x, y) = \frac{1}{(2\pi t)^{d/2}}\exp\Big(-\frac{|x - y|^2}{2t}\Big).
\]
Proposition 3.7. Let $\mathcal{E}, P_t, \mu$ be the quadratic form, semi-group and measure as before. Then
1. $\|P_tf\|_{L^1(\mu)} \le \|f\|_{L^1(\mu)}$ for $f \in L^1 \cap L^2$;
2. if $u(t) = \|P_tf\|_{L^2(\mu)}^2$ then $u'(t) = -2\mathcal{E}(P_tf, P_tf)$;
3. $\exp\Big(-\frac{2t\,\mathcal{E}(f, f)}{\|f\|_{L^2}^2}\Big) \le \frac{\|P_tf\|_{L^2}^2}{\|f\|_{L^2}^2} \le 1$.
Proof.
1. Using that $P_t$ is self-adjoint (for $f \ge 0$),
\[
\|P_tf\|_{L^1} = (P_tf, 1)_{L^2} = (f, P_t1)_{L^2} \le \|f\|_{L^1}.
\]
2.
\[
u'(t) = \lim_{h\to0}\frac{1}{h}\big((P_{t+h}f, P_{t+h}f) - (P_tf, P_tf)\big) = \lim_{h\to0}\frac{1}{h}(P_{t+h}f + P_tf, P_{t+h}f - P_tf) = \lim_{h\to0}\Big(P_{t+h}f + P_tf, \frac{1}{h}(P_h - I)P_tf\Big) = (2P_tf, LP_tf) = -2\mathcal{E}(P_tf, P_tf).
\]
3. Assume $\|f\|_{L^2} = 1$. Then $t \mapsto \|P_tf\|_{L^2}^2$ is log-convex since
\[
\|P_{\frac{t+s}{2}}f\|_{L^2}^2 = \big(P_{\frac{t+s}{2}}f, P_{\frac{t+s}{2}}f\big) = (P_tf, P_sf) \le \|P_tf\|_{L^2}\|P_sf\|_{L^2}.
\]
So $\frac{d}{dt}\log(\|P_tf\|_{L^2}^2)$ is non-decreasing; in particular
\[
\frac{d}{dt}\log(\|P_tf\|_{L^2}^2) = \frac{1}{\|P_tf\|_{L^2}^2}\,\partial_t\|P_tf\|_{L^2}^2 = -2\frac{\mathcal{E}(P_tf, P_tf)}{\|P_tf\|_{L^2}^2},
\]
which at $t = 0$ equals $-2\mathcal{E}(f, f)$, so by convexity $\log\|P_tf\|_{L^2}^2 \ge -2t\,\mathcal{E}(f, f)$.
Definition 3.17. We say $\mathcal{E}$ satisfies the Nash $\theta$-inequality if there exist $c_1 > 0$ and $\delta \ge 0$ such that for all $f \in H^1 \cap L^1$ we have
\[
\|f\|_{L^2}^{2 + 4/\theta} \le c_1\big(\mathcal{E}(f, f) + \delta\|f\|_{L^2}^2\big)\|f\|_{L^1}^{4/\theta}.
\]
Theorem 3.7. For $\mathcal{E}, P_t, \mu$ the quadratic form, semi-group and measure as usual, $\mathcal{E}$ satisfies the Nash $\theta$-inequality if and only if $P_t(L^1) \subset L^\infty$ and
\[
\|P_t\|_{L^1\to L^\infty} \le c_2 e^{\delta t} t^{-\theta/2}.
\]
Proof. Assume Nash holds (take $\delta = 0$ for simplicity). WLOG let $\|f\|_{L^1} = 1$; we want a bound on $u(t) = \|P_tf\|_{L^2}^2$. Since $\|P_tf\|_{L^1} \le 1$, Nash gives
\[
\partial_t u(t) = -2\mathcal{E}(P_tf, P_tf) \le -c\,u(t)^{1 + 2/\theta}.
\]
Define $v(t) = u(t)^{-2/\theta}$; then $\partial_t v(t) = -\frac{2}{\theta}u^{-2/\theta - 1}\partial_t u \ge c$. Since $v(0) \ge 0$ we get $v(t) \ge ct$ and hence $u(t) \le ct^{-\theta/2}$, so
\[
\|P_t\|_{L^1\to L^\infty} \le \|P_{t/2}\|_{L^1\to L^2}\|P_{t/2}\|_{L^2\to L^\infty} \le ct^{-\theta/2}
\]
by duality.
Now assume the other statement holds, and again let $\|f\|_{L^1} = 1$. By Proposition 3.7,
\[
\exp\Big(-\frac{2t\,\mathcal{E}(f, f)}{\|f\|_{L^2}^2}\Big) \le \frac{\|P_tf\|_{L^2}^2}{\|f\|_{L^2}^2} \le \frac{\|P_tf\|_{L^\infty}\|P_tf\|_{L^1}}{\|f\|_{L^2}^2} \le \frac{ct^{-\theta/2}}{\|f\|_{L^2}^2},
\]
hence
\[
\mathcal{E}(f, f) \ge \frac{1}{2t}\big(\log(t^{\theta/2}\|f\|_{L^2}^2) - \log(c)\big)\|f\|_{L^2}^2.
\]
Optimising over $t$ gives the required result.
Example 3.3. If $\varphi$ is an eigenfunction of $-L$ with eigenvalue $\lambda \ge 1$ then
\[
\|\varphi\|_{L^\infty} \le c\lambda^{\theta/4}\|\varphi\|_{L^2}.
\]
Definition 3.18. For $\Omega \subset X$ finite we define the principal eigenvalue as
\[
\lambda_1(\Omega) = \inf\Big\{\frac{\mathcal{E}(f, f)}{\|f\|_{L^2}^2} : \mathrm{supp}(f) \subset \Omega\Big\}.
\]
Definition 3.19. $\mathcal{E}$ satisfies the Faber–Krahn inequality if
\[
\lambda_1(\Omega) \ge C\mu(\Omega)^{-2/\theta}.
\]
Theorem 3.8. The Nash $\theta$-inequality holds if and only if the Faber–Krahn inequality holds.
Proof. Suppose the Nash inequality holds (with $\delta = 0$):
\[
\|f\|_{L^2}^{2 + 4/\theta} \le c\,\mathcal{E}(f, f)\|f\|_{L^1}^{4/\theta},
\]
hence, dividing through by $\|f\|_{L^2}^{4/\theta}$,
\[
\|f\|_{L^2}^2 \le c\,\mathcal{E}(f, f)\Big(\frac{\|f\|_{L^1}}{\|f\|_{L^2}}\Big)^{4/\theta}.
\]
Suppose $\mathrm{supp}(f) \subset \Omega$; then $\|f\|_{L^1} \le \mu(\Omega)^{1/2}\|f\|_{L^2}$ by Cauchy–Schwarz, so
\[
\|f\|_{L^2}^2 \le c\,\mathcal{E}(f, f)\mu(\Omega)^{2/\theta},
\]
so indeed we have that
\[
\frac{\mathcal{E}(f, f)}{\|f\|_{L^2}^2} \ge c^{-1}\mu(\Omega)^{-2/\theta}.
\]
Now suppose that the Faber–Krahn inequality holds. For $u \ge 0$ and $\lambda > 0$,
\[
\int u^2\,d\mu = \int_{\{u<2\lambda\}} u^2\,d\mu + \int_{\{u\ge2\lambda\}} u^2\,d\mu \le 2\lambda\int u\,d\mu + 4\int (u - \lambda)_+^2\,d\mu.
\]
Applying Faber–Krahn to $(u - \lambda)_+$, which is supported on $\Omega := \{u \ge \lambda\}$ with $\mu(\Omega) \le \|u\|_{L^1}/\lambda$, and using the Markovian property $\mathcal{E}((u - \lambda)_+, (u - \lambda)_+) \le \mathcal{E}(u, u)$,
\[
\int u^2\,d\mu \le c\,\mu(\Omega)^{2/\theta}\,\mathcal{E}((u - \lambda)_+, (u - \lambda)_+) + 2\lambda\|u\|_{L^1} \le c\Big(\frac{\|u\|_{L^1}}{\lambda}\Big)^{2/\theta}\mathcal{E}(u, u) + 2\lambda\|u\|_{L^1}.
\]
Optimising over $\lambda$ gives the required result.
Definition 3.20. For $\theta > 2$ we say that $\mathcal{E}$ satisfies the Sobolev inequality if there exists $c > 0$ such that for all $f$ with finite support we have
\[
\|f\|_{L^{\frac{2\theta}{\theta-2}}}^2 \le c\,\mathcal{E}(f, f).
\]
Corollary 3.3. For $\theta > 2$, if the Sobolev inequality holds then so does the Nash $\theta$-inequality.
Lemma 3.8.
1. If $\inf_{x\sim y}\mu_{x,y} > 0$ then $R_{\mathrm{eff}}(x, y) \le c\,d(x, y)$.
2. $|f(x) - f(y)|^2 \le R_{\mathrm{eff}}(x, y)\,\mathcal{E}(f, f)$.
3. $R_{\mathrm{eff}}$ and $R_{\mathrm{eff}}^{1/2}$ are both metrics.
Proof. There exists a path $x_0, x_1, \ldots, x_n$ such that $(x_i, x_{i+1}) \in E$ for all $i$, $x_0 = x$, $x_n = y$ and $n = d(x, y)$. Cut all edges except for this path; since the resistors are in series the resistances add, and by ellipticity the first result follows.
Suppose $u(x) \ne u(y)$ and write $f = au + b$ with $a, b \in \mathbb{R}$ such that $f(x) = 1$, $f(y) = 0$. We then have
\[
\sup\Big\{\frac{|u(x) - u(y)|^2}{\mathcal{E}(u, u)} : u \in H^1,\ u(x) \ne u(y)\Big\} = \sup\Big\{\frac{1}{\mathcal{E}(f, f)} : f(x) = 1,\ f(y) = 0\Big\} = R_{\mathrm{eff}}(x, y),
\]
so indeed the second result follows.
The third result follows by verifying the properties of a metric directly.
3.3 Green Densities
Definition 3.21. For a stochastic process $Y$ we define the local time as $L(y, n) = \sum_{k=0}^{n-1}\chi_{\{Y_k = y\}}$.
Definition 3.22. For $B \subset X$ finite we define the Green function as
\[
g_B(x, y) = \frac{1}{\mu_y}\mathbb{E}_x[L(y, \tau_B)]
\]
where $\tau_B = \inf\{n : Y_n \in B^c\}$.
Proposition 3.8. For the Green function $g_B$ we have $g_B(x, y) = g_B(y, x) \le g_B(x, x)$ for all $x, y \in B$.
Proposition 3.9.
1. $g_B(x, \cdot)$ is a harmonic function on $B \setminus \{x\}$;
2. if $\mathrm{supp}(f) \subset B$ then $\mathcal{E}(g_B(x, \cdot), f) = f(x)$;
3. $\mathbb{E}_x[\tau_B] = \sum_{y\in B} g_B(x, y)\mu_y$;
4. $R_{\mathrm{eff}}(x, B^c) = g_B(x, x)$.
Proof. Let $v(z) = g_B(x, z)$.
1. For $y \ne x$ we have, conditioning on the previous step,
\[
v(y)\mu_y = \mathbb{E}_x\Big[\sum_{i=1}^{\tau_B - 1}\chi_{\{Y_i = y\}}\Big] = \mathbb{E}_x\Big[\sum_{i=0}^{\tau_B - 2}\sum_{z\in B}\chi_{\{Y_i = z\}}\frac{\mu_{z,y}}{\mu_z}\Big] = \sum_{z\in B}\frac{\mu_{z,y}}{\mu_z}v(z)\mu_z.
\]
We therefore have
\[
v(y) = \sum_{z\in B}\frac{\mu_{y,z}}{\mu_y}v(z),
\]
i.e. $Lv(y) = 0$.
2. The same computation at $y = x$ picks up the visit at time $0$:
\[
v(x)\mu_x - 1 = \mathbb{E}_x[L(x, \tau_B)] - 1 = \sum_{z\in B}\frac{\mu_{z,x}}{\mu_z}v(z)\mu_z,
\]
so $Lv(x) = -\mu_x^{-1}$. Hence for $\mathrm{supp}(f) \subset B$,
\[
\mathcal{E}(f, v) = -(f, Lv)_{L^2(\mu)} = -f(x)(-\mu_x^{-1})\mu_x = f(x).
\]
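On a finite chain the Green function is computable directly: with $P_B$ the transition matrix restricted to $B$, the expected visit counts are $(I - P_B)^{-1}$. A sketch on the path $0, \ldots, 5$ with unit conductances and $B = \{1, 2, 3, 4\}$, checking the symmetry, the exit-time identity, and $g_B(x, x) = R_{\mathrm{eff}}(x, B^c)$:

```python
import numpy as np

# Simple random walk on the path 0,...,5; B = {1,2,3,4}, exit set {0,5}.
P = np.zeros((6, 6))
for k in range(1, 5):
    P[k, k - 1] = P[k, k + 1] = 0.5
B = [1, 2, 3, 4]
mu = np.array([1.0, 2.0, 2.0, 2.0, 2.0, 1.0])   # mu_x = sum of conductances

PB = P[np.ix_(B, B)]
G = np.linalg.inv(np.eye(len(B)) - PB)  # G[x,y] = E_x[L(y, tau_B)]
g = G / mu[B][None, :]                  # g_B(x, y) = G[x,y]/mu_y

print(np.allclose(g, g.T))                            # symmetry
print(np.allclose(G.sum(axis=1), [x * (5 - x) for x in B]))  # E_x[tau_B] = x(5-x)
print(g[0, 0])   # g_B(1,1) = R_eff(1, {0,5}) = 1*4/(1+4) = 0.8 in parallel
```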
Remark 3.8. The first two parts of Proposition 3.9 show that $-\mu_y Lg_B(x, y) = \delta_x(y)$ on $B$.
Lemma 3.9. Let $B(x, r) = \{y : d(x, y) \le r\}$ be the ball of radius $r$ centred at $x$, $V(x, r) = \mu(B(x, r))$ the volume of the ball, $\Omega \subset X$ finite and non-empty, and $r(\Omega) = \max\{r \in \mathbb{N} : \exists x_0 \in \Omega,\ B(x_0, r) \subset \Omega\}$. Then
\[
\lambda_1(\Omega) \ge \frac{c}{r(\Omega)\mu(\Omega)}.
\]
Proof. Take $f$ with $\mathrm{supp}(f) \subset \Omega$, normalised such that $\|f\|_\infty = 1$. Fix $x_0 \in \Omega$ with $|f(x_0)| = 1$; then $\|f\|_{L^2}^2 \le \mu(\Omega)$.
There exists a path $x_0, x_1, \ldots, x_n$ with $x_n \in \Omega^c$, $\{x_i\}_{i=0}^{n-1} \subset \Omega$ and $n = d(x_0, \Omega^c)$. Then, by Cauchy–Schwarz and telescoping,
\[
\mathcal{E}(f, f) \ge \frac{1}{2}\sum_{i=0}^{n-1}(f(x_i) - f(x_{i+1}))^2 \ge \frac{1}{2n}\Big(\sum_{i=0}^{n-1}(f(x_i) - f(x_{i+1}))\Big)^2 = \frac{1}{2n},
\]
hence, since $n \le r(\Omega) + 1$, we have that
\[
\frac{\mathcal{E}(f, f)}{\|f\|_{L^2}^2} \ge \frac{c}{r(\Omega)\mu(\Omega)}.
\]
Lemma 3.10.
\[
p_{2t}(x, x) \ge \frac{\mathbb{P}_x(\tau_B > t)^2}{\mu(B)}.
\]
Proof.
\[
\mathbb{P}_x(\tau_B > t)^2 \le \mathbb{P}_x(Y_t \in B)^2 = \Big(\sum_{y\in B} p_t(x, y)\mu_y\Big)^2 \le \mu(B)\sum_{y\in B} p_t(x, y)^2\mu_y = \mu(B)\,p_{2t}(x, x),
\]
using Cauchy–Schwarz in the second inequality and reversibility together with the semi-group property in the last step.
Proposition 3.10. Let $f_n(0, x) = p_{n+1}(0, x) + p_n(0, x)$ and assume $R_{\mathrm{eff}}(0, y) \le c\,d(0, y)^\alpha$. Then
\[
f_n(0, 0) \le c\,n^{-\frac{D}{D+\alpha}}\Big(c \vee \frac{r^D}{V(0, r)}\Big)
\]
where $n = 2r^{D+\alpha}$.
```