THE UNIVERSITY OF HONG KONG
DEPARTMENT OF STATISTICS AND ACTUARIAL SCIENCE
STAT3911 Financial Economics II (2021-22)

Chapter 1  Probability Theory

1.1 Probability Space

$\Omega$ denotes the sample space. If $A \subseteq \Omega$, $A$ is called a subset of $\Omega$. For $A, B \subseteq \Omega$ we define

$A \cup B = \{\omega : \omega \in A \text{ or } \omega \in B\}$,
$A \cap B = \{\omega : \omega \in A \text{ and } \omega \in B\}$,
$A \setminus B = \{\omega : \omega \in A \text{ but } \omega \notin B\}$,
$\bar{A} = \Omega \setminus A = \{\omega : \omega \notin A\}$.

If $A \cap B = \emptyset$, $A$ and $B$ are called disjoint.

(Figure: Venn diagrams of $A \cup B$, $A \cap B$ and $A \setminus B$.)

De Morgan's laws: $\overline{A \cup B} = \bar{A} \cap \bar{B}$, $\quad \overline{A \cap B} = \bar{A} \cup \bar{B}$.

Let $\mathcal{F} = \{A : A \subseteq \Omega\}$ be a collection of subsets of $\Omega$. If $\mathcal{F}$ satisfies
(i) $\Omega \in \mathcal{F}$;
(ii) $A \in \mathcal{F} \Rightarrow \bar{A} \in \mathcal{F}$;
(iii) $A_1, \ldots, A_n \in \mathcal{F} \Rightarrow \bigcup_{i=1}^{n} A_i \in \mathcal{F}$,
then $\mathcal{F}$ is called a field. If (iii) is replaced by
(iv) $A_1, \ldots, A_n, \ldots \in \mathcal{F} \Rightarrow \bigcup_{i=1}^{\infty} A_i \in \mathcal{F}$,
then $\mathcal{F}$ is called a $\sigma$-field.

Remarks:

1. (iv) $\Rightarrow$ (iii), that is, a $\sigma$-field is a field. Suppose (iv) holds and $A_1, \ldots, A_n \in \mathcal{F}$. From (i) + (ii), $\emptyset = \bar{\Omega} \in \mathcal{F}$. Let $A_{n+1} = A_{n+2} = \cdots = \emptyset$, so $A_1, \ldots, A_n, A_{n+1}, \ldots \in \mathcal{F}$ and
$\bigcup_{i=1}^{n} A_i = \bigcup_{i=1}^{n} A_i \cup \bigcup_{i=n+1}^{\infty} \emptyset = \bigcup_{i=1}^{\infty} A_i \in \mathcal{F}$.

2. $A_1, A_2 \in \mathcal{F} \Rightarrow A_1 \cap A_2 \in \mathcal{F}$.
($\because$ $A_1, A_2 \in \mathcal{F} \overset{\text{(ii)}}{\Longrightarrow} \bar{A}_1, \bar{A}_2 \in \mathcal{F} \overset{\text{(iii) or (iv)}}{\Longrightarrow} \bar{A}_1 \cup \bar{A}_2 \in \mathcal{F} \overset{\text{(ii)}}{\Longrightarrow} \overline{\bar{A}_1 \cup \bar{A}_2} \in \mathcal{F}$, and by De Morgan's law $\overline{\bar{A}_1 \cup \bar{A}_2} = A_1 \cap A_2 \in \mathcal{F}$.)
More generally, $A_1, \ldots, A_n, \ldots \in \mathcal{F} \Rightarrow \bigcap_{i=1}^{\infty} A_i \in \mathcal{F}$.

3. $A, B \in \mathcal{F} \Rightarrow A \setminus B = A \cap \bar{B} \in \mathcal{F}$.

$\sigma$-field generated by a collection of subsets: If $\mathcal{G} = \{A : A \subseteq \Omega\}$ is a collection of subsets of $\Omega$ ($\mathcal{G}$ is not necessarily a $\sigma$-field or a field), we let $\mathcal{F} = \sigma(\mathcal{G})$ denote the smallest $\sigma$-field generated by $\mathcal{G}$. This means that $\mathcal{F}$ is a $\sigma$-field with $\mathcal{G} \subseteq \mathcal{F}$, and any $\sigma$-field $\mathcal{M}$ with $\mathcal{M} \supseteq \mathcal{G}$ satisfies $\mathcal{F} \subseteq \mathcal{M}$.

Examples of $\sigma$-fields:

(1) $\mathcal{F} = \{\emptyset, \Omega\}$ is a $\sigma$-field.
(2) $\mathcal{F} = $ all subsets of $\Omega$ is a $\sigma$-field.
(3) For $A \subseteq \Omega$, $\mathcal{F} = \{\emptyset, \Omega, A, \bar{A}\} = \sigma\{A\}$ is a $\sigma$-field (the $\sigma$-field generated by the set $A$).
(4) If $A_1, A_2, A_3 \subseteq \Omega$ with $A_1 \cup A_2 \cup A_3 = \Omega$ and $A_1, A_2, A_3$ disjoint, then
$\mathcal{F} = \sigma\{A_1, A_2, A_3\} = \{\emptyset, \Omega, A_1, A_2, A_3, A_1 \cup A_2, A_1 \cup A_3, A_2 \cup A_3\}$
is a $\sigma$-field.

$\mathbb{R}$ is the real line. Let $\mathcal{G} = \{(a, b] : a, b \in \mathbb{R}\}$; then $\mathcal{B} = \sigma\{\mathcal{G}\}$ is called the Borel $\sigma$-field, and $A \in \mathcal{B}$ is called a Borel set. For $(a_1, b_1], \ldots, (a_n, b_n], \ldots \in \mathcal{B}$ ($a_i, b_i \in \mathbb{R}$),
$\bigcup_{i=1}^{\infty} (a_i, b_i] \in \mathcal{B}$ and $\bigcap_{i=1}^{\infty} (a_i, b_i] \in \mathcal{B}$.
In particular, taking $a_i = a - \frac{1}{i}$ and $b_i = a$,
$\bigcap_{i=1}^{\infty} \left(a - \tfrac{1}{i}, a\right] = \{a\} \in \mathcal{B}$
$\Rightarrow [a, b] = (a, b] \cup \{a\} \in \mathcal{B}$
$\Rightarrow (a, b) = (a, b] \setminus \{b\} \in \mathcal{B}$.
The Borel $\sigma$-field contains most of the sets we can think of.

Suppose $\Omega$ is a sample space and $\mathcal{F}$ is a $\sigma$-field containing all events. We define $P : \mathcal{F} \to [0, 1]$. $P$ is a probability if it satisfies:
(i) $0 \le P(A) \le 1$, $\forall A \in \mathcal{F}$;
(ii) $P(\emptyset) = 0$ and $P(\Omega) = 1$;
(iii) if $A_1, A_2, \ldots \in \mathcal{F}$ are disjoint, then $P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$.

Some properties of probability (a short numerical check of (4) and (5) follows the list):

(1) $A, B \in \mathcal{F}$, $A \subseteq B \Rightarrow P(A) \le P(B)$.
($\because$ $A$ and $B \setminus A$ are disjoint $\overset{\text{(iii)}}{\Longrightarrow} P(B) = P(A \cup (B \setminus A)) = P(A) + P(B \setminus A) \ge P(A)$.)

(2) $A, B \in \mathcal{F}$, $A \subseteq B \Rightarrow P(B \setminus A) = P(B) - P(A)$.

(3) $P(\bar{A}) = 1 - P(A)$ (let $B = \Omega$ in (2)).

(4) $\forall A, B \in \mathcal{F}$: $P(A) + P(B) = P(A \cup B) + P(A \cap B)$.
($\because$ $A \cup B = (A \setminus (A \cap B)) \cup (B \setminus (A \cap B)) \cup (A \cap B)$, where $A \setminus (A \cap B)$, $B \setminus (A \cap B)$ and $A \cap B$ are disjoint, and $A \supseteq A \cap B$, $B \supseteq A \cap B$, so
$P(A \cup B) = P(A \setminus (A \cap B)) + P(B \setminus (A \cap B)) + P(A \cap B) = P(A) - P(A \cap B) + P(B) - P(A \cap B) + P(A \cap B) = P(A) + P(B) - P(A \cap B)$.)
By induction, for $A_1, \ldots, A_n \in \mathcal{F}$,
$P\left(\bigcup_{k=1}^{n} A_k\right) = \sum_{k} P(A_k) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n+1} P(A_1 \cap \cdots \cap A_n)$.

(5) $A_1, \ldots, A_n \in \mathcal{F} \Rightarrow P\left(\bigcup_{i=1}^{n} A_i\right) \le \sum_{i=1}^{n} P(A_i)$.
Proof: Let $B_1 = A_1$, $B_2 = A_2 \cap \bar{A}_1 = A_2 \setminus A_1$, $\ldots$, $B_k = A_k \cap \bar{A}_{k-1} \cap \cdots \cap \bar{A}_1$. Then $B_1, B_2, \ldots, B_n$ are disjoint, $B_i \subseteq A_i$ for all $i$, and $\bigcup_{k=1}^{n} A_k = \bigcup_{k=1}^{n} B_k$, so
$P\left(\bigcup_{i=1}^{n} A_i\right) = P\left(\bigcup_{i=1}^{n} B_i\right) = \sum_{i=1}^{n} P(B_i) \le \sum_{i=1}^{n} P(A_i)$
($\because$ $B_i \subseteq A_i \Rightarrow P(B_i) \le P(A_i)$).

(6) $A_n, A \in \mathcal{F}$, $A_n \uparrow A$ as $n \to \infty$ $\Rightarrow$ $P(A_n) \to P(A)$.
Proof: Let $B_1 = A_1$, $B_2 = A_2 \setminus A_1$, $\ldots$, $B_k = A_k \setminus A_{k-1}$. Then the $B_k$ are disjoint, and since the $A_k$ are increasing,
$A_n = \bigcup_{i=1}^{n} A_i = \bigcup_{i=1}^{n} B_i$, $\quad A = \bigcup_{i=1}^{\infty} A_i = \bigcup_{i=1}^{\infty} B_i$,
so
$P(A) = P\left(\bigcup_{i=1}^{\infty} B_i\right) = \sum_{i=1}^{\infty} P(B_i) = \lim_{n\to\infty} \sum_{i=1}^{n} P(B_i) = \lim_{n\to\infty} P\left(\bigcup_{i=1}^{n} B_i\right) = \lim_{n\to\infty} P(A_n)$.

(7) $A_n, A \in \mathcal{F}$, $A_n \downarrow A$ as $n \to \infty$ $\Rightarrow$ $P(A_n) \to P(A)$.
Proof: $A_n \downarrow A \Rightarrow \bar{A}_n \uparrow \bar{A}$. From (6), $P(\bar{A}_n) \to P(\bar{A}) = 1 - P(A)$, and $P(\bar{A}_n) = 1 - P(A_n)$, hence $P(A_n) \to P(A)$.

(8) $A_n \in \mathcal{F} \Rightarrow P\left(\bigcup_{i=1}^{\infty} A_i\right) \le \sum_{i=1}^{\infty} P(A_i)$.
Proof: From (5), $P\left(\bigcup_{i=1}^{n} A_i\right) \le \sum_{i=1}^{n} P(A_i)$ for all $n$. Since $\bigcup_{i=1}^{n} A_i \uparrow \bigcup_{i=1}^{\infty} A_i$, (6) gives $P\left(\bigcup_{i=1}^{n} A_i\right) \to P\left(\bigcup_{i=1}^{\infty} A_i\right)$ as $n \to \infty$. Also $P(A_i) \ge 0$ implies $\sum_{i=1}^{n} P(A_i) \le \sum_{i=1}^{\infty} P(A_i)$ for all $n$, hence
$P\left(\bigcup_{i=1}^{\infty} A_i\right) \le \sum_{i=1}^{\infty} P(A_i)$.
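As a minimal numerical sketch (not part of the original notes), the following Python snippet checks the inclusion-exclusion formula in (4) and the subadditivity bound in (5) by brute force on a small finite sample space with equally likely outcomes; the events $A_1, A_2, A_3$ are arbitrary illustrative choices.

```python
# Brute-force check of inclusion-exclusion (4) and subadditivity (5)
# on a small uniform finite probability space (illustrative sketch only).
from itertools import combinations

omega = set(range(12))                      # sample space {0, 1, ..., 11}
P = lambda A: len(A) / len(omega)           # uniform probability measure

A = [{0, 1, 2, 3}, {2, 3, 4, 5, 6}, {5, 6, 7, 8, 9}]   # events A_1, A_2, A_3
union = set().union(*A)

# Inclusion-exclusion: sum over non-empty index subsets with alternating signs.
incl_excl = sum(
    (-1) ** (r + 1) * P(set.intersection(*subset))
    for r in range(1, len(A) + 1)
    for subset in combinations(A, r)
)
assert abs(P(union) - incl_excl) < 1e-12        # property (4), general form
assert P(union) <= sum(P(Ai) for Ai in A)       # property (5), subadditivity
print(P(union), incl_excl, sum(P(Ai) for Ai in A))
```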
$(\Omega, \mathcal{F}, P)$ is called a probability space.

Let $(\Omega, \mathcal{F}, P)$ be a probability space. If $A \in \mathcal{F}$ satisfies $P(A) = 1$, we say that the event $A$ occurs almost surely.

EX. 1 (Lebesgue measure on $[0,1]$) Let $\Omega = [0, 1]$. Define
$P[a, b] = b - a$, $\quad 0 \le a \le b \le 1$.   (*)
This probability measure is called Lebesgue measure. If $a = b$, then $[a, b] = \{a\}$ and $P\{a\} = a - a = 0$, so
$P(a, b) = P[a, b] = b - a$, $\quad 0 \le a \le b \le 1$.
There are many other subsets of $[0, 1]$ whose probability is determined by (*), e.g.
$P\left(\left[0, \tfrac{1}{3}\right] \cup \left[\tfrac{2}{3}, 1\right]\right) = P\left[0, \tfrac{1}{3}\right] + P\left[\tfrac{2}{3}, 1\right] = \tfrac{2}{3}$.
Take $\mathcal{F} = \mathcal{B}[0, 1]$, the Borel $\sigma$-field on $[0, 1]$, i.e. $\sigma\{(a, b] : a, b \in [0, 1]\}$. It is easy to show that $P$ is a probability measure.
Let $A$ be the set of all rational numbers in $[0, 1]$,
$A = \left\{0, 1, \tfrac{1}{2}, \tfrac{1}{3}, \tfrac{2}{3}, \tfrac{1}{4}, \tfrac{3}{4}, \ldots\right\} = \{r_1, r_2, \ldots\}$.
Then
$P\{A\} = P\left(\bigcup_{i=1}^{\infty} \{r_i\}\right) = \sum_{i=1}^{\infty} P\{r_i\} = 0$,
$P\{\Omega \setminus A\} = P\{\text{all irrational numbers in } [0, 1]\} = 1 - 0 = 1$.

1.2 Random variables and distributions

Let $X : \Omega \to \mathbb{R}$. If
$X^{-1}(B) \in \mathcal{F}$ for all $B \in \mathcal{B}$, where $X^{-1}(B) = \{\omega : X(\omega) \in B\}$,
then we say that $X$ is a random variable. In mathematical words, $X$ is a r.v. if $X$ is a measurable map from $\Omega$ to $\mathbb{R}$. Similarly, for $f : \mathbb{R} \to \mathbb{R}$: if $f^{-1}(B) \in \mathcal{B}$ for all $B \in \mathcal{B}$, we say that $f$ is a measurable function.
The reason we require $X$ to be measurable is that $(-\infty, x] \in \mathcal{B}$ for all $x$; if we want to define the distribution function, we need $P\{X \le x\} = P\{\omega : X(\omega) \le x\}$ to be defined for every $x \in \mathbb{R}$, i.e. $\{\omega : X(\omega) \le x\} \in \mathcal{F}$.

Simple random variable: $X$ takes only a finite number of values, say $x_1, \ldots, x_n$. In this case, $X$ is a r.v. $\iff$ $\{\omega : X(\omega) = x_i\} \in \mathcal{F}$, i.e. $X^{-1}(\{x_i\}) \in \mathcal{F}$, $i = 1, \ldots, n$.
Proof: ($\Rightarrow$) $\because \{x_i\} \in \mathcal{B}$, we have $X^{-1}(\{x_i\}) \in \mathcal{F}$ since $X$ is a r.v., $i = 1, \ldots, n$.
($\Leftarrow$) For $B \in \mathcal{B}$, if $x_i \notin B$ for $i = 1, \ldots, n$, then $X^{-1}(B) = \emptyset \in \mathcal{F}$. Note that
$X^{-1}(\{x_{i_1}\} \cup \cdots \cup \{x_{i_k}\}) = X^{-1}(\{x_{i_1}\}) \cup X^{-1}(\{x_{i_2}\}) \cup \cdots \cup X^{-1}(\{x_{i_k}\}) \in \mathcal{F}$,
so if $B$ contains some of the $x_i$, we also have $X^{-1}(B) \in \mathcal{F}$. Hence $X$ is a r.v.

Two problems (exercises):
(1) Let $A = \{r : r \text{ is rational}, 0 \le r \le 1\}$ and $B = \{x : x \in [0, 1], x \text{ is irrational}\}$. Show that $A \in \mathcal{B}$ and $B \in \mathcal{B}$.
(2) If $X$ is a r.v. and $f$ is a measurable function, show that $f(X)$ is also a r.v.

Remarks: A random variable is defined with respect to a $\sigma$-field.
(a) If $\mathcal{F}$ = all subsets of $\Omega$, then any map $X : \Omega \to \mathbb{R}$ is a r.v.
(b) If $\mathcal{F} = \{\emptyset, \Omega\}$, then $X : \Omega \to \mathbb{R}$ is a r.v. $\iff$ $X \equiv c$, a constant.
(c) If $\mathcal{F} = \sigma\{A\} = \{\emptyset, \Omega, A, \bar{A}\}$, then $X : \Omega \to \mathbb{R}$ is a r.v. $\iff$ $X(\omega) = a_1$ for $\omega \in A$ and $X(\omega) = a_2$ for $\omega \in \bar{A}$, i.e. $X$ is a two-point r.v.
(d) If $\mathcal{F} = \sigma\{A_1, \ldots, A_n\}$ with $\bigcup_{i=1}^{n} A_i = \Omega$ and the $A_i$ disjoint, then $X : \Omega \to \mathbb{R}$ is a r.v. $\iff$ $X(\omega) = a_i$ for $\omega \in A_i$, $i = 1, \ldots, n$, i.e. $X$ can take at most $n$ values.

Examples (a short computational sketch of these two examples follows):
(1) $\Omega = [0, 1]$, $\mathcal{F}_1 = \{\emptyset, \Omega, [0, \tfrac{1}{2}], (\tfrac{1}{2}, 1]\}$, and $X : \Omega \to \mathbb{R}$ with
$X(\omega) = 0$ for $0 \le \omega \le \tfrac{1}{2}$, $\quad X(\omega) = 1$ for $\tfrac{1}{2} < \omega \le 1$.
Then $X$ is a r.v. with respect to $\mathcal{F}_1$.
(2) $\Omega = [0, 1]$,
$\mathcal{F}_2 = \sigma\{[0, \tfrac{1}{4}], (\tfrac{1}{4}, \tfrac{1}{2}], (\tfrac{1}{2}, 1]\} = \{\emptyset, \Omega, [0, \tfrac{1}{4}], (\tfrac{1}{4}, \tfrac{1}{2}], (\tfrac{1}{2}, 1], [0, \tfrac{1}{2}], [0, \tfrac{1}{4}] \cup (\tfrac{1}{2}, 1], (\tfrac{1}{4}, 1]\}$,
and $Y : \Omega \to \mathbb{R}$ with
$Y(\omega) = 0$ for $0 \le \omega \le \tfrac{1}{4}$, $\quad Y(\omega) = \tfrac{1}{2}$ for $\tfrac{1}{4} < \omega \le \tfrac{1}{2}$, $\quad Y(\omega) = 1$ for $\tfrac{1}{2} < \omega \le 1$.
Then $Y$ is a r.v. with respect to $\mathcal{F}_2$ (an $\mathcal{F}_2$ r.v.), but $Y$ is NOT an $\mathcal{F}_1$ r.v.
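A small sketch (not in the original notes) of Examples (1)-(2): events are represented as unions of the three atoms of $\mathcal{F}_2$, labelled a0 $= [0, \tfrac14]$, a1 $= (\tfrac14, \tfrac12]$, a2 $= (\tfrac12, 1]$. The helper names `sigma_field` and `measurable` are illustrative, not standard library functions. The $\sigma$-field generated by a finite partition is exactly the collection of all unions of its atoms, and a simple r.v. is measurable with respect to it iff each level set $\{X = v\}$ belongs to the $\sigma$-field, as shown above.

```python
# Sigma-fields generated by finite partitions of [0,1], and measurability
# of the simple r.v.s X and Y from Examples (1)-(2) (illustrative sketch).
from itertools import chain, combinations

def sigma_field(atoms):
    """All unions of atoms; each event is a frozenset of atom labels."""
    return {frozenset(chain(*c))
            for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)}

atoms_F2 = [frozenset({"a0"}), frozenset({"a1"}), frozenset({"a2"})]
atoms_F1 = [frozenset({"a0", "a1"}), frozenset({"a2"})]   # [0,1/2], (1/2,1]
F1, F2 = sigma_field(atoms_F1), sigma_field(atoms_F2)

# X and Y from Examples (1) and (2), written as their values on each atom.
X = {"a0": 0.0, "a1": 0.0, "a2": 1.0}
Y = {"a0": 0.0, "a1": 0.5, "a2": 1.0}

def measurable(rv, field):
    """A simple r.v. is measurable iff every level set is an event of the field."""
    return all(frozenset(a for a in rv if rv[a] == v) in field
               for v in set(rv.values()))

print(measurable(X, F1), measurable(X, F2))   # True True
print(measurable(Y, F1), measurable(Y, F2))   # False True
```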
If $X$ is a r.v., then $\{X \le x\} = \{\omega : X(\omega) \le x\} \in \mathcal{F}$ for all $x \in \mathbb{R}$. Let
$F(x) = P\{X \le x\}$.
We call $F(x)$ the distribution function of $X$. If $F'(x) = f(x)$ is a continuous function, we say that $X$ is a continuous r.v., and $f(x)$ is called a density function. If $F(x)$ is a step function, we say that $X$ is a discrete r.v.; $p_i = f(x_i) = P(X = x_i)$ is called a probability function.

For $B \in \mathcal{B}$, let $P\{X \in B\} = \mu_X(B)$. $\mu_X(B)$ is called the distribution measure of $X$: for a r.v. $X$ on a probability space $(\Omega, \mathcal{F}, P)$, the distribution measure of $X$ is the probability measure $\mu_X$ that assigns to each Borel subset $B$ of $\mathbb{R}$ the mass $\mu_X(B) = P\{X \in B\}$.

EX. $\Omega = [0, 1]$, $P$ is Lebesgue measure. $X(\omega) = \omega$ and $Y(\omega) = 1 - \omega$, $\omega \in [0, 1]$, are two r.v.s. For $a, b \in [0, 1]$,
$\mu_X[a, b] = P\{\omega : a \le X(\omega) \le b\} = P\{[a, b]\} = b - a$,
$\mu_Y[a, b] = P\{\omega : a \le Y(\omega) \le b\} = P\{\omega : a \le 1 - \omega \le b\} = P[1 - b, 1 - a] = 1 - a - (1 - b) = b - a = \mu_X[a, b]$.
If we define another probability measure $\tilde{P}$ on $[0, 1]$ by
$\tilde{P}[a, b] = \int_a^b 2\omega \, d\omega = b^2 - a^2$, $\quad 0 \le a \le b \le 1$
(it is easy to check that $\tilde{P}$ satisfies $\tilde{P}[0, 1] = 1$ and the countable additivity), and denote the distribution measure of $X$ under $\tilde{P}$ by $\tilde{\mu}_X$, we have
$\tilde{\mu}_X[a, b] = \tilde{P}\{\omega : a \le X(\omega) \le b\} = \tilde{P}[a, b] = b^2 - a^2$, $\quad 0 \le a \le b \le 1$.
The distribution of $Y$ is
$\tilde{\mu}_Y[a, b] = \tilde{P}\{\omega : a \le Y(\omega) \le b\} = \tilde{P}\{\omega : a \le 1 - \omega \le b\} = \tilde{P}[1 - b, 1 - a] = (1 - a)^2 - (1 - b)^2$, $\quad 0 \le a \le b \le 1$,
which is $\ne \tilde{\mu}_X[a, b]$.

The distribution function (c.d.f.) of $X$ is
$F(x) = P\{X \le x\} = \mu_X(-\infty, x]$.
If we know $F$, then $\mu_X(x, y] = F(y) - F(x)$. Since
$[a, b] = \bigcap_{n=1}^{\infty} \left(a - \tfrac{1}{n}, b\right]$,
we get
$\mu_X[a, b] = \lim_{n\to\infty} \mu_X\left(a - \tfrac{1}{n}, b\right] = \lim_{n\to\infty} \left(F(b) - F\left(a - \tfrac{1}{n}\right)\right) = F(b) - \lim_{n\to\infty} F\left(a - \tfrac{1}{n}\right) = F(b) - F(a-)$.
If there exists $f(x) \ge 0$ such that
$\mu_X[a, b] = P\{a \le X \le b\} = \int_a^b f(x)\,dx$,
then $f(x)$ is called a density function. If there exist $x_1, \ldots, x_N$ or $x_1, x_2, \ldots$ such that with probability 1 the r.v. $X$ takes one of the values in the sequence, then
$p_i = P\{X = x_i\} \ge 0$, $\quad \mu_X(B) = \sum_{\{i : x_i \in B\}} p_i$, $\quad B \in \mathcal{B}$, $\quad \sum_i p_i = 1$.

EX. Uniform distribution: $f(x) = 1$, $0 \le x \le 1$. Standard normal distribution:
$\varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}$,
with c.d.f.
$N(x) = \int_{-\infty}^{x} \varphi(\xi)\,d\xi$.
Since $N(x)$ is increasing, $N(N^{-1}(y)) = y$ for all $y \in (0, 1)$. Let $Y$ be a uniformly distributed r.v. and let $X = N^{-1}(Y)$. Then
$\mu_X[a, b] = P\{\omega : a \le X(\omega) \le b\} = P\{\omega : a \le N^{-1}(Y) \le b\} = P\{\omega : N(a) \le Y \le N(b)\} = N(b) - N(a) = \int_a^b \varphi(x)\,dx$,
so $X \sim N(0, 1)$.
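A small simulation sketch (not in the original notes) of the construction $X = N^{-1}(Y)$: feeding Uniform$(0,1)$ draws through the standard normal quantile function produces standard normal draws. It assumes NumPy and SciPy are available, with `scipy.stats.norm.ppf` playing the role of $N^{-1}$ and `norm.cdf` the role of $N$.

```python
# Inverse-transform sampling: X = N^{-1}(Y) with Y ~ Uniform(0,1) is N(0,1).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
Y = rng.uniform(0.0, 1.0, size=100_000)   # Y ~ Uniform(0, 1)
X = norm.ppf(Y)                           # X = N^{-1}(Y)

# mu_X[a, b] should be close to N(b) - N(a) for any a < b.
a, b = -1.0, 0.5
empirical = np.mean((a <= X) & (X <= b))
print(empirical, norm.cdf(b) - norm.cdf(a))   # both close to 0.533
```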
1.3 Expectations

Riemann integral: Let $f(x)$ be a continuous function on $[a, b]$. Let $\Pi = \{x_0, x_1, \ldots, x_n\}$, where $a = x_0 < x_1 < \cdots < x_n = b$, be a partition of $[a, b]$, and let
$\|\Pi\| = \max_{1 \le k \le n} \{x_k - x_{k-1}\}$,
$M_k = \max_{x_{k-1} \le x \le x_k} f(x)$ and $m_k = \min_{x_{k-1} \le x \le x_k} f(x)$.
The upper Riemann sum is
$RS_{\Pi}^{+}(f) = \sum_{k=1}^{n} M_k (x_k - x_{k-1})$,
and the lower Riemann sum is
$RS_{\Pi}^{-}(f) = \sum_{k=1}^{n} m_k (x_k - x_{k-1})$.
If
$\lim_{\|\Pi\| \to 0} RS_{\Pi}^{+}(f) = \lim_{\|\Pi\| \to 0} RS_{\Pi}^{-}(f)$,
we call the common value $\int_a^b f(x)\,dx$. This is the Riemann integral.

(Figure: upper and lower Riemann sums of $f$ over a partition of $[a, b]$.)

Lebesgue integral: Consider
$f(x) = 1$ if $x$ is rational, $\quad f(x) = 0$ if $x$ is irrational.
For any partition $0 = x_0 < x_1 < \cdots < x_n = 1$, we have
$M_k = \max_{x_{k-1} \le x \le x_k} f(x) = 1$ and $m_k = \min_{x_{k-1} \le x \le x_k} f(x) = 0$,
so
$RS_{\Pi}^{+}(f) = \sum_{k=1}^{n} M_k (x_k - x_{k-1}) = \sum_{k=1}^{n} (x_k - x_{k-1}) = 1$ and $RS_{\Pi}^{-}(f) = \sum_{k=1}^{n} m_k (x_k - x_{k-1}) = 0$.
Since this is true for every partition $\Pi$,
$\lim_{\|\Pi\| \to 0} RS_{\Pi}^{+}(f) = 1 \ne 0 = \lim_{\|\Pi\| \to 0} RS_{\Pi}^{-}(f)$,
so $f(x)$ is not Riemann integrable.

The Lebesgue integral is defined by partitioning the $y$-axis rather than the $x$-axis. Assume $f(x) \ge 0$. Let $\Pi = \{y_0, y_1, y_2, \ldots\}$, where $0 = y_0 < y_1 < \cdots < y_n < \cdots$, and let
$A_k = \{x : y_k \le f(x) < y_{k+1}\}$.
Define the lower Lebesgue sum as
$LS_{\Pi}^{-}(f) = \sum_{k=1}^{\infty} y_k L(A_k)$,
where $L$ denotes Lebesgue measure, and define the Lebesgue integral
$\int_{[a,b]} f(x)\,dx = \lim_{\|\Pi\| \to 0} LS_{\Pi}^{-}(f)$.

(Figure: partition $y_1 < y_2 < y_3 < y_4$ of the $y$-axis with the level sets $A_1, A_2, A_3$.)

Note that we can replace the $x$-axis by $\Omega$ and the Lebesgue measure by a probability measure to define $\int_{\Omega} X(\omega)\,dP(\omega)$, where $X$ is a r.v. Assume $0 \le X(\omega) < \infty$ for all $\omega \in \Omega$, and let $\Pi = \{y_0, y_1, \ldots\}$, where $0 = y_0 < y_1 < \cdots$. Let $A_k = \{\omega \in \Omega : y_k \le X(\omega) < y_{k+1}\}$. The lower Lebesgue sum is
$LS_{\Pi}^{-}(X) = \sum_{k=1}^{\infty} y_k P(A_k)$,
where $\|\Pi\|$ is the maximal distance between the $y_k$ partition points. Define
$\int_{\Omega} X(\omega)\,dP(\omega) = \lim_{\|\Pi\| \to 0} LS_{\Pi}^{-}(X)$.
If $P(X \ge 0) = 1$ but $P(X = \infty) > 0$, we define $\int_{\Omega} X(\omega)\,dP(\omega) = \infty$.

For a r.v. $X$, let $X^{+} = \max\{X, 0\}$ and $X^{-} = \max\{-X, 0\}$, so that $X = X^{+} - X^{-}$ and $|X| = X^{+} + X^{-}$; $X^{+}$ and $X^{-}$ are nonnegative r.v.s. Define
$\int_{\Omega} X(\omega)\,dP(\omega) = \int_{\Omega} X^{+}(\omega)\,dP(\omega) - \int_{\Omega} X^{-}(\omega)\,dP(\omega)$.
If $\int_{\Omega} X^{+}\,dP$ and $\int_{\Omega} X^{-}\,dP$ are both finite, we say that $X$ is integrable. If $\int_{\Omega} X^{+}\,dP = \infty$ and $\int_{\Omega} X^{-}\,dP < \infty$, then $\int_{\Omega} X\,dP = \infty$; if $\int_{\Omega} X^{+}\,dP < \infty$ and $\int_{\Omega} X^{-}\,dP = \infty$, then $\int_{\Omega} X\,dP = -\infty$. If both $\int_{\Omega} X^{+}\,dP = \infty$ and $\int_{\Omega} X^{-}\,dP = \infty$, then $\int_{\Omega} X\,dP$ is not defined.

For any $A \in \mathcal{F}$ we define
$\int_A X(\omega)\,dP(\omega) = \int_{\Omega} I_A(\omega) X(\omega)\,dP(\omega)$,
where $I_A(\omega) = 1$ if $\omega \in A$ and $I_A(\omega) = 0$ otherwise. If $A \cap B = \emptyset$, $A, B \in \mathcal{F}$, then $I_{A \cup B} = I_A + I_B$, so
$\int_{A \cup B} X(\omega)\,dP(\omega) = \int_A X(\omega)\,dP(\omega) + \int_B X(\omega)\,dP(\omega)$.

Definition. Let $X$ be a r.v. on a probability space $(\Omega, \mathcal{F}, P)$. The expectation of $X$ is defined to be
$EX = \int_{\Omega} X(\omega)\,dP(\omega)$.
This definition makes sense if $X$ is integrable, i.e. if
$E|X| = \int_{\Omega} |X(\omega)|\,dP(\omega) < \infty$,
or if $X \ge 0$ a.s. In the latter case, $EX$ might be $\infty$.

Theorem: Let $X$ be a r.v. on a probability space $(\Omega, \mathcal{F}, P)$.
(i) If $X$ takes only finitely many values $y_0, \ldots, y_n$, then
$E(X) = \int_{\Omega} X(\omega)\,dP(\omega) = \sum_{k=0}^{n} y_k P\{X = y_k\}$.
(ii) If $X$ is a continuous random variable, then
$E(X) = \int_{\Omega} X(\omega)\,dP(\omega) = \int_{-\infty}^{\infty} y\,dF_X(y) = \int_{-\infty}^{\infty} y f_X(y)\,dy$.
(iii) (Integrability) The r.v. $X$ is integrable $\iff$ $E|X| = \int_{\Omega} |X(\omega)|\,dP(\omega) < \infty$.
Now let $Y$ be another r.v. on $(\Omega, \mathcal{F}, P)$.
(iv) (Comparison) If $X \le Y$ almost surely (i.e. $P\{X \le Y\} = 1$) and if $\int_{\Omega} X\,dP$ and $\int_{\Omega} Y\,dP$ are defined, then
$EX = \int_{\Omega} X(\omega)\,dP(\omega) \le \int_{\Omega} Y(\omega)\,dP(\omega) = EY$.
In particular, if $X = Y$ a.s. and one of the integrals is defined, then they are both defined and
$EX = \int_{\Omega} X(\omega)\,dP(\omega) = \int_{\Omega} Y(\omega)\,dP(\omega) = EY$.
(v) (Linearity) If $\alpha$ and $\beta$ are real constants and $X$ and $Y$ are integrable, or if $\alpha$ and $\beta$ are nonnegative constants and $X$ and $Y$ are nonnegative, then
$E(\alpha X + \beta Y) = \int_{\Omega} (\alpha X(\omega) + \beta Y(\omega))\,dP(\omega) = \alpha \int_{\Omega} X(\omega)\,dP(\omega) + \beta \int_{\Omega} Y(\omega)\,dP(\omega) = \alpha EX + \beta EY$.
(vi) (Jensen's inequality) If $\varphi$ is a convex, real-valued function defined on $\mathbb{R}$, and if $E|X| < \infty$, then
$\varphi(EX) \le E\varphi(X)$.
Pf. Let $\ell(x) = ax + b$ be a supporting line through $(E(X), \varphi(E(X)))$, i.e. a line lying entirely under the graph of $\varphi$ (see the figure).
(Figure: the convex function $\varphi(x)$ and the supporting line $ax + b$ touching it at $x = E(X)$.)
Then $a X(\omega) + b \le \varphi(X(\omega))$, so $a\,EX + b \le E[\varphi(X)]$; but $a\,EX + b = \varphi(E(X))$, hence $\varphi(E(X)) \le E\varphi(X)$.

Theorem (Comparison of Riemann and Lebesgue integrals) Let $f$ be a bounded function on $[a, b]$ (or $\mathbb{R}$).
(i) The Riemann integral $\int_a^b f(x)\,dx$ is defined $\iff$ the set of points $x$ in $[a, b]$ where $f(x)$ is not continuous has Lebesgue measure 0.
(ii) If the Riemann integral $\int_a^b f(x)\,dx$ is defined, then $f$ is Borel measurable, the Lebesgue integral $\int_{[a,b]} f(x)\,dx$ is also defined, and the Riemann and Lebesgue integrals agree.

Suppose $B \in \mathcal{B}$ ($B \subseteq \mathbb{R}$) and $L(B) = 0$. If a property holds for all $x \in \mathbb{R}$ except $x \in B$, we say that the property holds almost everywhere. Thus: the Riemann integral $\int_a^b f(x)\,dx$ exists $\iff$ $f(x)$ is almost everywhere continuous on $[a, b]$.
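A quick numerical illustration (not in the original notes) of Jensen's inequality (vi): for the convex function $\varphi(x) = e^x$ and $X \sim N(0, 1)$, $\varphi(EX) = e^0 = 1$ while $E[\varphi(X)] = e^{1/2} \approx 1.6487$, so the simulated averages below should show $\varphi(EX) \le E\varphi(X)$. It assumes NumPy is available.

```python
# Monte Carlo check of Jensen's inequality: phi(EX) <= E[phi(X)]
# for phi(x) = exp(x) (convex) and X ~ N(0, 1).
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal(1_000_000)
phi = np.exp

print(phi(X.mean()))   # ~ 1.0      (phi(EX))
print(phi(X).mean())   # ~ 1.6487   (E[phi(X)] = exp(1/2))
```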
1.4 Convergence of integrals

Let $X_1, X_2, X_3, \ldots$ be a sequence of r.v.s, all defined on the same probability space $(\Omega, \mathcal{F}, P)$, and let $X$ be another r.v. defined on $(\Omega, \mathcal{F}, P)$. We say that $X_1, X_2, \ldots$ converges to $X$ almost surely,
$\lim_{n\to\infty} X_n = X$ a.s.,
if there exists $A \in \mathcal{F}$ with $P(A) = 0$ such that for all $\omega \in \Omega \setminus A$, $X_1(\omega), X_2(\omega), \ldots \to X(\omega)$.

EX. Let $X_n$ be i.i.d. with $P(X_n = 1) = P(X_n = 0) = \tfrac{1}{2}$, and let $Y_n = \sum_{k=1}^{n} X_k$. The strong law of large numbers gives
$\lim_{n\to\infty} \frac{Y_n}{n} = \frac{1}{2}$ a.s.

Let $f_1, f_2, \ldots$ be a sequence of real-valued measurable functions defined on $\mathbb{R}$, and let $f$ be another real-valued measurable function defined on $\mathbb{R}$. We say that $f_1, f_2, \ldots$ converges to $f$ almost everywhere,
$\lim_{n\to\infty} f_n = f$ a.e.,
if there exists $B \in \mathcal{B}$ with $L(B) = 0$ such that for all $x \in \mathbb{R} \setminus B$, $\lim_{n\to\infty} f_n(x) = f(x)$.

EX.
$f_n(x) = \sqrt{\frac{n}{2\pi}}\, e^{-\frac{n x^2}{2}}$,
the density function of $N(0, \tfrac{1}{n})$. If $x \ne 0$, then $f_n(x) \to 0$ as $n \to \infty$, while
$\lim_{n\to\infty} f_n(0) = \lim_{n\to\infty} \sqrt{\frac{n}{2\pi}} = \infty$.
Hence
$f_n(x) \to f^{*}(x) = \begin{cases} 0 & \text{if } x \ne 0 \\ \infty & \text{if } x = 0 \end{cases}$
and also $f_n(x) \to f(x) = 0$ a.e. However,
$\int_{-\infty}^{\infty} f_n(x)\,dx = 1 \not\to \int_{-\infty}^{\infty} f(x)\,dx = 0$.
This says that convergence a.e. does not imply that we can exchange the order of limit and integration, i.e.
$1 = \lim_{n\to\infty} \int_{-\infty}^{\infty} f_n(x)\,dx \ne \int_{-\infty}^{\infty} \lim_{n\to\infty} f_n(x)\,dx = \int_{-\infty}^{\infty} f^{*}(x)\,dx = 0$
($\because$ $f^{*}(x) = 2 f^{*}(x)$ for every $x$, so $\int_{-\infty}^{\infty} f^{*}(x)\,dx = \int_{-\infty}^{\infty} 2 f^{*}(x)\,dx = 2 \int_{-\infty}^{\infty} f^{*}(x)\,dx$, which forces $\int_{-\infty}^{\infty} f^{*}(x)\,dx = 0$).

Theorem (Monotone convergence) Let $X_1, X_2, \ldots$ be a sequence of r.v.s converging a.s. to another r.v. $X$. If $0 \le X_1 \le X_2 \le \cdots$ a.s., then
$\lim_{n\to\infty} EX_n = E\left(\lim_{n\to\infty} X_n\right) = EX$.
Let $f_1, f_2, \ldots$ be a sequence of measurable functions on $\mathbb{R}$ converging a.e. to a function $f$. If $0 \le f_1 \le f_2 \le \cdots$ a.e., then
$\lim_{n\to\infty} \int_{-\infty}^{\infty} f_n(x)\,dx = \int_{-\infty}^{\infty} \lim_{n\to\infty} f_n(x)\,dx = \int_{-\infty}^{\infty} f(x)\,dx$.

Another condition which guarantees that the limit of the integrals of a sequence of functions is the integral of the limiting function is given in the following theorem.

Theorem (Dominated convergence) Let $X_1, X_2, \ldots$ be a sequence of r.v.s converging a.s. to a r.v. $X$. If there exists another r.v. $Y$ such that $EY < \infty$ and $|X_n| \le Y$ a.s. for every $n$, then
$\lim_{n\to\infty} EX_n = E\left(\lim_{n\to\infty} X_n\right) = EX$.
Let $f_1, f_2, \ldots$ be a sequence of measurable functions on $\mathbb{R}$ converging a.e. to a function $f$. If there exists another function $g$ such that $\int_{-\infty}^{\infty} g(x)\,dx < \infty$ and $|f_n| \le g$ a.e. for every $n$, then
$\lim_{n\to\infty} \int_{-\infty}^{\infty} f_n(x)\,dx = \int_{-\infty}^{\infty} \lim_{n\to\infty} f_n(x)\,dx = \int_{-\infty}^{\infty} f(x)\,dx$.

Note that if no $Y$ exists such that $|X_n| \le Y$ and $EY < \infty$, you may not be able to exchange the limit and integration.

EX. $(\Omega, \mathcal{F}, P)$, where $\Omega = [0, 1]$, $\mathcal{F}$ = Borel $\sigma$-field on $[0, 1]$, $P$ = Lebesgue measure. Let
$X_1(\omega) = 1$ for all $\omega \in [0, 1] = \Omega$,
$X_2(\omega) = 0$ for $\omega \in [0, \tfrac{1}{2}]$, $\quad X_2(\omega) = 2$ for $\omega \in (\tfrac{1}{2}, 1]$,
$\ldots$
$X_n(\omega) = 0$ for $\omega \in [0, 1 - \tfrac{1}{2^{n-1}}]$, $\quad X_n(\omega) = 2^{n-1}$ for $\omega \in (1 - \tfrac{1}{2^{n-1}}, 1]$,
$\ldots$
Then $E X_n = 1$ for all $n$, but $X_n \to 0$ a.s., so
$\lim_{n\to\infty} E X_n = 1 \ne 0 = E\left(\lim_{n\to\infty} X_n\right) = E(0)$.

1.5 Computation of expectations

Let $X$ be a r.v. and $g$ a measurable function. Then
$Eg(X) = \int g(x)\,d\mu_X(x) = \int g(x)\,dF_X(x)$
$\left(= \int g(x) f_X(x)\,dx \text{ if } X \text{ has density function } f_X(x)\right)$,
whenever $Eg(X)$ exists.
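A brief sketch (not in the original notes) of the formula in Section 1.5: for $X \sim N(0, 1)$ and $g(x) = x^2$, $Eg(X)$ can be computed either by averaging $g$ over simulated draws of $X$ or by integrating $g(x) f_X(x)$ on a grid; both should be close to the exact value $E[X^2] = 1$. NumPy is assumed, and the grid integral is a plain Riemann-sum approximation.

```python
# E[g(X)] two ways: Monte Carlo average of g(X), and a Riemann-sum
# approximation of the integral of g(x) f_X(x) dx, for g(x) = x^2, X ~ N(0,1).
import numpy as np

g = lambda x: x ** 2
f_X = lambda x: np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)   # standard normal density

rng = np.random.default_rng(2)
mc = g(rng.standard_normal(1_000_000)).mean()              # average over draws

x = np.linspace(-10.0, 10.0, 200_001)
dx = x[1] - x[0]
quad = np.sum(g(x) * f_X(x)) * dx                          # Riemann-sum integral

print(mc, quad)   # both ~ 1.0
```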
1.6 Change of measure

Theorem. Let $(\Omega, \mathcal{F}, P)$ be a probability space and let $Z \ge 0$ a.s. be a r.v. with $EZ = 1$. For every $A \in \mathcal{F}$, define
$\tilde{P}(A) = \int_A Z(\omega)\,dP(\omega)$.
Then $\tilde{P}$ is a probability measure. Furthermore, if $X$ is a r.v. and $X \ge 0$, then
$\tilde{E}[X] = E[XZ]$.
If $Z > 0$ a.s., we also have
$EY = \tilde{E}\left[\frac{Y}{Z}\right]$
for every r.v. $Y \ge 0$.

Remarks:
(1) $\tilde{E}$ is the expectation under the probability measure $\tilde{P}$ (i.e. $\tilde{E}X = \int_{\Omega} X(\omega)\,d\tilde{P}(\omega)$).
(2) Since $X = X^{+} - X^{-}$, we have $\tilde{E}X = E[X^{+} Z] - E[X^{-} Z]$ as long as the subtraction does not result in $\infty - \infty$.

Pf. For every $A \in \mathcal{F}$,
$\tilde{P}(A) = \int_A Z(\omega)\,dP(\omega) \ge 0$, $\quad \tilde{P}(\Omega) = \int_{\Omega} Z(\omega)\,dP(\omega) = EZ = 1$.
Let $A_1, A_2, \ldots$ be a sequence of disjoint sets in $\mathcal{F}$, and define $B_n = \bigcup_{k=1}^{n} A_k$, $B_\infty = \bigcup_{k=1}^{\infty} A_k$. Since $I_{B_1} \le I_{B_2} \le \cdots$ and $\lim_{n\to\infty} I_{B_n} = I_{B_\infty}$,
$\tilde{P}(B_\infty) = \int_{\Omega} I_{B_\infty}(\omega) Z(\omega)\,dP(\omega) = \lim_{n\to\infty} \int_{\Omega} I_{B_n}(\omega) Z(\omega)\,dP(\omega)$ (by the monotone convergence theorem)
$= \lim_{n\to\infty} \int_{\Omega} \sum_{k=1}^{n} I_{A_k}(\omega) Z(\omega)\,dP(\omega) = \lim_{n\to\infty} \sum_{k=1}^{n} \int_{\Omega} I_{A_k}(\omega) Z(\omega)\,dP(\omega) = \lim_{n\to\infty} \sum_{k=1}^{n} \tilde{P}(A_k) = \sum_{k=1}^{\infty} \tilde{P}(A_k)$,
so $\tilde{P}$ is a probability measure.
Suppose $X \ge 0$ is a r.v. If $X = I_A$, $A \in \mathcal{F}$, then
$\tilde{E}X = \tilde{P}(A) = \int_{\Omega} I_A(\omega) Z(\omega)\,dP = E[I_A Z] = E[XZ]$.
If $X = \sum_{i=1}^{n} c_i I_{A_i}$ with $A_i \in \mathcal{F}$ disjoint, then
$\tilde{E}X = \tilde{E}\left[\sum_{i=1}^{n} c_i I_{A_i}\right] = \sum_{i=1}^{n} c_i \tilde{E}[I_{A_i}] = \sum_{i=1}^{n} c_i E[I_{A_i} Z] = E\left[\sum_{i=1}^{n} c_i I_{A_i} Z\right] = E[XZ]$.
For general $X \ge 0$, there exist simple r.v.s $X_n = \sum_i c_i I_{A_i}$ with $0 \le X_1 \le X_2 \le \cdots \le X$ and $\lim_{n\to\infty} X_n = X$ a.s.; by the monotone convergence theorem,
$\tilde{E}X = \lim_{n\to\infty} \tilde{E}X_n = \lim_{n\to\infty} E[X_n Z] = E[XZ]$.
When $Z > 0$ a.s., $Y/Z$ is defined and we may replace $X$ by $Y/Z$ in the above to obtain $EY = \tilde{E}[Y/Z]$.

Two probability measures $P$ and $\tilde{P}$ on $(\Omega, \mathcal{F})$ are said to be equivalent if
$P(A) = 0 \iff \tilde{P}(A) = 0$, $\quad A \in \mathcal{F}$.
Note that if $Z > 0$ a.s., then $P$ and $\tilde{P}$ are equivalent: if $A \in \mathcal{F}$ with $P(A) = 0$, then $\tilde{P}(A) = \int_{\Omega} I_A(\omega) Z(\omega)\,dP(\omega) = 0$; on the other hand, if $B \in \mathcal{F}$ with $\tilde{P}(B) = 0$, then $I_B = 0$ $\tilde{P}$-a.s., so
$P(B) = E[I_B] = \tilde{E}\left[\frac{1}{Z} I_B\right] = 0$.

EX. $\Omega = [0, 1]$, $P$ is Lebesgue measure, and
$\tilde{P}[a, b] = \int_a^b 2\omega\,d\omega = b^2 - a^2 = \int_a^b 2\omega\,dP(\omega)$, $\quad 0 \le a \le b \le 1$
($\because$ $P(d\omega) = d\omega$), so $Z(\omega) = 2\omega > 0$ a.s. Hence for every r.v. $X \ge 0$ we have
$\int_0^1 X(\omega)\,d\tilde{P}(\omega) = \int_0^1 X(\omega)\,2\omega\,d\omega$.
This suggests the notation $d\tilde{P}(\omega) = 2\omega\,d\omega = 2\omega\,dP(\omega)$. In general, we may write
$\int_{\Omega} X(\omega)\,d\tilde{P}(\omega) = \int_{\Omega} X(\omega) Z(\omega)\,dP(\omega)$, $\quad \int_{\Omega} Y(\omega)\,dP(\omega) = \int_{\Omega} \frac{Y(\omega)}{Z(\omega)}\,d\tilde{P}(\omega)$,
and
$Z(\omega) = \frac{d\tilde{P}(\omega)}{dP(\omega)}$
is called the Radon-Nikodym derivative of $\tilde{P}$ with respect to $P$.

EX. Let $X$ be a r.v. such that
$\mu_X(B) = P(X \in B) = \int_B \varphi(x)\,dx$ for all $B \in \mathcal{B}$,
where
$\varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}$,
i.e. $X$ is a standard normal r.v. Let $Y = X + \theta$, where $\theta$ is a constant, so $Y \sim N(\theta, 1)$ under $P$. Define
$Z(\omega) = \exp\left\{-\theta X(\omega) - \tfrac{1}{2}\theta^2\right\}$.
Then $Z(\omega) > 0$ for all $\omega \in \Omega$, and $EZ = 1$:
$EZ = \int_{-\infty}^{\infty} e^{-\theta x - \frac{\theta^2}{2}} \cdot \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{(x+\theta)^2}{2}}\,dx = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{y^2}{2}}\,dy = 1$.
So $\frac{d\tilde{P}}{dP} = Z(\omega)$ gives an equivalent probability measure. Under $\tilde{P}$,
$\tilde{P}\{Y \le b\} = \int_{\{\omega : Y(\omega) \le b\}} Z(\omega)\,dP(\omega) = \int_{\Omega} I_{\{\omega : Y(\omega) \le b\}} Z(\omega)\,dP(\omega) = \int_{\Omega} I_{\{\omega : X(\omega) \le b - \theta\}} Z(\omega)\,dP(\omega)$
$= \int_{\Omega} I_{\{X(\omega) \le b - \theta\}} \exp\left\{-\theta X(\omega) - \tfrac{1}{2}\theta^2\right\}\,dP(\omega) = \int I_{\{x \le b - \theta\}} e^{-\theta x - \frac{1}{2}\theta^2} \varphi(x)\,dx$
$= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{b-\theta} e^{-\theta x - \frac{1}{2}\theta^2 - \frac{x^2}{2}}\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{b-\theta} e^{-\frac{1}{2}(x+\theta)^2}\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{b} e^{-\frac{1}{2}y^2}\,dy$,
i.e.
$\tilde{P}(Y \le b) = \int_{-\infty}^{b} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}y^2}\,dy$,
so $Y \sim N(0, 1)$ under $\tilde{P}$.

Theorem (Radon-Nikodym) Let $P$ and $\tilde{P}$ be equivalent probability measures defined on $(\Omega, \mathcal{F})$. Then there exists a r.v. $Z > 0$ a.s. such that $EZ = 1$ and
$\tilde{P}(A) = \int_A Z(\omega)\,dP(\omega)$ for all $A \in \mathcal{F}$.
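A simulation sketch (not part of the original notes) of the last example: with $X \sim N(0,1)$ under $P$, $Y = X + \theta$ and $Z = e^{-\theta X - \theta^2/2}$, the change-of-measure formula $\tilde{E}[g(Y)] = E[g(Y) Z]$ should reproduce standard normal quantities, consistent with $Y \sim N(0,1)$ under $\tilde{P}$. NumPy is assumed; the value of $\theta$ is an arbitrary illustrative choice.

```python
# Monte Carlo check that Y = X + theta is standard normal under P~ when
# the Radon-Nikodym derivative is Z = exp(-theta*X - theta^2/2).
import numpy as np

rng = np.random.default_rng(3)
theta = 0.7
X = rng.standard_normal(2_000_000)          # X ~ N(0,1) under P
Y = X + theta                               # Y ~ N(theta,1) under P
Z = np.exp(-theta * X - 0.5 * theta ** 2)   # dP~/dP

print(Z.mean())                              # ~ 1.0  (EZ = 1)
b = 0.5
print(np.mean((Y <= b) * Z))                 # P~(Y <= b) ~ N(0.5) ~ 0.6915
print(np.mean(Y * Z), np.mean(Y ** 2 * Z))   # mean ~ 0, second moment ~ 1 under P~
```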