Chapter 3. Measure Theory

3.2. Let Ω = N = {1, 2, . . .} denote the natural numbers. For each n = 1, 2, . . ., define F_n to be the σ-algebra generated by {1}, . . . , {n}. For example, F_1 = {∅, N, {1}, N \ {1}}, F_2 = {∅, N, {1}, {2}, {1, 2}, N \ {1}, N \ {2}, N \ {1, 2}}, and so on. Evidently, F_n ⊂ F_{n+1}. However, {1, 3, 5, . . .} ∉ ∪_{i=1}^∞ F_i: every set in F_n either contains {n + 1, n + 2, . . .} or is disjoint from it, and the set of odd numbers does neither. Since every singleton {2k − 1} belongs to ∪_{i=1}^∞ F_i, this shows that the increasing union of σ-algebras need not be a σ-algebra.

3.3. For this exercise, we need the countable axiom of choice: it implies that a countable union of countable sets is itself countable. Let F denote the σ-algebra generated by all singletons in R. Obviously, {x} ∈ F for all x ∈ R. It remains to show that [a, b] ∉ F whenever a < b. Define

G = {G ⊆ R : either G or G^c is denumerable}.

[Recall that "denumerable" means "at most countable."] You can check directly that G is a σ-algebra; the countable axiom of choice is what ensures closure under countable unions. We claim that F = G. This would prove that [a, b] ∉ F, because neither [a, b] nor its complement is denumerable.

Let A denote the algebra generated by all singletons. If E ∈ A, then either E is a finite union of singletons, or E^c is. This proves that A ⊆ G. The monotone class theorem proves that σ(A) ⊆ G, and σ(A) = F. It remains to prove that G ⊆ F. But this is manifest: if G ∈ G, then either G or G^c is a denumerable union of singletons, so G ∈ σ(A) = F.

3.15. No, A need not be an algebra. [Here A denotes the collection of sets E of positive integers for which the density (Dµ)(E) := lim_{n→∞} µ(E ∩ {1, . . . , n})/n exists.] For instance, let µ be counting measure on the positive integers. Then E ∈ A and (Dµ)(E) = 1/2, where E denotes the set of all even integers. Now let F be the collection of all positive integers m such that: (i) if 2^k < m ≤ 2^{k+1} for some even integer k ≥ 0, then m is even; (ii) otherwise, m is odd. Then F ∈ A and (Dµ)(F) = 1/2, but E ∩ F ∉ A: the ratio µ(E ∩ F ∩ {1, . . . , n})/n approaches 1/3 along n = 2^{k+1} with k even, and 1/6 along n = 2^{k+1} with k odd, so the limit defining (Dµ)(E ∩ F) does not exist.

Next, let µ again be counting measure on the positive integers and E the set of all even integers. Evidently, (Dµ)(E) = 1/2. However, (Dµ)({x}) = 0 for every x, so that (Dµ)(E) ≠ Σ_{x even} (Dµ)({x}). Therefore, Dµ is not countably additive on A.

Chapter 4. Integration

4.1. There are two things to prove here: (i) σ(X) is a σ-algebra; and (ii) it is the smallest σ-algebra with respect to which X is measurable.
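The counterexample of 3.15 can be spot-checked numerically. Below is a minimal sketch; the helper names `in_F` and `density` are mine, and I take the density to be (Dµ)(E) = lim_n |E ∩ {1, . . . , n}|/n, which I assume is the definition in force.

```python
def in_F(m):
    """Membership in the set F of 3.15: for 2**k < m <= 2**(k+1),
    F keeps the even m when k is even, and the odd m when k is odd."""
    if m == 1:
        return True                   # m = 1 falls under rule (ii): odd
    k = (m - 1).bit_length() - 1      # the unique k with 2**k < m <= 2**(k+1)
    return m % 2 == (0 if k % 2 == 0 else 1)

def density(pred, n):
    """Empirical density |{m <= n : pred(m)}| / n."""
    return sum(1 for m in range(1, n + 1) if pred(m)) / n

d_E = density(lambda m: m % 2 == 0, 2 ** 13)              # evens: density 1/2
d_F = density(in_F, 2 ** 13)                              # F: density 1/2
# E ∩ F has no density: the ratio oscillates between dyadic scales
r1 = density(lambda m: m % 2 == 0 and in_F(m), 2 ** 13)   # close to 1/3
r2 = density(lambda m: m % 2 == 0 and in_F(m), 2 ** 14)   # close to 1/6
print(d_E, d_F, r1, r2)
```

The oscillation between roughly 1/3 and 1/6 is exactly why E ∩ F ∉ A even though E, F ∈ A.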
As regards (i), we check that ∅ ∈ σ(X) because X^{-1}(∅) = ∅. Also, if A ∈ σ(X), then A = X^{-1}(B) for some B ∈ A. But then A^c = (X^{-1}(B))^c = X^{-1}(B^c), which is in σ(X). Finally, suppose A_1, A_2, . . . are all in σ(X). Then we can find B_1, B_2, . . . ∈ A such that A_i = X^{-1}(B_i). Evidently, ∪_{i=1}^∞ X^{-1}(B_i) = X^{-1}(∪_{i=1}^∞ B_i). Because ∪_{i=1}^∞ B_i ∈ A (the latter is, after all, a σ-algebra), it follows that ∪_{i=1}^∞ A_i = ∪_{i=1}^∞ X^{-1}(B_i) ∈ σ(X). We have proved that σ(X) is a σ-algebra.

Note that X is measurable with respect to a σ-algebra G iff X^{-1}(B) ∈ G for all B ∈ A. Therefore, a priori, X is measurable with respect to σ(X), and any other such G must contain σ(X).

4.4. If A_1, A_2, . . . are disjoint and measurable, then so are f^{-1}(A_1), f^{-1}(A_2), . . ., and f^{-1}(∪_{n=1}^∞ A_n) = ∪_{n=1}^∞ f^{-1}(A_n). The rest is easy sailing.

4.6. [We need µ and ν to be finite measures in this exercise. One can also extend it to Radon measures; i.e., those that are finite on compact sets.] First, let us prove this for k = 1: For every closed interval [a, b], we can find continuous functions f_n ↓ 1_{[a,b]}. Since ∫ f_n dµ = ∫ f_n dν for each n, the monotone convergence theorem yields µ([a, b]) = ν([a, b]). Therefore, µ and ν agree on the algebra generated by closed intervals. Because this algebra generates the Borel σ-algebra B(R), Carathéodory's extension theorem proves that µ = ν. To carry this program out when k > 1, we approximate the function

1_{[a_1,b_1]×···×[a_k,b_k]}(x_1, . . . , x_k) = 1_{[a_1,b_1]}(x_1) × · · · × 1_{[a_k,b_k]}(x_k)

by continuous functions of the form f_1^n(x_1) × · · · × f_k^n(x_k), and proceed as in the k = 1 case, using the fact that hyper-rectangles of the form [a_1, b_1] × · · · × [a_k, b_k] generate B(R^k).

4.7. Let X be a random variable that takes the values x_1, . . . , x_k with respective probabilities p_1, . . . , p_k. Then E[X] = Σ_{i=1}^k x_i p_i, whereas ln Π_{i=1}^k x_i^{p_i} = Σ_{i=1}^k p_i ln x_i = E[ln X].
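The closure properties checked in 4.1 can be illustrated on a finite toy example. This is a sketch under assumptions of my own choosing: Ω = {1, . . . , 6}, X(ω) = ω mod 2, and A the full power set of {0, 1}; the helper name `powerset` is hypothetical.

```python
from itertools import chain, combinations

Omega = frozenset(range(1, 7))
X = {w: w % 2 for w in Omega}          # X : Omega -> {0, 1}

def powerset(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

# sigma(X) = all preimages X^{-1}(B) for B in the power set of {0, 1}
sigma_X = {frozenset(w for w in Omega if X[w] in B) for B in powerset({0, 1})}

# The sigma-algebra axioms, exactly as verified in 4.1:
assert frozenset() in sigma_X and Omega in sigma_X
assert all(Omega - A in sigma_X for A in sigma_X)                # complements
assert all(A | B in sigma_X for A in sigma_X for B in sigma_X)   # unions
print(sorted(tuple(sorted(A)) for A in sigma_X))
```

Here σ(X) comes out to {∅, evens, odds, Ω}: exactly the events whose occurrence is decided by the value of X.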
Thus, we are asked to prove that exp(E[ln X]) ≤ E[X]. This follows from the convexity of the function ψ(x) = −ln(x) and Jensen's inequality: E[−ln X] ≥ −ln E[X], i.e., E[ln X] ≤ ln E[X]; now exponentiate.

For the case of equality, suppose we have shown that whenever v_1 p_1 + · · · + v_k p_k = v_1^{p_1} · · · v_k^{p_k}, then v_1 = · · · = v_k. We will next prove that the same holds for k + 1 variables. Indeed, suppose

x_1 p_1 + · · · + x_{k+1} p_{k+1} = x_1^{p_1} · · · x_{k+1}^{p_{k+1}}.    (4.1)

There is nothing to prove unless one of the x_i's is non-zero. In that case, we can relabel and assume without loss of generality that x_{k+1} ≠ 0. Because p_{k+1} = 1 − (p_1 + · · · + p_k), the preceding display is equivalent to

(x_1 − x_{k+1})p_1 + · · · + (x_k − x_{k+1})p_k + x_{k+1} = (x_1/x_{k+1})^{p_1} · · · (x_k/x_{k+1})^{p_k} · x_{k+1}.

Divide by x_{k+1} to obtain

1 + p_1(y_1 − 1) + · · · + p_k(y_k − 1) = y_1^{p_1} · · · y_k^{p_k},

where y_i = x_i/x_{k+1}. Because Σ_{i=1}^{k+1} p_i = 1, the left-hand side is equal to Σ_{i=1}^{k+1} y_i p_i, where y_{k+1} := 1. Thus, we have proved that if (4.1) with k + 1 variables has a non-trivial solution, then (4.1) has a solution with k variables. So it suffices to prove that if v_1 p_1 + v_2 p_2 = v_1^{p_1} v_2^{p_2}, then v_1 = v_2.

Suppose, without loss of generality, that v_2 ≥ v_1, so we can write v_2 = v_1 + r where r ≥ 0. We need to prove that r = 0. Our eq. (4.1) with 2 variables is then written as v_1 p_1 + (v_1 + r)p_2 = v_1^{p_1}(v_1 + r)^{p_2}. If v_1 = 0, then the only solution is r = 0. Otherwise, since p_1 + p_2 = 1, the left-hand side equals v_1 + r p_2 and the right-hand side equals v_1(1 + r/v_1)^{p_2}; dividing by v_1, the equation becomes 1 + (r/v_1)p_2 = (1 + r/v_1)^{p_2}. Proving that r = 0 is the same as proving that w := r/v_1 = 0. But w satisfies 1 + w p_2 = (1 + w)^{p_2}. To prove that w = 0, it suffices to check that the function h(z) = (1 + z)^{p_2} − 1 − z p_2 has a unique maximum at z = 0, and the value of the maximum is zero. But h′(z) = p_2(1 + z)^{p_2 − 1} − p_2 is zero iff z = 0, and h″(z) = p_2(p_2 − 1)(1 + z)^{p_2 − 2} < 0 because 0 < p_2 < 1. Because h(0) = 0, we are done.

4.9. I will do this in the case that f″ is a continuous function. [A modification of this proof works when f″ is merely Riemann integrable.] Let z = λx + (1 − λ)y.
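The inequality exp(E[ln X]) ≤ E[X] of 4.7, and its equality case, can be spot-checked numerically. A minimal sketch; the particular weights and values below are arbitrary choices of mine.

```python
import math
import random

random.seed(1)
k = 6
p = [random.random() for _ in range(k)]
total = sum(p)
p = [pi / total for pi in p]                  # probabilities p_1, ..., p_k
x = [random.uniform(0.5, 5.0) for _ in range(k)]

arith = sum(pi * xi for pi, xi in zip(p, x))                     # E[X]
geom = math.exp(sum(pi * math.log(xi) for pi, xi in zip(p, x)))  # exp(E[ln X])
assert geom <= arith                          # Jensen: geometric <= arithmetic

# Equality holds exactly when all the x_i coincide, as proved above.
c = [2.5] * k
arith_c = sum(pi * xi for pi, xi in zip(p, c))
geom_c = math.exp(sum(pi * math.log(xi) for pi, xi in zip(p, c)))
print(arith, geom, abs(arith_c - geom_c))
```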
Then, by Taylor's theorem with Lagrange remainder,

f(x) = f(z) + (x − z)f′(z) + (1/2)(x − z)² f″(ζ)
     = f(λx + (1 − λ)y) + (1 − λ)(x − y)f′(λx + (1 − λ)y) + (1/2)(1 − λ)²(x − y)² f″(ζ),

where ζ is between λx + (1 − λ)y and x; here we used x − z = (1 − λ)(x − y). Similarly,

f(y) = f(z) + (y − z)f′(z) + (1/2)(y − z)² f″(ξ)
     = f(λx + (1 − λ)y) − λ(x − y)f′(λx + (1 − λ)y) + (1/2)λ²(x − y)² f″(ξ),

where ξ is between λx + (1 − λ)y and y; here y − z = −λ(x − y). Therefore, in particular,

λf(x) + (1 − λ)f(y) = f(λx + (1 − λ)y) + (λ(1 − λ)/2)(x − y)²[(1 − λ)f″(ζ) + λf″(ξ)].

The fact that f″ ≥ 0 finishes the proof. The remainder is direct computation.

4.11. First suppose 2f((a + b)/2) ≤ f(a) + f(b) for all a, b, and that f is continuous. We aim to prove that f is convex. Let λ = 1/4 and note that λa + (1 − λ)b = (b + c)/2, where c := (a + b)/2. Therefore,

f((1/4)a + (3/4)b) ≤ (1/2)f(b) + (1/2)f(c) ≤ (1/2)f(b) + (1/2) · (f(a) + f(b))/2 = (1/4)f(a) + (3/4)f(b).

Because we can exchange the roles of a and b as well, and iterate this bisection, this proves that for all dyadic rational λ ∈ [0, 1] and all a, b,

f(λa + (1 − λ)b) ≤ λf(a) + (1 − λ)f(b).

By continuity, this must hold for all λ ∈ [0, 1].

It remains to prove that if f is convex then it is continuous. But the proof of Jensen's inequality shows that for all compact sets K ⊂ R,

L(K) := sup_{a,b∈K, a<b} |(f(b) − f(a))/(b − a)| < ∞.

Thus, sup_{a,b∈K: |a−b|≤ε} |f(b) − f(a)| ≤ εL(K) for all ε > 0. This proves continuity.
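The bisection argument of 4.11 can be mirrored in code: the sketch below bounds f at a dyadic point using only the midpoint inequality, and the resulting bound works out to exactly the chord value λf(a) + (1 − λ)f(b). The function name `chord_bound` and the choice f = exp are mine.

```python
import math

def chord_bound(f, a, b, k, n):
    """Upper bound for f(lam*a + (1-lam)*b), lam = k / 2**n, obtained by
    applying the midpoint inequality recursively, as in 4.11."""
    if n == 0:                       # base case: lam = 1 or lam = 0
        return f(a) if k == 1 else f(b)
    if k % 2 == 0:                   # lam already has denominator 2**(n-1)
        return chord_bound(f, a, b, k // 2, n - 1)
    # lam is the midpoint of (k-1)/2**n and (k+1)/2**n
    left = chord_bound(f, a, b, (k - 1) // 2, n - 1)
    right = chord_bound(f, a, b, (k + 1) // 2, n - 1)
    return (left + right) / 2        # midpoint inequality

f, a, b = math.exp, 0.0, 2.0
k, n = 5, 4                          # lam = 5/16
lam = k / 2 ** n
bound = chord_bound(f, a, b, k, n)
print(f(lam * a + (1 - lam) * b), bound, lam * f(a) + (1 - lam) * f(b))
```

By induction, the recursion returns exactly the linear interpolation between f(a) and f(b), which is the content of the displayed dyadic inequality.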