EXPECTATION

1. Expectation

Definition 1.1. The expectation (or expected value, or mean value) of $X$ with mass function $f$ is defined by
\[
E[X] = \sum_x x f(x),
\]
whenever the sum is absolutely convergent.

Proposition 1.1. If $X$ has mass function $f$ and $g : \mathbb{R} \to \mathbb{R}$, then
\[
E[g(X)] = \sum_x g(x) f(x),
\]
whenever this sum is absolutely convergent.

Theorem 1.1. We have
(1) if $X \geq 0$, then $E[X] \geq 0$,
(2) if $a, b \in \mathbb{R}$ and $Y$ is another random variable, then $E[aX + bY] = aE[X] + bE[Y]$,
(3) $E[\mathbf{1}_A] = P[A]$,
(4) $E[1] = 1$.

Proposition 1.2. If $X$ and $Y$ are independent, then $E[XY] = E[X]E[Y]$.

2. Variance

Definition 2.1. If $k$ is a positive integer, then the $k$-th moment of $X$ is $E[X^k]$. Moreover, the variance $\mathrm{var}(X)$ of $X$ is
\[
\mathrm{var}(X) = E[(X - E[X])^2] = E[X^2] - E[X]^2.
\]

Proposition 2.1. For any $a \in \mathbb{R}$, $\mathrm{var}(aX) = a^2 \mathrm{var}(X)$. If $X$ and $Y$ are independent, then
\[
\mathrm{var}(X + Y) = \mathrm{var}(X) + \mathrm{var}(Y).
\]

3. Useful inequalities involving expectations

Proposition 3.1 (Markov's inequality). Let $X$ be a random variable taking only non-negative values. Then for any $a > 0$ we have
\[
P[X \geq a] \leq \frac{1}{a} E[X].
\]

Here is the famous Cauchy–Schwarz inequality.

Theorem 3.1. For any $X$ and $Y$,
\[
E[XY]^2 \leq E[X^2] E[Y^2].
\]

4. Usual random variables

• Bernoulli random variable of parameter $p$: $E[B(p)] = p$ and $\mathrm{var}[B(p)] = p(1-p)$.
• Geometric random variable of parameter $p$: $E[G(p)] = 1/p$ and $\mathrm{var}[G(p)] = (1-p)/p^2$.
• Binomial random variable of parameters $n$ and $p$: $E[B(n,p)] = np$ and $\mathrm{var}[B(n,p)] = np(1-p)$.
• Poisson random variable of parameter $\lambda$: $E[P(\lambda)] = \lambda$ and $\mathrm{var}[P(\lambda)] = \lambda$.

5. A first law of large numbers

5.1. A part of the Borel–Cantelli lemma. For a sequence of events $A_n$, we introduce
\[
\{A_n \text{ i.o.}\} = \{A_n \text{ happens infinitely often}\} = \bigcap_n \bigcup_{m \geq n} A_m.
\]

Proposition 5.1. If $\sum_n P(A_n) < \infty$, then $P[A_n \text{ i.o.}] = 0$.

5.2. Almost sure convergence and strong Law of Large Numbers.

Definition 5.1. We say that $S_n \to S$ almost surely if $\{\omega \in \Omega : S_n(\omega) \to S(\omega)\}$ is an event of probability 1.

We deduce the law of large numbers for i.i.d. random variables with finite fourth moment, i.e. $E[X_i^4] < \infty$.

Proposition 5.2. Let $(X_i)_i$ be a sequence of i.i.d. random variables with $E[X_i^4] < \infty$. Then
\[
\frac{X_1 + \cdots + X_n}{n} \to E[X] \quad \text{a.s.},
\]
where $E[X]$ denotes the common mean of the $X_i$.

Remark 5.1. It can be seen by Cauchy–Schwarz that $E[X]^4 \leq E[X^4]$ (take $Y = 1$ in the theorem). It is true in general that $E[|X|] \leq E[|X|^p]^{1/p}$ for $p \geq 1$.
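To spell out the first claim of Remark 5.1, here is one way it can be obtained (a short derivation added for illustration, using only Theorem 3.1 applied twice, first to the pair $(X, 1)$ and then to $(X^2, 1)$):
\[
E[X]^2 = E[X \cdot 1]^2 \leq E[X^2]\,E[1^2] = E[X^2],
\qquad
E[X^2]^2 = E[X^2 \cdot 1]^2 \leq E[X^4]\,E[1^2] = E[X^4],
\]
so that
\[
E[X]^4 = \big(E[X]^2\big)^2 \leq E[X^2]^2 \leq E[X^4].
\]
As a quick sanity check on a random variable from Section 4, for $X$ a Bernoulli of parameter $p$ one has $E[X]^4 = p^4 \leq p = E[X^4]$, since $0 \leq p \leq 1$.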
Proofs

Proposition 5.3 (= Proposition 5.1). If $\sum_n P(A_n) < \infty$, then $P[A_n \text{ i.o.}] = 0$.

Proof. We have $\{A_k \text{ i.o.}\} \subseteq \bigcup_{m \geq n} A_m$ for any $n$, hence for any $n$,
\[
P[A_k \text{ i.o.}] \leq P\Big[\bigcup_{m \geq n} A_m\Big] \leq \sum_{m \geq n} P[A_m].
\]
By letting $n$ go to infinity, the right-hand side goes to 0 (it is the tail of a convergent series), so $P[A_k \text{ i.o.}] = 0$. □

Proposition 5.4 (= Proposition 5.2). Let $(X_i)_i$ be a sequence of i.i.d. random variables with $E[X_i^4] < \infty$. Then
\[
\frac{X_1 + \cdots + X_n}{n} \to E[X] \quad \text{a.s.}
\]

Proof. By subtracting $E[X]$ from each $X_i$, we see that $Y_i := X_i - E[X]$ satisfies $E[Y_i] = 0$ and $E[Y_i^4] < \infty$. Moreover, our proposition is equivalent to
\[
\frac{Y_1 + \cdots + Y_n}{n} \to 0 \quad \text{a.s.}
\]
Now fix $\varepsilon > 0$ and introduce
\[
A_n^\varepsilon = \Big\{ \Big| \frac{Y_1 + \cdots + Y_n}{n} \Big| > \varepsilon \Big\}.
\]
We have, by Markov's inequality (Proposition 3.1) applied to the non-negative random variable $(Y_1 + \cdots + Y_n)^4$,
\[
P[A_n^\varepsilon] \leq \frac{1}{(\varepsilon n)^4} E[(Y_1 + \cdots + Y_n)^4]. \tag{5.1}
\]
Now, expanding the fourth power and grouping the terms,
\[
E[(Y_1 + \cdots + Y_n)^4] = \sum_i E[Y_i^4] + 4 \sum_{i \neq j} E[Y_i Y_j^3] + 3 \sum_{i \neq j} E[Y_i^2 Y_j^2] + 6 \sum_{\substack{i,j,k \\ \text{distinct}}} E[Y_i Y_j Y_k^2] + \sum_{\substack{i,j,k,l \\ \text{distinct}}} E[Y_i Y_j Y_k Y_l].
\]
Since the $Y_i$'s are i.i.d. with mean 0, every term containing some $Y_i$ to the first power has zero expectation (e.g. $\sum_{i \neq j} E[Y_i Y_j^3] = \sum_{i \neq j} E[Y_i] E[Y_j^3] = 0$ by Proposition 1.2), so that
\[
E[(Y_1 + \cdots + Y_n)^4] = \sum_i E[Y_i^4] + 3 \sum_{i \neq j} E[Y_i^2] E[Y_j^2].
\]
Now, recalling that $E[Y_i^4] < \infty$ implies $E[Y_i^2] < \infty$ (by Remark 5.1), we see that there exists a constant $C < \infty$ such that
\[
E[(Y_1 + \cdots + Y_n)^4] \leq C n^2,
\]
which, using (5.1), means that
\[
P[A_n^\varepsilon] \leq \frac{C}{\varepsilon^4 n^2},
\]
which is summable in $n$.

Using Proposition 5.3, this implies that for any $\varepsilon > 0$,
\[
P[A_n^\varepsilon \text{ i.o.}] = 0.
\]
Considering the special case $\varepsilon = 1/k$ for $k \in \mathbb{N}$, we see that $P[A_n^{1/k} \text{ i.o.}] = 0$, so that, taking a countable union,
\[
P\Big[\bigcup_{k \geq 1} \{A_n^{1/k} \text{ i.o.}\}\Big] = 0,
\]
and by taking the complement (and recalling the definition of $\{A_n \text{ i.o.}\}$), we see that
\[
P\Big[\bigcap_{k \geq 1} \bigcup_{n \geq 0} \bigcap_{m \geq n} \Big\{ \Big| \frac{Y_1 + \cdots + Y_m}{m} \Big| \leq \frac{1}{k} \Big\}\Big] = 1.
\]
Hence the set $G := \bigcap_{k \geq 1} \bigcup_{n \geq 0} \bigcap_{m \geq n} \big\{ |Y_1 + \cdots + Y_m|/m \leq 1/k \big\}$ has probability 1. Now let us see what this horrible thing means: if $\omega \in G$, we have
\[
\forall k \geq 1, \ \exists n \geq 0, \ \forall m \geq n, \quad \Big| \frac{Y_1(\omega) + \cdots + Y_m(\omega)}{m} \Big| \leq \frac{1}{k},
\]
which exactly means that for such $\omega$,
\[
\frac{Y_1(\omega) + \cdots + Y_n(\omega)}{n} \to 0.
\]
As we argued at the beginning of the proof, this is enough to prove our proposition. □
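To see the hypothesis $E[X_i^4] < \infty$ in action, here is the simplest special case, spelled out only as an illustration of Proposition 5.4 combined with Section 4. If the $X_i$ are i.i.d. Bernoulli random variables of parameter $p$ (a sequence of coin flips), then $X_i \in \{0, 1\}$, so $X_i^4 = X_i$ and
\[
E[X_i^4] = E[X_i] = p < \infty.
\]
The proposition therefore gives
\[
\frac{X_1 + \cdots + X_n}{n} \to p \quad \text{a.s.},
\]
i.e. the empirical frequency of heads converges almost surely to $p$.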