EXPECTATION
1. Expectation
Definition 1.1. The expectation, or expected value, or mean value of
X with mass function f is defined by
E[X] = \sum_x x f(x),
whenever the sum is absolutely convergent.
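As a quick illustration (not from the original notes), here is a minimal Python sketch computing E[X] from the definition for a fair six-sided die; the choice of mass function f is a hypothetical example.

    # E[X] = \sum_x x f(x) for a fair six-sided die (hypothetical example).
    f = {x: 1 / 6 for x in range(1, 7)}      # mass function f(x) = 1/6 on {1,...,6}
    EX = sum(x * p for x, p in f.items())    # finite sum, so convergence is automatic
    print(EX)                                # 3.5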
Proposition 1.1. If X has mass function f and g : R → R, then
E[g(X)] = \sum_x g(x) f(x),
whenever this sum is absolutely convergent.
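To illustrate Proposition 1.1 (again a sketch, not part of the notes): for g(x) = x^2 one can compute E[g(X)] by summing g(x)f(x) directly, without first working out the mass function of g(X).

    # E[g(X)] = \sum_x g(x) f(x) with g(x) = x**2, for the same hypothetical die.
    f = {x: 1 / 6 for x in range(1, 7)}
    g = lambda x: x ** 2
    E_gX = sum(g(x) * p for x, p in f.items())
    print(E_gX)                              # 91/6 ≈ 15.17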
Theorem 1.1. We have
(1) if X ≥ 0, then E[X] ≥ 0;
(2) if a, b ∈ R, then E[aX + bY] = aE[X] + bE[Y];
(3) E[1_A] = P[A];
(4) E[1] = 1.
Proposition 1.2. If X and Y are independent, then
E[XY] = E[X]E[Y].
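A Monte Carlo sanity check of Proposition 1.2 (a sketch under assumed distributions, not a proof): for an independent die roll X and uniform Y, the empirical mean of XY should match the product of the empirical means.

    import random

    random.seed(0)
    n = 100_000
    xs = [random.randint(1, 6) for _ in range(n)]  # X: fair die
    ys = [random.random() for _ in range(n)]       # Y: uniform on [0, 1], independent of X
    mean = lambda v: sum(v) / len(v)
    print(mean([x * y for x, y in zip(xs, ys)]))   # both ≈ 3.5 * 0.5 = 1.75
    print(mean(xs) * mean(ys))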
2. Variance
Definition 2.1. If k is a positive integer, then the k-th moment of X is
E[X^k]. Moreover, the variance var(X) of X is
var(X) = E[(X − E[X])^2] = E[X^2] − E[X]^2.
Proposition 2.1. For any a ∈ R,
var(aX) = a^2 var(X).
If X and Y are independent, then
var(X + Y) = var(X) + var(Y).
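The following sketch (with two hypothetical independent die rolls) checks both parts of Proposition 2.1 empirically, computing variances via var(X) = E[X^2] − E[X]^2.

    import random

    random.seed(1)
    n = 200_000
    xs = [random.randint(1, 6) for _ in range(n)]
    ys = [random.randint(1, 6) for _ in range(n)]  # independent copy

    def var(v):
        m = sum(v) / len(v)
        return sum((t - m) ** 2 for t in v) / len(v)

    print(var([3 * x for x in xs]), 9 * var(xs))                    # var(aX) = a^2 var(X), a = 3
    print(var([x + y for x, y in zip(xs, ys)]), var(xs) + var(ys))  # ≈ 35/12 + 35/12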
3. Useful inequalities involving expectations
Proposition 3.1 (Markov's inequality). Let X be a random variable taking only non-negative values. Then for any a > 0, we have
P[X ≥ a] ≤ E[X]/a.
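Since Markov's inequality is used again in the proofs below, here is a small numerical illustration (the exponential X with mean 1 is an assumed choice, not from the notes):

    import random

    random.seed(2)
    n = 100_000
    xs = [random.expovariate(1.0) for _ in range(n)]  # non-negative, E[X] = 1
    a = 3.0
    print(sum(x >= a for x in xs) / n)   # empirical P[X ≥ a] ≈ exp(-3) ≈ 0.05
    print(sum(xs) / n / a)               # Markov bound E[X]/a ≈ 0.33, indeed larger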
Here is the famous Cauchy-Schwarz inequality:
Theorem 3.1 (Cauchy-Schwarz inequality). For any X and Y,
E[XY]^2 ≤ E[X^2] E[Y^2].
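An empirical check of Cauchy-Schwarz on simulated data (a sketch; the correlated pair below is an assumption made for the example):

    import random

    random.seed(3)
    n = 100_000
    xs = [random.gauss(0, 1) for _ in range(n)]
    ys = [x + random.gauss(0, 1) for x in xs]       # deliberately correlated with X
    mean = lambda v: sum(v) / len(v)
    lhs = mean([x * y for x, y in zip(xs, ys)]) ** 2             # E[XY]^2 ≈ 1
    rhs = mean([x * x for x in xs]) * mean([y * y for y in ys])  # E[X^2]E[Y^2] ≈ 2
    print(lhs, rhs, lhs <= rhs)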
4. Usual random variables
• Bernoulli random variable of parameter p: E[B(p)] = p and var[B(p)] = p(1 − p).
• Geometric random variable of parameter p: E[G(p)] = 1/p and var[G(p)] = (1 − p)/p^2.
• Binomial random variable of parameters n and p: E[B(n, p)] = np and var[B(n, p)] = np(1 − p).
• Poisson random variable of parameter λ: E[P(λ)] = λ and var[P(λ)] = λ.
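Two entries of this list can be checked empirically; the sketch below uses the convention that G(p) counts the number of trials up to and including the first success (which is what E[G(p)] = 1/p presumes).

    import random

    random.seed(4)
    p, n = 0.3, 200_000
    bern = [1 if random.random() < p else 0 for _ in range(n)]

    def geom(p):
        k = 1
        while random.random() >= p:  # keep trying until the first success
            k += 1
        return k

    geo = [geom(p) for _ in range(n)]
    mean = lambda v: sum(v) / len(v)
    var = lambda v: mean([t * t for t in v]) - mean(v) ** 2
    print(mean(bern), var(bern))  # ≈ p = 0.3 and p(1 − p) = 0.21
    print(mean(geo), var(geo))    # ≈ 1/p ≈ 3.33 and (1 − p)/p^2 ≈ 7.78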
5. A first law of large numbers
5.1. A part of the Borel-Cantelli lemma. For a sequence of events A_n,
we introduce
{A_n i.o.} = {A_n happens infinitely often} = ∩_n ∪_{m≥n} A_m.
Proposition 5.1. If \sum_n P(A_n) < ∞, then
P[A_n i.o.] = 0.
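The lemma can be watched in action (a simulation sketch, with independent events chosen so that \sum_n P(A_n) = \sum_n 1/n^2 < ∞): in a typical run only finitely many A_n occur, and all of them early.

    import random

    random.seed(5)
    hits = [n for n in range(1, 1_000_001) if random.random() < 1 / n ** 2]
    print(len(hits), hits[:10])  # typically a handful of hits, all with small n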
5.2. Almost sure convergence and strong Law of Large Numbers.
Definition 5.1. We say that S_n → S almost surely if {ω ∈ Ω : S_n(ω) →
S(ω)} is an event of probability 1.
We deduce the law of large numbers for i.i.d. random variables with finite
fourth moment, i.e. E[X_i^4] < ∞.
Proposition 5.2. Let (X_i)_i be a sequence of i.i.d. random variables with
E[X_i^4] < ∞. Then
(X_1 + ··· + X_n)/n → E[X_1] a.s.
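Here is what the convergence looks like in simulation (a sketch with die rolls, which certainly have finite fourth moment): the running mean drifts toward E[X_1] = 3.5.

    import random

    random.seed(6)
    s, checkpoints = 0.0, {10, 1_000, 100_000, 1_000_000}
    for n in range(1, 1_000_001):
        s += random.randint(1, 6)
        if n in checkpoints:
            print(n, s / n)   # approaches 3.5 as n grows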
Remark 5.1. It can be seen by Cauchy-Schwarz that E[X]^4 ≤ E[X^4] (apply the
theorem twice with Y = 1: E[X]^2 ≤ E[X^2] and E[X^2]^2 ≤ E[X^4]). It is true in
general that E[|X|] ≤ E[|X|^p]^{1/p} for p ≥ 1.
Proofs
Proposition 5.3. If \sum_n P(A_n) < ∞, then
P[A_n i.o.] = 0.
Proof. We have {A_k i.o.} ⊆ ∪_{m≥n} A_m for any n, hence for any n
P[A_k i.o.] ≤ P[∪_{m≥n} A_m] ≤ \sum_{m≥n} P[A_m].
Since \sum_m P(A_m) < ∞, the right-hand side is the tail of a convergent series,
so by letting n go to infinity it goes to 0, and P[A_k i.o.] = 0.
Proposition 5.4. Let (X_i)_i be a sequence of i.i.d. random variables with
E[X_i^4] < ∞. Then
(X_1 + ··· + X_n)/n → E[X_1] a.s.
Proof. By subtracting E[X_1] from each X_i, we see that Y_i := X_i − E[X_1]
satisfies E[Y_i] = 0 and E[Y_i^4] < ∞. Moreover, our proposition is equivalent to
(Y_1 + ··· + Y_n)/n → 0 a.s.
Now fix ε > 0 and introduce
A_n^ε = {|Y_1 + ··· + Y_n|/n > ε}.
We have, by Markov's inequality (Proposition 3.1) applied to (Y_1 + ··· + Y_n)^4,
(5.1)    P[A_n^ε] ≤ E[(Y_1 + ··· + Y_n)^4]/(εn)^4.
Now, expanding the fourth power (the integer factors are the multinomial coefficients),
E[(Y_1 + ··· + Y_n)^4] = \sum_i E[Y_i^4] + 4 \sum_{i≠j} E[Y_i Y_j^3] + 3 \sum_{i≠j} E[Y_i^2 Y_j^2]
+ 6 \sum_{i,j,k distinct} E[Y_i Y_j Y_k^2] + \sum_{i,j,k,l distinct} E[Y_i Y_j Y_k Y_l].
Since the Y_i's are i.i.d. with mean 0, every term containing a factor Y_i of
multiplicity one vanishes (e.g. \sum_{i≠j} E[Y_i Y_j^3] = \sum_{i≠j} E[Y_i] E[Y_j^3] = 0
by Proposition 1.2), so we see that
E[(Y_1 + ··· + Y_n)^4] = \sum_i E[Y_i^4] + 3 \sum_{i≠j} E[Y_i^2] E[Y_j^2].
Now, recalling that E[Y_i^4] < ∞ implies E[Y_i^2] < ∞ (by Cauchy-Schwarz), we can
see that there exists a constant C < ∞ such that
E[(Y_1 + ··· + Y_n)^4] ≤ C n^2.
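For a concrete check of this bound (a sketch with Rademacher Y_i = ±1, an assumed choice for which E[Y_i^4] = E[Y_i^2] = 1, so E[(Y_1 + ··· + Y_n)^4] = 3n^2 − 2n exactly):

    import random

    random.seed(7)
    trials = 10_000
    for n in (10, 50, 200):
        est = sum(sum(random.choice((-1, 1)) for _ in range(n)) ** 4
                  for _ in range(trials)) / trials
        print(n, est, 3 * n * n - 2 * n)  # Monte Carlo estimate vs exact value ≤ 3n^2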
Combined with (5.1), this means that
P[A_n^ε] ≤ C/(ε^4 n^2),
which is summable. Using Proposition 5.3, this implies that for any ε > 0
P[A_n^ε i.o.] = 0.
Considering the special case where ε = 1/k for k ≥ 1, we see
P[A_n^{1/k} i.o.] = 0,
such that taking a countable union
P[∪_{k≥1} {A_n^{1/k} i.o.}] = 0,
and by taking the complement (and recalling the definition of {A_n^{1/k} i.o.}),
we see that
P[∩_{k≥1} ∪_{n≥0} ∩_{m≥n} {|Y_1 + ··· + Y_m|/m ≤ 1/k}] = 1.
Hence the set ∩_{k≥1} ∪_{n≥0} ∩_{m≥n} {|Y_1 + ··· + Y_m|/m ≤ 1/k} has probability 1.
Now let us see what this horrible thing means: if ω belongs to this set, we have
∀k ≥ 1, ∃n ≥ 0, ∀m ≥ n, |Y_1(ω) + ··· + Y_m(ω)|/m ≤ 1/k,
which exactly means that for such ω
(Y_1(ω) + ··· + Y_n(ω))/n → 0.
As we argued at the beginning of the proof, this is enough to prove our
proposition.