Week 8

Independence of random variables
• Definition
Random variables X and Y are independent if their joint distribution
function factors into the product of their marginal distribution functions:

F_{X,Y}(x, y) = F_X(x) F_Y(y)   for all x, y.
• Theorem
Suppose X and Y are jointly continuous random variables. Then X and Y are
independent if and only if the product of a density for X and a density for Y
is a joint density for the pair (X, Y), i.e.

f_{X,Y}(x, y) = f_X(x) f_Y(y)

Proof:
• If X and Y are independent random variables and Z = g(X), W = h(Y), then
Z, W are also independent.
Example
• Suppose X and Y are discrete random variables whose values are the
non-negative integers and their joint probability function is

p_{X,Y}(x, y) = \frac{\lambda^{x+y}}{x! \, y!} e^{-2\lambda},   x, y = 0, 1, 2, ...
Are X and Y independent? What are their marginal distributions?
• Factorization is enough for independence, but we need to be careful with the
constant terms so that the factors are proper marginal probability functions;
the sketch below illustrates this.
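A minimal numerical check of this factorization, assuming the pmf as reconstructed above (λ is arbitrary; splitting the constant e^{-2λ} as e^{-λ} · e^{-λ} makes each factor a Poisson(λ) pmf):

```python
import math

lam = 2.0  # arbitrary rate for the assumed pmf lambda^(x+y) e^(-2*lambda) / (x! y!)

def joint(x, y):
    return lam ** (x + y) * math.exp(-2 * lam) / (math.factorial(x) * math.factorial(y))

def poisson(k):
    # candidate marginal obtained by splitting the constant: e^(-2*lam) = e^(-lam) * e^(-lam)
    return lam ** k * math.exp(-lam) / math.factorial(k)

# p_{X,Y}(x, y) = p_X(x) p_Y(y) on a grid => X, Y independent Poisson(lam)
assert all(abs(joint(x, y) - poisson(x) * poisson(y)) < 1e-12
           for x in range(20) for y in range(20))
```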
Example and Important Comment
• The joint density of X, Y is given by

f_{X,Y}(x, y) = 4(x + y)^2   for x, y ≥ 0, x + y ≤ 1, and 0 otherwise
• Are X, Y independent?
• Independence requires that the set of points where the joint density is
positive be the Cartesian product of the set of points where the marginal
densities are positive, i.e. the set of points where f_{X,Y}(x, y) > 0 must be
a (possibly infinite) rectangle. Here the support is a triangle, so X, Y
cannot be independent; see the check below.
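A quick numerical illustration, assuming the density as reconstructed above (the closed-form marginal f_X(x) = (4/3)(1 − x³) follows by integrating out y; f_Y is the same by symmetry):

```python
# The support {x, y >= 0, x + y <= 1} is a triangle, not a rectangle, so the
# joint density cannot equal the product of the marginals everywhere.
def joint(x, y):
    return 4 * (x + y) ** 2 if x >= 0 and y >= 0 and x + y <= 1 else 0.0

def marginal(t):
    # f_X(x) = integral_0^(1-x) 4*(x+y)^2 dy = (4/3) * (1 - x^3), 0 <= x <= 1
    return (4 / 3) * (1 - t ** 3) if 0 <= t <= 1 else 0.0

x = y = 0.6
print(joint(x, y))                # 0.0   -- (0.6, 0.6) lies outside the support
print(marginal(x) * marginal(y))  # ~1.09 -- yet both marginals are positive there
```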
Conditional densities
• If X, Y are jointly continuous random variables, the conditional density
function of Y | X is defined to be

f_{Y|X}(y | x) = \frac{f_{X,Y}(x, y)}{f_X(x)}

if f_X(x) > 0, and 0 otherwise.
• If X, Y are independent then f_{Y|X}(y | x) = f_Y(y).
• Also,

f_{X,Y}(x, y) = f_{Y|X}(y | x) f_X(x)

Integrating both sides over x we get

f_Y(y) = \int_{-\infty}^{\infty} f_{Y|X}(y | x) f_X(x) \, dx
• This is a useful application of the law of total probability for the continuous
case.
Example
• Consider the joint density

f_{X,Y}(x, y) = 2 e^{-(x+y)}   for 0 < x < y, and 0 otherwise
• Find the conditional density of X given Y and the conditional density of Y
given X.
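A worked sketch, assuming the density as reconstructed above (note it integrates to 1 over the region 0 < x < y):

f_X(x) = \int_x^{\infty} 2 e^{-(x+y)} \, dy = 2 e^{-2x},   x > 0

f_Y(y) = \int_0^y 2 e^{-(x+y)} \, dx = 2 e^{-y} (1 - e^{-y}),   y > 0

f_{X|Y}(x | y) = \frac{2 e^{-(x+y)}}{2 e^{-y} (1 - e^{-y})} = \frac{e^{-x}}{1 - e^{-y}},   0 < x < y

f_{Y|X}(y | x) = \frac{2 e^{-(x+y)}}{2 e^{-2x}} = e^{-(y-x)},   y > x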
Properties of Expectations Involving Joint Distributions
• For random variables X, Y and constants a, b ∈ R
E(aX + bY) = aE(X) + bE(Y)
Proof:
• For independent random variables X, Y
E(XY) = E(X)E(Y)
whenever these expectations exist.
Proof:
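A quick Monte Carlo sanity check of both properties (a sketch; the Exponential(1) and Uniform(0, 1) distributions and the constants a, b are arbitrary choices of independent X, Y):

```python
import random

random.seed(0)
n = 10 ** 6
xs = [random.expovariate(1.0) for _ in range(n)]  # X ~ Exponential(1)
ys = [random.random() for _ in range(n)]          # Y ~ Uniform(0, 1), independent of X

mean = lambda v: sum(v) / len(v)
a, b = 2.0, -3.0

# linearity holds for any X, Y: E(aX + bY) = a E(X) + b E(Y)
print(mean([a * x + b * y for x, y in zip(xs, ys)]), a * mean(xs) + b * mean(ys))
# the product rule needs independence: E(XY) = E(X) E(Y)
print(mean([x * y for x, y in zip(xs, ys)]), mean(xs) * mean(ys))
```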
Covariance
• Recall: Var(X+Y) = Var(X) + Var(Y) + 2E[(X − E(X))(Y − E(Y))]
• Definition
For random variables X, Y with E(X), E(Y) < ∞, the covariance of X and Y is

Cov(X, Y) = E[(X − E(X))(Y − E(Y))]
• Covariance measures whether X − E(X) and Y − E(Y) tend to have the same sign.
• Claim:

Cov(X, Y) = E(XY) − E(X)E(Y)
Proof:
• Note: If X, Y are independent then E(XY) = E(X)E(Y), and so Cov(X,Y) = 0.
Example
• Suppose X, Y are discrete random variables with probability function given
by

              y
     x      -1      0      1    p_X(x)
    -1     1/8    1/8    1/8
     0     1/8     0     1/8
     1     1/8    1/8    1/8
  p_Y(y)
• Find Cov(X,Y). Are X,Y independent?
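A short exact-arithmetic check of the answer:

```python
from fractions import Fraction as F

# joint pmf from the table above
p = {(-1, -1): F(1, 8), (-1, 0): F(1, 8), (-1, 1): F(1, 8),
     (0, -1):  F(1, 8), (0, 0):  F(0),    (0, 1):  F(1, 8),
     (1, -1):  F(1, 8), (1, 0):  F(1, 8), (1, 1):  F(1, 8)}

px = {x: sum(p[x, y] for y in (-1, 0, 1)) for x in (-1, 0, 1)}
py = {y: sum(p[x, y] for x in (-1, 0, 1)) for y in (-1, 0, 1)}

exy = sum(x * y * q for (x, y), q in p.items())
ex = sum(x * q for x, q in px.items())
ey = sum(y * q for y, q in py.items())
print("Cov(X, Y) =", exy - ex * ey)  # 0
# not independent: p(0, 0) = 0 but p_X(0) * p_Y(0) = (1/4) * (1/4) = 1/16
print("independent?", all(p[x, y] == px[x] * py[y] for x in px for y in py))
```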
Important Facts
• Independence of X, Y implies Cov(X,Y) = 0 but NOT vice versa.
• If X, Y independent then Var(X+Y) = Var(X) + Var(Y).
• In general (whether or not X, Y are independent),
Var(X+Y) = Var(X) + Var(Y) + 2Cov(X,Y).
• Cov(X,X) = Var(X).
Example
• Suppose Y ~ Binomial(n, p). Find Var(Y).
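One standard route, using the facts above (a sketch): write Y = X_1 + ... + X_n as a sum of n independent Bernoulli(p) indicators. Then

V(X_i) = E(X_i^2) − [E(X_i)]^2 = p − p^2 = p(1 − p)

and, by independence,

Var(Y) = \sum_{i=1}^{n} V(X_i) = n p (1 − p).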
Properties of Covariance
For random variables X, Y, Z and constants a, b, c, d ∈ R
• Cov(aX+b, cY+d) = ac Cov(X,Y)
• Cov(X+Y, Z) = Cov(X,Z) + Cov(Y,Z)
• Cov(X,Y) = Cov(Y, X)
Correlation
• Definition
For X, Y random variables the correlation of X and Y is

ρ(X, Y) = \frac{Cov(X, Y)}{\sqrt{V(X) V(Y)}}

whenever V(X), V(Y) ≠ 0 and all these quantities exist.
• Claim:
ρ(aX + b, cY + d) = ρ(X, Y) whenever ac > 0 (for ac < 0 the sign flips).
Proof:
• This claim means that correlation is invariant under changes of location
and scale.
Theorem
• For X, Y random variables, whenever the correlation ρ(X,Y) exists it must
satisfy
-1 ≤ ρ(X,Y) ≤ 1
Proof:
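One standard argument: standardize both variables and use the non-negativity of variance. Let U = (X − E(X))/\sqrt{V(X)} and W = (Y − E(Y))/\sqrt{V(Y)}, so that V(U) = V(W) = 1 and Cov(U, W) = ρ(X, Y). Then

0 ≤ V(U ± W) = V(U) + V(W) ± 2 Cov(U, W) = 2(1 ± ρ(X, Y)),

which gives −1 ≤ ρ(X, Y) ≤ 1.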
Interpretation of Correlation ρ
• ρ(X,Y) is a measure of the strength and direction of the linear relationship
between X, Y.
• If X, Y have non-zero variance, then ρ(X, Y) ∈ [−1, 1].
• If X, Y are independent, then ρ(X,Y) = 0. Note that this is not the only
case in which ρ(X,Y) = 0 !!!
• Y is a linearly increasing function of X if and only if ρ(X,Y) = 1.
• Y is a linearly decreasing function of X if and only if ρ(X,Y) = -1.
Example
• Find Var(X − Y) and ρ(X,Y) if X, Y have the following joint density

f_{X,Y}(x, y) = 3x   for 0 ≤ y ≤ x ≤ 1, and 0 otherwise
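A numerical sketch of the computation, assuming the density as reconstructed above; the moments are obtained with scipy.integrate.dblquad over the triangle 0 ≤ y ≤ x ≤ 1:

```python
from scipy import integrate

# E[g(X, Y)] = double integral of g(x, y) * 3x over 0 <= y <= x <= 1
# (dblquad integrates func(y, x): x runs over [0, 1], y over [0, x])
def moment(g):
    val, _ = integrate.dblquad(lambda y, x: g(x, y) * 3 * x, 0, 1, 0, lambda x: x)
    return val

ex, ey = moment(lambda x, y: x), moment(lambda x, y: y)
vx = moment(lambda x, y: x * x) - ex ** 2
vy = moment(lambda x, y: y * y) - ey ** 2
cov = moment(lambda x, y: x * y) - ex * ey

print("Var(X - Y) =", vx + vy - 2 * cov)       # 19/320 = 0.059375
print("rho(X, Y)  =", cov / (vx * vy) ** 0.5)  # 3/sqrt(57) ~ 0.3974
```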
Conditional Expectation
• For X, Y discrete random variables, the conditional expectation of Y given
X = x is

E(Y | X = x) = \sum_y y \, p_{Y|X}(y | x)

and the conditional variance of Y given X = x is

V(Y | X = x) = \sum_y (y − E(Y | X = x))^2 \, p_{Y|X}(y | x) = E(Y^2 | X = x) − [E(Y | X = x)]^2

where these are defined only if the sums converge absolutely.
• In general,

E(h(Y) | X = x) = \sum_y h(y) \, p_{Y|X}(y | x)
• For X, Y continuous random variables, the conditional expectation of
Y given X = x is

E(Y | X = x) = \int_{-\infty}^{\infty} y \, f_{Y|X}(y | x) \, dy

and the conditional variance of Y given X = x is

V(Y | X = x) = \int_{-\infty}^{\infty} (y − E(Y | X = x))^2 \, f_{Y|X}(y | x) \, dy = E(Y^2 | X = x) − [E(Y | X = x)]^2

• In general,

E(h(Y) | X = x) = \int_{-\infty}^{\infty} h(y) \, f_{Y|X}(y | x) \, dy
Example
• Suppose X, Y are continuous random variables with joint density function

f_{X,Y}(x, y) = e^{-y}   for y > 0, 0 < x < 1, and 0 otherwise
• Find E(X | Y = 2).
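A worked sketch, assuming the density as reconstructed above: the density factors as 1 · e^{-y} on the rectangle (0, 1) × (0, ∞), so X and Y are independent with X ~ Uniform(0, 1) and Y ~ Exponential(1). Hence

E(X | Y = 2) = E(X) = 1/2.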
More about Conditional Expectation
• Assume that E(Y | X = x) exists for every x in the range of X. Then
E(Y | X) is a random variable. The expectation of this random variable is
E[E(Y | X)].
• Theorem
E[E(Y | X)] = E(Y)
This is called the “Law of Total Expectation”.
Proof:
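A sketch of the proof in the continuous case, using the law of total probability for densities from earlier:

E[E(Y | X)] = \int_{-\infty}^{\infty} E(Y | X = x) f_X(x) \, dx
            = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y \, f_{Y|X}(y | x) f_X(x) \, dy \, dx
            = \int_{-\infty}^{\infty} y \left( \int_{-\infty}^{\infty} f_{Y|X}(y | x) f_X(x) \, dx \right) dy
            = \int_{-\infty}^{\infty} y \, f_Y(y) \, dy = E(Y).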
Example
• Suppose we roll a fair die; whatever number comes up we toss a coin that
many times. What is the expected number of heads?
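A worked sketch: let N be the number rolled and H the number of heads. Given N = n, H ~ Binomial(n, ½), so E(H | N) = N/2 and, by the Law of Total Expectation,

E(H) = E[E(H | N)] = E(N/2) = \frac{1}{2} \cdot \frac{1 + 2 + \cdots + 6}{6} = \frac{1}{2} \cdot 3.5 = 1.75.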
Theorem
• For random variables X, Y
V(Y) = V[E(Y | X)] + E[V(Y | X)]
Proof:
Example
• Let X ~ Geometric(p).
Given X = x, let Y conditionally have the Binomial(x, p) distribution.
• Scenario: perform Bernoulli trials with success probability p until the 1st
success, so X is the number of trials. Then perform that many trials again
and count the number of successes, which is Y.
• Find E(Y) and V(Y).
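A worked sketch, assuming the trials-until-first-success convention E(X) = 1/p, V(X) = (1 − p)/p². Conditionally E(Y | X) = Xp and V(Y | X) = Xp(1 − p), so

E(Y) = E[E(Y | X)] = p E(X) = p · (1/p) = 1

and, by the variance decomposition above,

V(Y) = V(E(Y | X)) + E(V(Y | X)) = p^2 V(X) + p(1 − p) E(X) = (1 − p) + (1 − p) = 2(1 − p).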
Law of Large Numbers
• Toss a coin n times.
• Suppose
X_i = 1 if the i-th toss came up H, and X_i = 0 if the i-th toss came up T.
• The Xi’s are Bernoulli random variables with p = ½ and E(Xi) = ½.
• The proportion of heads is \bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i.
• Intuitively, \bar{X}_n approaches ½ as n → ∞.
Markov’s Inequality
• If X is a non-negative random variable with E(X) < ∞ and a > 0, then

P(X ≥ a) ≤ \frac{E(X)}{a}
Proof:
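One standard argument: since X ≥ 0 and X ≥ a on the event {X ≥ a},

E(X) ≥ E(X \mathbf{1}\{X ≥ a\}) ≥ a \, E(\mathbf{1}\{X ≥ a\}) = a \, P(X ≥ a).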
Chebyshev’s Inequality
• For a random variable X with E(X) < ∞ and V(X) < ∞, for any a > 0

P(|X − E(X)| ≥ a) ≤ \frac{V(X)}{a^2}
• Proof:
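The standard derivation applies Markov’s Inequality to the non-negative random variable (X − E(X))²:

P(|X − E(X)| ≥ a) = P((X − E(X))^2 ≥ a^2) ≤ \frac{E[(X − E(X))^2]}{a^2} = \frac{V(X)}{a^2}.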
Back to the Law of Large Numbers
• We are interested in a sequence of random variables X1, X2, X3, … that are
independent and identically distributed (i.i.d.).
Let

\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i

Suppose E(Xi) = μ, V(Xi) = σ²; then

E(\bar{X}_n) = E\!\left( \frac{1}{n} \sum_{i=1}^{n} X_i \right) = \frac{1}{n} \sum_{i=1}^{n} E(X_i) = \mu

and, using independence,

V(\bar{X}_n) = V\!\left( \frac{1}{n} \sum_{i=1}^{n} X_i \right) = \frac{1}{n^2} \sum_{i=1}^{n} V(X_i) = \frac{\sigma^2}{n}

• Intuitively, as n → ∞, V(\bar{X}_n) → 0, so \bar{X}_n → E(\bar{X}_n) = μ.
• Formally, the Weak Law of Large Numbers (WLLN) states the following:
• Suppose X1, X2, X3, … are i.i.d. with E(Xi) = μ < ∞ and V(Xi) = σ² < ∞;
then for any positive number a

P(|\bar{X}_n − μ| ≥ a) → 0   as n → ∞.

This is called convergence in probability.
Proof:
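The usual one-line proof applies Chebyshev’s Inequality to \bar{X}_n, using V(\bar{X}_n) = σ²/n computed above:

P(|\bar{X}_n − μ| ≥ a) ≤ \frac{V(\bar{X}_n)}{a^2} = \frac{\sigma^2}{n a^2} → 0   as n → ∞.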
Example
• Flip a coin 10,000 times. Let

X_i = 1 if the i-th toss came up H, and X_i = 0 if it came up T.

• E(Xi) = ½ and V(Xi) = ¼.
• Take a = 0.01; then by Chebyshev’s Inequality

P(|\bar{X}_n − ½| ≥ 0.01) ≤ \frac{1/4}{10{,}000 \cdot (0.01)^2} = \frac{1}{4}
• Chebyshev’s Inequality gives a very weak upper bound here; a simulation
(below) suggests the true probability is far smaller.
• Chebyshev’s Inequality works regardless of the distribution of the Xi’s.
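A Monte Carlo sketch comparing the bound to the empirical frequency (the run count is arbitrary; the empirical value should land near P(|Z| ≥ 2) ≈ 0.046 by the normal approximation):

```python
import random

random.seed(1)
n, runs, a = 10_000, 1_000, 0.01

# estimate P(|X_bar_n - 1/2| >= 0.01) for n = 10,000 fair-coin tosses
hits = 0
for _ in range(runs):
    heads = sum(random.random() < 0.5 for _ in range(n))
    if abs(heads / n - 0.5) >= a:
        hits += 1

print("empirical frequency:", hits / runs)         # roughly 0.046
print("Chebyshev bound    :", 0.25 / (n * a**2))   # 0.25
```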
Strong Law of Large Numbers
• Suppose X1, X2, X3, … are i.i.d. with E(Xi) = μ < ∞; then \bar{X}_n
converges to μ as n → ∞ with probability 1. That is,

P\!\left( \lim_{n \to \infty} \frac{1}{n} (X_1 + X_2 + \cdots + X_n) = \mu \right) = 1
• This is called convergence almost surely.