Math 141 - Lecture 7: Variance, Covariance, and Sums

advertisement
Math 141
Lecture 7: Variance, Covariance, and Sums
Albyn Jones1
1 Library 304
jones@reed.edu
www.people.reed.edu/∼jones/courses/141
Albyn Jones
Math 141
Last Time
Variance: expected squared deviation from the mean:
Var(X ) = E(X − µx )2 = σ 2
Standard Deviation:
SD(X ) =
p
Var(X ) = σ
Properties:
Var(a + bX ) = b2 Var(X )
SD(a + bX ) = bSD(X )
Interpretation: For a symmetric distribution, the standard
deviation may be considered a typical deviation from the
mean. For a strongly asymmetrical distribution, it is better
to work with quantiles (percentiles of the distribution).
Albyn Jones
Math 141
Covariance and Correlation
Definition: Covariance
Let X and Y be two RV’s with means µx and µy , respectively.
Covariance is the expected value of the products of deviations
from the means:
Cov(X , Y ) = E((X − µx )(Y − µy ))
Definition: Correlation
Correlation is just scale free covariance:
Cor(X , Y ) = ρxy =
X Y
Cov(X , Y )
= Cov( , )
σx σy
σx σy
Albyn Jones
Math 141
Covariance and Correlation
Extreme Cases: Covariance
If X = Y , i.e. the two RV’s are copies of each other, then
Cov(X , Y ) = Cov(X , X ) = E((X − µx )(X − µx )) = Var(X )
If X = −Y , then
Cov(X , Y ) = Cov(X , −X ) = −Var(X )
Extreme Cases: Correlation
If X = Y , i.e. the two RV’s are copies of each other, then
Cor(X , Y ) =
Var(X )
Cov(X , X )
=1
=
σx σx
σx2
If X = −Y , then
Cor(X , Y ) =
−Var(X )
Cov(X , −X )
=
= −1
2
σx
σx2
Albyn Jones
Math 141
Interpretation
Linear Association
Covariance and Correlation are measures of the degree of
linear association.
Independent RV’s have correlation 0, but correlation 0 does not
guarantee independence!
Albyn Jones
Math 141
Example
Positive Covariance: linear association with positive slope.
Cov (X , Y ) > 0
Albyn Jones
Math 141
Example
Negative Covariance: linear association with negative slope.
Cov (X , Y ) < 0
Albyn Jones
Math 141
Example
Zero Covariance: no linear association!
Cov (X , Y ) = 0
Albyn Jones
Math 141
The variance of a sum
The expected value of a sum is the sum of the expected values:
E(X + Y ) = E(X ) + E(Y ). What about variances?
Var(X + Y ) = E((X + Y ) − (µx + µy ))2
Collect the terms involving X and the terms involving Y :
E((X − µx ) + (Y − µy ))2
Recalling that (a + b)2 = a2 + 2ab + b2 , we have
E((X − µx )2 + (Y − µy )2 + 2(X − µx )(Y − µy ))
Finally, the expected value of a sum is the sum of the expected
values:
Var(X + Y ) = E(X − µx )2 + E(Y − µy )2 + 2E (X − µx )(Y − µy )
Albyn Jones
Math 141
The variance of a sum, part two
We have produced an expression for the variance of a sum:
Var(X + Y ) = E(X − µx )2 + E(Y − µy )2 + 2E (X − µx )(Y − µy )
The first two terms are Var(X ) and Var(Y ), the third is
2Cov(X , Y ). Thus
Var(X + Y ) = Var(X ) + Var(Y ) + 2Cov(X , Y )
Albyn Jones
Math 141
The variance of a sum, part three
Var(X + Y ) = Var(X ) + Var(Y ) + 2Cov(X , Y )
If Cov (X , Y ) > 0, then
Var(X + Y ) > Var(X ) + Var(Y )
If Cov (X , Y ) < 0, then
Var(X + Y ) < Var(X ) + Var(Y )
Albyn Jones
Math 141
The variance of a sum: Independence
Fact: If two RV’s are independent, we can’t predict one using
the other, so there is no linear association, and their covariance
is 0.
Theorem: If X and Y are independent, then
Var(X + Y ) = Var(X ) + Var(Y )
In other words, if the random variables are independent, then
the variance of their sum is the sum of their variances!
Remember Pythagorus? For independent RV’s, the SD of a
sum works like Euclidean distance for right triangles! If
Z =X +Y
q
σZ = σx2 + σy2
Albyn Jones
Math 141
Pythagorus
q
σX2 + σY2
σY
σX
For the mathematically inclined, covariance is an inner product
on the infinite dimensional vector space of random variables
with finite variance.
Albyn Jones
Math 141
Example: Variance of a Binomial RV
Let X be a Binomial(n,p) RV. We know E(X ) = np.
We also know that X is the sum of n independent Bernoulli(p)
trials Yi , each with E(Yi ) = p, and Var(Yi ) = pq where
q = (1 − p) so
Var(X ) = Var(Y1 + Y2 + . . . + Yn )
The Y ’s are independent, so their variances add:
Var(X ) = Var(Y1 ) + Var(Y2 ) + . . . + Var(Yn ) = npq
Hence we have shown
Var(X ) =
n
X
n k n−k
(k − np)
p q
= npq
k
2
k=0
Albyn Jones
Math 141
Example: 100 coin tosses
Toss a fair coin independently 100 times, and let X be the
number of Heads we get.
What is the appropriate probability model?
X ∼ Binomial(100, 1/2)
What is the expected number of heads?
What is σx2 , the variance of X ?
What is σx ?
What are typical outcomes of this experiment?
Albyn Jones
Math 141
Example: 1000 coin tosses
Toss a fair coin independently 1000 times, and let X be the
number of Heads we get.
X ∼ Binomial(1000, 1/2)
What is the expected number of heads?
What is σx2 , the variance of X ?
What is σx ?
What are typical outcomes of this experiment?
Albyn Jones
Math 141
Variance of a Poisson
Suppose X ∼ Poisson(µ). What are E(X ) and Var(X )? One
can compute them directly from the definition, by summing the
infinite series
X
E(X ) =
k · µk e−µ /k! = µ
and then
Var(X ) =
X
(k − µ)2 · µk e−µ /k !
Summing those series makes use of ideas from Math 112.
Can we guess the answer without using Math 112?
Albyn Jones
Math 141
Variance of a Poisson, Heuristically
Let X be a Binomial(n,p) RV, where n is large and p is small.
We know E(X ) = np, and Var(X ) = np(1 − p). If p is small,
(1 − p) is close to 1, and so Var(X ) ≈ np = E(X ).
Since that Binomial X is approximately a Poisson with
parameter µ = np, we should guess for X ∼ Poisson(µ):
E(X ) = µ
and
Var(X ) = µ
Albyn Jones
Math 141
Summary
Variance of a sum of two RV’s:
Var(X + Y ) = Var(X ) + Var(Y ) + 2Cov(X , Y )
Variance of a sum of independent RV’s:
Var(X + Y ) = Var(X ) + Var(Y )
Pythagorus and the Standard Deviation
of a sum of
q
independent RV’s: SD(X + Y ) =
σx2 + σy2
Variance of X ∼ Binomial(n, p): Var(X ) = npq = np(1 − p)
Albyn Jones
Math 141
Download