Stat 505 Gauss-Markov Theorem

September 12, 2014
The Gauss-Markov Theorem
$y = X\beta + \epsilon$ with $\epsilon \sim (0, \sigma^2 V)$. Normality is not required. For any estimable $c^T\beta$, the best (minimum variance) linear unbiased estimator of $c^T\beta$ is $c^T\hat\beta$, where $\hat\beta$ solves the normal equations (NE): $(X^T V^{-1} X)\beta = X^T V^{-1} y$.
Note: smaller variance does not necessarily mean that the SE of each coefficient is smaller. With multiple coefficients to consider, minimum variance means that the difference between the variance-covariance matrix of any other linear unbiased estimator and the variance-covariance matrix of the BLUE is non-negative definite.
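
To make the estimator in the theorem concrete, here is a minimal sketch in Python/NumPy. The simulated design, the particular $\beta$, the diagonal $V$, and the Monte Carlo unbiasedness check are all assumptions made for illustration; they are not part of the notes. The sketch solves the normal equations with a generalized inverse and checks that $c^T\hat\beta$ is unbiased for an estimable $c^T\beta$.

    import numpy as np

    rng = np.random.default_rng(1)
    n, p = 50, 3

    X = rng.normal(size=(n, p))                 # simulated design (assumption)
    beta = np.array([1.0, -2.0, 0.5])           # "true" beta for the simulation
    V = np.diag(rng.uniform(0.5, 3.0, size=n))  # known V (assumption: diagonal)
    Vinv = np.linalg.inv(V)
    sigma = 1.0

    def gls_estimate(y):
        # beta_hat solves the normal equations (X^T V^-1 X) beta = X^T V^-1 y;
        # a generalized inverse (pinv) is used so a rank-deficient X also works
        A = X.T @ Vinv @ X
        return np.linalg.pinv(A) @ X.T @ Vinv @ y

    # Monte Carlo check that c^T beta_hat is unbiased for an estimable c^T beta
    c = X.T @ rng.normal(size=n)                # c in C(X^T), hence estimable
    L = np.linalg.cholesky(sigma**2 * V)        # errors then have Var = sigma^2 V
    est = [c @ gls_estimate(X @ beta + L @ rng.normal(size=n)) for _ in range(2000)]
    print(np.mean(est), c @ beta)               # the two numbers should be close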
Proof:
1. Because $c^T\beta$ is estimable, $c \in \mathcal{C}(X^T)$, and $c$ can be written as $c = X^T a$.
2. We need to show that $X^T V^{-1} X (X^T V^{-1} X)^g c = c$. Because $P_V = X(X^T V^{-1} X)^g X^T V^{-1}$ projects into $\mathcal{C}(X)$, it does not change $X$, and $P_V X = X$. Then $X^T V^{-1} X (X^T V^{-1} X)^g c = X^T V^{-1} X (X^T V^{-1} X)^g X^T a = X^T P_V^T a = (P_V X)^T a = X^T a = c$.
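
As a quick numerical check of this projection fact and of the identity in step 2, here is a minimal Python/NumPy sketch. The rank-deficient design, the diagonal $V$, and the use of numpy.linalg.pinv as one choice of generalized inverse are assumptions made for the example.

    import numpy as np

    rng = np.random.default_rng(0)

    # A rank-deficient design: the third column is the sum of the first two,
    # so (X^T V^-1 X) is singular and a generalized inverse is needed.
    X = rng.normal(size=(10, 2))
    X = np.column_stack([X, X.sum(axis=1)])

    # A positive-definite V (heteroskedastic variances, for illustration only)
    V = np.diag(rng.uniform(0.5, 2.0, size=10))
    Vinv = np.linalg.inv(V)

    XtVinvX = X.T @ Vinv @ X
    G = np.linalg.pinv(XtVinvX)             # one choice of generalized inverse

    # P_V = X (X^T V^-1 X)^g X^T V^-1 should not change X
    P_V = X @ G @ X.T @ Vinv
    print(np.allclose(P_V @ X, X))          # True

    # For an estimable c = X^T a, the identity X^T V^-1 X (X^T V^-1 X)^g c = c holds
    a = rng.normal(size=10)
    c = X.T @ a
    print(np.allclose(XtVinvX @ G @ c, c))  # True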
3. Take any other linear unbiased estimator of $c^T\beta$, calling it $d_0 + d^T y$. Unbiasedness means that for any $\beta$, $E(d_0 + d^T y) = c^T\beta$, and that includes $\beta = 0$, which means $d_0 = 0$. Also, $E(d^T y) = d^T X\beta$ for all $\beta$, so $c^T = d^T X$. Be careful with this argument: $c^T\beta = d^T X\beta$ for just a few $\beta$'s would not be enough to say $c^T = d^T X$, but if it is true for all $\beta$, then plug in each column of the identity matrix in turn, as spelled out below, and we get the equality we need.
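Written out, with $e_j$ denoting the $j$-th column of the $p \times p$ identity matrix:
\[
c^T\beta = d^T X\beta \ \text{for all } \beta
\;\Longrightarrow\;
c^T e_j = d^T X e_j \ \text{for } j = 1,\dots,p
\;\Longrightarrow\;
c_j = (d^T X)_j \ \text{for every } j,
\]
so the two row vectors $c^T$ and $d^T X$ agree entry by entry.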
4. Now how big is the variance of this generic linear unbiased estimator? We’ll use the “add zero”
trick.
\[
\mathrm{Var}(d^T y)
= \mathrm{Var}(c^T\hat\beta + d^T y - c^T\hat\beta)
= \mathrm{Var}(c^T\hat\beta) + \mathrm{Var}(d^T y - c^T\hat\beta) + 2\,\mathrm{Cov}(c^T\hat\beta,\; d^T y - c^T\hat\beta)
\]
If we can show that the covariance term is 0, then we'll be done, because $\mathrm{Var}(d^T y - c^T\hat\beta)$ is a variance-covariance matrix, and must be non-negative definite.
In general, $\mathrm{Cov}(a^T y, b^T y) = a^T \mathrm{Var}(y)\, b$.
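This identity is just the definition of covariance applied to linear combinations of $y$:
\[
\mathrm{Cov}(a^T y,\, b^T y)
= E\!\left[\,a^T (y - Ey)\,(y - Ey)^T b\,\right]
= a^T\, E\!\left[(y - Ey)(y - Ey)^T\right] b
= a^T\, \mathrm{Var}(y)\, b .
\]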
\[
\begin{aligned}
\mathrm{Cov}(c^T\hat\beta,\; d^T y - c^T\hat\beta)
&= c^T (X^T V^{-1} X)^g X^T V^{-1} [\sigma^2 V]\,[d - V^{-1} X (X^T V^{-1} X)^g c] \\
&= \sigma^2 c^T (X^T V^{-1} X)^g X^T [d - V^{-1} X (X^T V^{-1} X)^g c] \\
&= \sigma^2 c^T (X^T V^{-1} X)^g [X^T d - X^T V^{-1} X (X^T V^{-1} X)^g X^T a] \\
&= \sigma^2 c^T (X^T V^{-1} X)^g [c - X^T P_V^T a] && \text{from 3 above} \\
&= \sigma^2 c^T (X^T V^{-1} X)^g [c - c] && \text{from 2 above} \\
&= 0
\end{aligned}
\]
And we proved Gauss-Markov: $\mathrm{Var}(d^T y) \ge \mathrm{Var}(c^T\hat\beta)$ in the sense that the difference between the two matrices is a non-negative definite matrix. In fact, the only time they are equal is if $d = V^{-1} X (X^T V^{-1} X)^g c$ for some generalized inverse, and then $d^T y = c^T\hat\beta$.
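
The matrix ordering in this conclusion can also be checked numerically. The sketch below is illustrative only: the full-rank design, the diagonal $V$, and the choice of ordinary least squares as the competing linear unbiased estimator are assumptions, not part of the notes. It compares $\mathrm{Var}(\hat\beta_{GLS}) = \sigma^2 (X^T V^{-1} X)^{-1}$ with the variance-covariance matrix of the OLS estimator and confirms that the difference is non-negative definite.

    import numpy as np

    rng = np.random.default_rng(2)
    n, p = 40, 3
    sigma2 = 1.5

    X = rng.normal(size=(n, p))                  # full-rank design (assumption)
    V = np.diag(rng.uniform(0.2, 5.0, size=n))   # known heteroskedastic V (assumption)
    Vinv = np.linalg.inv(V)

    # Var(beta_hat_GLS) = sigma^2 (X^T V^-1 X)^-1
    var_gls = sigma2 * np.linalg.inv(X.T @ Vinv @ X)

    # OLS is another linear unbiased estimator: beta_tilde = (X^T X)^-1 X^T y,
    # with Var(beta_tilde) = sigma^2 (X^T X)^-1 X^T V X (X^T X)^-1
    XtX_inv = np.linalg.inv(X.T @ X)
    var_ols = sigma2 * XtX_inv @ X.T @ V @ X @ XtX_inv

    # Gauss-Markov: the difference must be non-negative definite
    diff = var_ols - var_gls
    print(np.linalg.eigvalsh(diff))              # all eigenvalues >= 0 (up to rounding)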