ESTIMATING THE VARIANCE

DAVAR KHOSHNEVISAN

Date: August 30, 2004.

Recall the linear model

(1)  $Y = X\beta + \varepsilon$.

The most standard assumption on the noise is that the $\varepsilon_i$'s are i.i.d. $N(0, \sigma^2)$ for a fixed unknown parameter $\sigma > 0$. The MLE for $\sigma^2$ is

(2)  $\hat\sigma^2 = \frac{1}{n} \sum_{i=1}^n \hat\varepsilon_i^2 = \frac{1}{n} \|\hat\varepsilon\|^2 = \frac{1}{n} \|Y - X\hat\beta\|^2$.

Write $X\hat\beta = P_{C(X)} Y$ to obtain

(3)  $\hat\sigma^2 = \frac{1}{n} \|Y - P_{C(X)} Y\|^2 = \frac{1}{n} \|(I_n - P_{C(X)}) Y\|^2$.

Lemma 0.1. If $S$ denotes a subspace of $\mathbb{R}^n$, then $I_n - P_S = P_{S^\perp}$, where $S^\perp$ denotes the orthogonal complement of $S$; i.e.,

(4)  $S^\perp = \{x \in \mathbb{R}^n : x \perp S\}$.

Proof. First, let us check that if $x \in \mathbb{R}^n$ then $(I_n - P_S)x$ is orthogonal to $P_S x$. Indeed,

(5)  $[(I_n - P_S)x]' P_S x = x' P_S x - x' P_S' P_S x = x' P_S x - x' P_S^2 x$,

because $P_S' = P_S$. Since $P_S^2 = P_S$, it follows that $(I_n - P_S)x$ is orthogonal to $P_S x$, as promised. The same computation, with $P_S x$ replaced by an arbitrary $y \in S$ (for which $P_S y = y$), shows that $(I_n - P_S)x$ is in fact orthogonal to all of $S$.

Next, let us prove that $I_n - P_S$ is idempotent; i.e., a projection matrix. This too is a routine check, viz.,

(6)  $(I_n - P_S)^2 = I_n - 2P_S + P_S^2 = I_n - P_S$,

as claimed. We have shown, thus far, that $I_n - P_S$ is a projection matrix, and that it projects every $x \in \mathbb{R}^n$ to a point in $S^\perp$. Thus, there exists a subspace $T$ of $\mathbb{R}^n$ such that $I_n - P_S = P_T$. It remains to verify that $T = S^\perp$; this follows from the fact that any $x \in \mathbb{R}^n$ can be written as $x = P_S x + (I_n - P_S)x$. $\square$

In summary, we have shown that

(7)  $X\hat\beta = P_{C(X)} Y$, and $\hat\sigma^2 = \frac{1}{n} \|P_{C(X)^\perp} Y\|^2$.

It will turn out that if $S \perp T$, and under the assumption that the $\varepsilon_i$'s are i.i.d. normal, then $P_S Y$ is statistically independent of $P_T Y$. Therefore, in particular, we will see soon that, in the normal-errors model,

(8)  $\hat\beta$ and $\hat\sigma^2$ are independent.
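Remark (numerical illustration; not part of the original notes). The identities in (2), (3), and Lemma 0.1 are easy to check numerically. The following is a minimal NumPy sketch under assumed illustrative choices (sample size $n = 20$, $p = 3$ predictors, a fixed random seed); the variable names P, Q, beta_hat, s2_resid, and s2_proj are ours, not notation from the text.

```python
# A sketch, not part of the original notes: verify Lemma 0.1 and the
# equality of (2) and (3) on a randomly generated design matrix.
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 3                             # illustrative sizes, our choice
X = rng.standard_normal((n, p))          # full column rank with probability 1

# Hat matrix P_{C(X)} = X (X'X)^{-1} X', and its complement Q = I_n - P.
P = X @ np.linalg.solve(X.T @ X, X.T)
Q = np.eye(n) - P

# Lemma 0.1: both P and Q are symmetric and idempotent, i.e. projections.
assert np.allclose(P, P.T) and np.allclose(P @ P, P)
assert np.allclose(Q, Q.T) and np.allclose(Q @ Q, Q)
# Q annihilates C(X), which identifies Q as P_{C(X)^perp}.
assert np.allclose(Q @ X, 0)

# sigma-hat^2 computed two ways, as in (2) and (3).
beta = np.array([1.0, -2.0, 0.5])
Y = X @ beta + rng.standard_normal(n)    # noise with sigma = 1
beta_hat = np.linalg.lstsq(X, Y, rcond=None)[0]
s2_resid = np.sum((Y - X @ beta_hat) ** 2) / n   # form (2)
s2_proj = np.sum((Q @ Y) ** 2) / n               # form (3)
assert np.allclose(s2_resid, s2_proj)
```

The check that Q annihilates $C(X)$ is the computational face of the step $T = S^\perp$ in the proof: $I_n - P_{C(X)}$ kills $C(X)$ and fixes its orthogonal complement.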
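Remark (Monte Carlo sanity check; not part of the original notes). The independence claim (8) is proved later in the notes; the sketch below merely illustrates it by simulation, reusing the illustrative design above. Near-zero sample correlations are consistent with independence but of course do not prove it.

```python
# A sketch, not part of the original notes: simulate many normal-errors
# data sets and observe that beta-hat and sigma-hat^2 are uncorrelated.
import numpy as np

rng = np.random.default_rng(1)
n, p, reps = 20, 3, 50_000               # illustrative sizes, our choice
X = rng.standard_normal((n, p))
beta = np.array([1.0, -2.0, 0.5])

Y = X @ beta + rng.standard_normal((reps, n))      # one sample per row
beta_hat = np.linalg.lstsq(X, Y.T, rcond=None)[0]  # shape (p, reps)
resid = Y - (X @ beta_hat).T
s2 = np.sum(resid ** 2, axis=1) / n                # sigma-hat^2 per replicate

# Each coordinate of beta-hat should be uncorrelated with sigma-hat^2.
for j in range(p):
    r = np.corrcoef(beta_hat[j], s2)[0, 1]
    print(f"corr(beta_hat[{j}], sigma_hat^2) = {r:+.4f}")  # all near 0
```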