Constraints on the Parameter Vector

Copyright 2012 Dan Nettleton (Iowa State University), Statistics 611

Suppose y = Xβ + ε, where E(ε) = 0 and β ∈ ℝᵖ satisfies H′β = h for some known p×q matrix H of rank q and some known q×1 vector h.

Given that we know β satisfies H′β = h, what functions c′β are estimable, and how do we estimate them?

A linear estimator d + a′y is unbiased for c′β in the restricted model iff

    E(d + a′y) = c′β   ∀ β such that H′β = h.

c′β is estimable in the restricted model iff there exists a linear estimator d + a′y such that E(d + a′y) = c′β for all β satisfying H′β = h, i.e., iff there exists a linear estimator that is unbiased for c′β in the restricted model.

Result 3.7: In the restricted model, d + a′y is unbiased for c′β iff ∃ l such that

    c = X′a + Hl   and   d = l′h.

Proof of Result 3.7:

(⇐) Suppose ∃ l such that c = X′a + Hl and d = l′h. Then, for all β such that H′β = h,

    E(d + a′y) = l′h + a′Xβ
               = l′H′β + a′Xβ
               = (l′H′ + a′X)β
               = (X′a + Hl)′β
               = c′β.

(⇒) First note that

    {β : H′β = h} = {(H′)⁻h + (I − (H′)⁻H′)z : z ∈ ℝᵖ}
                  = {b∗ + Wz : z ∈ ℝᵖ},

where b∗ = (H′)⁻h is one particular solution to H′β = h and C(W) = N(H′), by Results A.12 and A.15, respectively.

Now suppose E(d + a′y) = d + a′Xβ = c′β for all β such that H′β = h.
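As an aside, the parametrization of the feasible set {b∗ + Wz : z ∈ ℝᵖ} can be checked numerically. The sketch below is a hypothetical small example (p = 4, one linear constraint chosen for illustration) that uses the Moore–Penrose pseudoinverse as the generalized inverse (H′)⁻:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: p = 4, one constraint (sum of the last three
# coefficients equals 3), so H is 4 x 1 and h is 1 x 1.
H = np.array([[0.0], [1.0], [1.0], [1.0]])
h = np.array([3.0])
p = H.shape[0]

# b_star = (H')^- h, taking the Moore-Penrose pseudoinverse as the
# generalized inverse; W = I - (H')^- H' has column space N(H').
Hp_ginv = np.linalg.pinv(H.T)           # a generalized inverse of H'
b_star = Hp_ginv @ h                    # one particular solution
W = np.eye(p) - Hp_ginv @ H.T

# Every beta = b_star + W z satisfies the constraint H' beta = h.
for _ in range(5):
    z = rng.standard_normal(p)
    beta = b_star + W @ z
    assert np.allclose(H.T @ beta, h)

# And rank(W) = p - rank(H'), matching dim N(H').
assert np.linalg.matrix_rank(W) == p - np.linalg.matrix_rank(H)
print("feasible-set parametrization verified")
```

Any other generalized inverse of H′ would give a different b∗ and W but the same feasible set.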
This is equivalent to

    d + a′X(b∗ + Wz) = c′(b∗ + Wz)   ∀ z ∈ ℝᵖ
    ⇔ d + a′Xb∗ − c′b∗ + (a′X − c′)Wz = 0   ∀ z ∈ ℝᵖ
    ⇔ d + a′Xb∗ − c′b∗ = 0   and   W′(X′a − c) = 0,

the last step by Result A.8.

Now W′(X′a − c) = 0 implies that

    X′a − c ∈ N(W′) = C(W)⊥ = N(H′)⊥ = C(H).

    ∴ ∃ m such that Hm = X′a − c
    ⇒ ∃ m such that c = X′a − Hm
    ⇒ ∃ l such that c = X′a + Hl   (take l = −m).

Now d + a′Xb∗ − c′b∗ = 0 implies

    d = c′b∗ − a′Xb∗
      = (X′a + Hl)′b∗ − a′Xb∗
      = l′H′b∗ + a′Xb∗ − a′Xb∗
      = l′H′b∗
      = l′h.  □

Recall that in the unrestricted case, c′β is estimable iff c ∈ C(X′). Result 3.7 says that c′β is estimable in the restricted case iff c ∈ C([X′, H]).

Thus, c′β estimable under the unrestricted model ⇒ c′β estimable under the restricted model. However, the converse doesn't hold: if C(X′) is a proper subset of C([X′, H]), there exist functions c′β that are estimable in the restricted case but nonestimable in the unrestricted case.

Example: Consider the one-way ANOVA model

    E(y_ij) = µ + τ_i,   i = 1, …, t,  j = 1, …, n_i.

Show that c′β is estimable ∀ c ∈ ℝᵖ under the restriction τ₁ + ⋯ + τ_t = 0.

Take

    H = [0, 1, …, 1]′ (p×1),   h = [0] (1×1),   β = (µ, τ₁, …, τ_t)′.

Then H′β = h is equivalent to τ₁ + ⋯ + τ_t = 0. We have

    C([X′, H]) = C( [ 1′   0 ] )
                  ( [ I_t  1 ] )  = ℝᵖ,   where p = t + 1.

Thus, c ∈ C([X′, H]) ∀ c ∈ ℝᵖ.

How do we know that [1′ 0; I_t 1] has ℝᵖ as its column space? Note that

    rank([1′; I_t]) = t = p − 1,   rank([0; 1]) = 1,

and

    C([1′; I_t]) ∩ C([0; 1]) = {0}.

Because the two column spaces intersect only at 0, the ranks add:

    rank([1′ 0; I_t 1]) = rank([1′; I_t]) + rank([0; 1]) = p.
Alternatively, suppose

    [ 1′   0 ] z = 0.
    [ I_t  1 ]

Then

    z₁ + ⋯ + z_{p−1} = 0   and   z_i + z_p = 0 for i = 1, …, p − 1
    ⇒ z₁ = ⋯ = z_{p−1} = −z_p   and   z₁ + ⋯ + z_{p−1} = 0
    ⇒ z₁ = ⋯ = z_p = 0.

∴ [1′ 0; I_t 1] is of full column rank.

Now suppose we consider the constraints τ₁ = τ₂ = ⋯ = τ_t. What functions c′β are estimable in this case?

τ₁ = τ₂ = ⋯ = τ_t is equivalent to H′β = h, where

    H = [ 0′      ]
        [ 1′      ]   (p × (t−1)),   h = 0 ((t−1)×1),   β = (µ, τ₁, …, τ_t)′.
        [ −I_{t−1} ]

∵ C(H) ⊆ C(X′) (each column of H is a difference of two columns of X′), we have C(X′) = C([X′, H]). ∴ The same functions are estimable with or without these restrictions.

The Restricted Normal Equations (RNE) are

    [ X′X  H ] [ b ]   [ X′y ]
    [ H′   0 ] [ λ ] = [ h   ].

Result 3.8: The RNE are consistent.

Proof: First show

    [X′y; h] ∈ C([X′ 0; 0 H′]).

The constraint equations are consistent and thus have a solution, say b∗, such that H′b∗ = h. Thus,

    [ X′  0  ] [ y  ]   [ X′y  ]   [ X′y ]
    [ 0   H′ ] [ b∗ ] = [ H′b∗ ] = [ h   ],

    ∴ [X′y; h] ∈ C([X′ 0; 0 H′]).

Now suppose that we can show

    N([X′X H; H′ 0]) ⊆ N([X 0; 0 H]).

Why does this imply the RNE are consistent?

N(A) ⊆ N(B) ⇒ N(B)⊥ ⊆ N(A)⊥ by Result A.6. Now N(B)⊥ ⊆ N(A)⊥ ⇒ C(B′) ⊆ C(A′) by Result A.5. Thus,

    N([X′X H; H′ 0]) ⊆ N([X 0; 0 H])
    ⇒ C([X′ 0; 0 H′]) ⊆ C([X′X H; H′ 0]′) = C([X′X H; H′ 0])   (the coefficient matrix is symmetric)
    ⇒ [X′y; h] ∈ C([X′X H; H′ 0])
    ⇒ the RNE are consistent.
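Result 3.8 can be illustrated numerically on the one-way ANOVA example with the sum-to-zero restriction. The sketch below (with assumed values t = 3 and n_i = 2) assembles the RNE, solves them, and confirms that a solution exists and satisfies the constraint:

```python
import numpy as np

rng = np.random.default_rng(1)

# One-way ANOVA design: t = 3 groups, n_i = 2 observations per group,
# beta = (mu, tau_1, tau_2, tau_3)', with constraint sum(tau_i) = 0.
t, n_i = 3, 2
X = np.hstack([np.ones((t * n_i, 1)), np.kron(np.eye(t), np.ones((n_i, 1)))])
p = X.shape[1]                           # p = t + 1 = 4
H = np.vstack([[0.0], np.ones((t, 1))])  # H' beta = tau_1 + tau_2 + tau_3
h = np.array([0.0])

y = rng.standard_normal(t * n_i)

# Assemble the RNE coefficient matrix and right-hand side.
A = np.block([[X.T @ X, H], [H.T, np.zeros((1, 1))]])
rhs = np.concatenate([X.T @ y, h])

# Solve; Result 3.8 guarantees a solution exists.
sol, *_ = np.linalg.lstsq(A, rhs, rcond=None)
beta_tilde, lam = sol[:p], sol[p:]

assert np.allclose(A @ sol, rhs)         # the RNE really are consistent
assert np.allclose(H.T @ beta_tilde, h)  # beta_tilde satisfies the constraint
print("RNE solved; constraint satisfied")
```

Note that X′X alone is singular here (the intercept column is the sum of the indicator columns), yet the bordered RNE system is still consistent, which is exactly the content of Result 3.8.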
Now show that

    N([X′X H; H′ 0]) ⊆ N([X 0; 0 H]).

Suppose

    [ X′X  H ] [ v₁ ]   [ 0 ]
    [ H′   0 ] [ v₂ ] = [ 0 ].

Then

    X′Xv₁ + Hv₂ = 0    (1)

and

    H′v₁ = 0.          (2)

Multiplying (1) on the left by v₁′ gives v₁′X′Xv₁ + v₁′Hv₂ = 0. By (2), v₁′H = 0′. Thus v₁′X′Xv₁ = 0.

Now v₁′X′Xv₁ = ‖Xv₁‖² = 0 ⇒ Xv₁ = 0. Thus, (1) becomes X′0 + Hv₂ = 0 ⇒ Hv₂ = 0.

    ∴ [X 0; 0 H][v₁; v₂] = [0; 0],

and it follows that N([X′X H; H′ 0]) ⊆ N([X 0; 0 H]). □

Result 3.9: If β̃ is the first p components of a solution to the RNE, then β̃ minimizes

    Q(b) = (y − Xb)′(y − Xb) = ‖y − Xb‖²

over b satisfying H′b = h.

Proof of Result 3.9: Suppose b is any vector satisfying H′b = h. Then

    Q(b) = ‖y − Xb‖²
         = ‖y − Xβ̃ + X(β̃ − b)‖²
         = ‖y − Xβ̃‖² + ‖X(β̃ − b)‖² + 2(β̃ − b)′X′(y − Xβ̃).

Half the cross-product term is (β̃ − b)′X′(y − Xβ̃). Because β̃ satisfies the RNE, we have X′Xβ̃ + Hλ = X′y, so X′y − X′Xβ̃ = Hλ. Thus half the cross-product term is

    (β̃ − b)′Hλ = λ′H′(β̃ − b) = λ′(H′β̃ − H′b) = λ′(h − h) = 0.

Thus, we have

    Q(b) = ‖y − Xβ̃‖² + ‖X(β̃ − b)‖².

∴ Q(β̃) ≤ Q(b), with equality iff Xβ̃ = Xb. □

Result 3.10: If β̃ satisfies H′β̃ = h and Q(β̃) ≤ Q(b) ∀ b such that H′b = h, then β̃ is the first p components of a solution to the RNE.

Proof of Result 3.10: Let β∗ denote the first p components of any solution to the RNE. By Result 3.9, β∗ also minimizes Q over b satisfying H′b = h, so Q(β∗) = Q(β̃); by the equality condition in the proof of Result 3.9, it follows that Xβ∗ = Xβ̃.
Thus, with λ taken from that solution,

    X′y = X′Xβ∗ + Hλ = X′Xβ̃ + Hλ,

and H′β̃ = h by assumption, so that [β̃; λ] solves the RNE. □
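Results 3.9 and 3.10 together say that solving the RNE and directly minimizing Q over the feasible set are equivalent, with a common (unique) vector of fitted values Xβ̃. The sketch below checks this for the one-way ANOVA example (assumed values t = 3, n_i = 2), minimizing Q directly via the parametrization b = b∗ + Wz, which turns the constrained problem into an unconstrained least squares problem in z:

```python
import numpy as np

rng = np.random.default_rng(2)

# One-way ANOVA with the sum-to-zero constraint (t = 3, n_i = 2).
t, n_i = 3, 2
X = np.hstack([np.ones((t * n_i, 1)), np.kron(np.eye(t), np.ones((n_i, 1)))])
p = X.shape[1]
H = np.vstack([[0.0], np.ones((t, 1))])
h = np.array([0.0])
y = rng.standard_normal(t * n_i)

# Route 1: solve the RNE (Result 3.9).
A = np.block([[X.T @ X, H], [H.T, np.zeros((1, 1))]])
sol, *_ = np.linalg.lstsq(A, np.concatenate([X.T @ y, h]), rcond=None)
beta_rne = sol[:p]

# Route 2: minimize Q directly over {b_star + W z}; substituting the
# parametrization gives ordinary least squares in z.
Hp_ginv = np.linalg.pinv(H.T)
b_star = Hp_ginv @ h
W = np.eye(p) - Hp_ginv @ H.T
z_hat, *_ = np.linalg.lstsq(X @ W, y - X @ b_star, rcond=None)
beta_direct = b_star + W @ z_hat

# Both minimizers give the same fitted values (Results 3.9 / 3.10),
# and the RNE solution beats any random feasible point.
assert np.allclose(X @ beta_rne, X @ beta_direct)
Q = lambda b: np.sum((y - X @ b) ** 2)
for _ in range(5):
    b = b_star + W @ rng.standard_normal(p)
    assert Q(beta_rne) <= Q(b) + 1e-10
print("restricted least squares minimizer verified")
```

The two computed coefficient vectors also agree here because, with this constraint, the restricted minimizer itself is unique; in general only Xβ̃ is guaranteed to be unique.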