The Aitken Model
© 2012 Dan Nettleton (Iowa State University), Statistics 611

The Aitken Model (AM):

Suppose $y = X\beta + \varepsilon$, where $E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2 V$ for some $\sigma^2 > 0$ and some known positive definite matrix $V$.

Because $\sigma^2 V$ is a variance matrix, $V$ is symmetric and positive definite. Therefore, there exists a symmetric and positive definite matrix $V^{1/2}$ such that $V^{1/2}V^{1/2} = V$; moreover, $V^{1/2}$ is nonsingular, and we write $V^{-1/2} \equiv (V^{1/2})^{-1}$.

It follows that under the AM,
$$V^{-1/2}y = V^{-1/2}X\beta + V^{-1/2}\varepsilon \iff z = U\beta + \delta,$$
where $z = V^{-1/2}y$, $U = V^{-1/2}X$, and $\delta = V^{-1/2}\varepsilon$, with $E(\delta) = 0$ and
$$\mathrm{Var}(\delta) = V^{-1/2}(\sigma^2 V)V^{-1/2} = \sigma^2 V^{-1/2}V^{1/2}V^{1/2}V^{-1/2} = \sigma^2 I.$$

Thus, the AM for $y$ is equivalent to the GMM for $z = V^{-1/2}y$.

Estimability in the AM:

The AM is just a special case of the GLM. Thus, as before, $c'\beta$ is estimable iff $c \in \mathcal{C}(X')$.

Note that $\mathcal{C}(X') = \mathcal{C}(X'V^{-1/2}) = \mathcal{C}((V^{-1/2}X)') = \mathcal{C}(U')$. Thus, $c \in \mathcal{C}(X') \iff c \in \mathcal{C}(U')$.

Let $\mathcal{L}_y$ be the collection of all estimators that are linear in $y$, and let $\mathcal{L}_z$ be the collection of all estimators that are linear in $z = V^{-1/2}y$. Show that $\mathcal{L}_y = \mathcal{L}_z$.

Proof: Let $d + a'y$ be an arbitrary linear estimator in $\mathcal{L}_y$. Then
$$d + a'y = d + a'V^{1/2}V^{-1/2}y = d + a'V^{1/2}z = d + h'z \in \mathcal{L}_z,$$
where $h' = a'V^{1/2}$. Thus, $\mathcal{L}_y \subseteq \mathcal{L}_z$.

Now suppose $d + a'z$ is an arbitrary linear estimator in $\mathcal{L}_z$. Then
$$d + a'z = d + a'V^{-1/2}y = d + h'y \in \mathcal{L}_y,$$
where $h' = a'V^{-1/2}$.
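The square-root construction and the transformation to the GMM are easy to check numerically. Below is a small sketch (assuming NumPy; the matrix $V$ is made up for illustration) that builds $V^{1/2}$ and $V^{-1/2}$ from the spectral decomposition of $V$ and verifies the properties used above:

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical symmetric positive definite V (illustration only).
A = rng.normal(size=(4, 4))
V = A @ A.T + 4 * np.eye(4)

# Spectral decomposition V = Q diag(w) Q' with w > 0.
w, Q = np.linalg.eigh(V)

# Symmetric square root and its inverse.
V_half = Q @ np.diag(np.sqrt(w)) @ Q.T          # V^{1/2}
V_neg_half = Q @ np.diag(1 / np.sqrt(w)) @ Q.T  # V^{-1/2} = (V^{1/2})^{-1}

# Properties used in the derivation above.
assert np.allclose(V_half, V_half.T)                        # V^{1/2} symmetric
assert np.allclose(V_half @ V_half, V)                      # V^{1/2} V^{1/2} = V
assert np.allclose(V_neg_half @ V @ V_neg_half, np.eye(4))  # so Var(delta) = sigma^2 I
```

With these pieces in hand, $z = V^{-1/2}y$ and $U = V^{-1/2}X$ satisfy a GMM with $\mathrm{Var}(\delta) = \sigma^2 I$.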
Therefore $\mathcal{L}_z \subseteq \mathcal{L}_y$, and it follows that $\mathcal{L}_y = \mathcal{L}_z$. $\square$

Estimating E(y) under the Aitken Model:

Consider
$$Q_{GLS}(b) = (y - Xb)'V^{-1}(y - Xb).$$
Finding $\hat\beta_{GLS}$ that minimizes $Q_{GLS}(b)$ over $b \in \mathbb{R}^p$ is a Generalized Least Squares (GLS) problem. If $Q_{GLS}(\hat\beta_{GLS}) \le Q_{GLS}(b)$ for all $b \in \mathbb{R}^p$, then $\hat\beta_{GLS}$ is a solution to the GLS problem, and $X\hat\beta_{GLS}$ is known as the GLS estimator of $E(y)$.

Show that $\hat\beta_{GLS}$ minimizes $Q_{GLS}(b)$ over $b \in \mathbb{R}^p$ iff $\hat\beta_{GLS}$ solves
$$X'V^{-1}Xb = X'V^{-1}y.$$
These equations are known as the Aitken Equations (AE).

Proof:
$$(y - Xb)'V^{-1}(y - Xb) = (y - Xb)'V^{-1/2}V^{-1/2}(y - Xb) = \big(V^{-1/2}(y - Xb)\big)'\big(V^{-1/2}(y - Xb)\big) = (z - Ub)'(z - Ub).$$
By Result 2.3, $(z - Ub)'(z - Ub)$ is minimized at $b^*$ iff $b^*$ solves the NE $U'Ub = U'z$. Now $U'Ub = U'z$ is equivalent to
$$(V^{-1/2}X)'(V^{-1/2}X)b = (V^{-1/2}X)'(V^{-1/2}y) \iff X'V^{-1/2}V^{-1/2}Xb = X'V^{-1/2}V^{-1/2}y \iff X'V^{-1}Xb = X'V^{-1}y.$$
Therefore $(y - Xb)'V^{-1}(y - Xb)$ is minimized over $b \in \mathbb{R}^p$ by $b^*$ iff $b^*$ solves the AE $X'V^{-1}Xb = X'V^{-1}y$. $\square$

Henceforth, we will use $\hat\beta_{GLS}$ to denote a solution to the AE. We will use $\hat\beta$ or $\hat\beta_{OLS}$ to denote a solution to the NE $X'Xb = X'y$
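The equivalence just proved can be illustrated numerically: a solution of the AE should minimize $Q_{GLS}$. A sketch with simulated data (NumPy assumed; the design matrix, $V$, and $\beta$ are all made up):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 30, 3
X = rng.normal(size=(n, p))                 # hypothetical full-rank design
V = np.diag(rng.uniform(0.5, 2.0, size=n))  # hypothetical known p.d. V
beta = np.array([1.0, -2.0, 0.5])
y = X @ beta + rng.multivariate_normal(np.zeros(n), V)

Vinv = np.linalg.inv(V)

# Solve the Aitken Equations X' V^{-1} X b = X' V^{-1} y.
beta_gls = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)

def Q_gls(b):
    """Generalized least squares criterion (y - Xb)' V^{-1} (y - Xb)."""
    r = y - X @ b
    return r @ Vinv @ r

# No nearby b does better than the AE solution.
for _ in range(200):
    b = beta_gls + rng.normal(scale=0.1, size=p)
    assert Q_gls(beta_gls) <= Q_gls(b) + 1e-10
```

Because this $X$ has full column rank, `np.linalg.solve` suffices; with a rank-deficient design one would use a generalized inverse instead.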
(Ordinary Least Squares).

Because of the equivalence between the AE $X'V^{-1}Xb = X'V^{-1}y$ and the NE $U'Ub = U'z$, we know a solution to the AE is
$$(U'U)^-U'z = (X'V^{-1}X)^-X'V^{-1}y.$$

Theorem 4.2 (Aitken Theorem): Suppose the Aitken Model holds. If $c'\beta$ is estimable, then $c'\hat\beta_{GLS}$ is the BLUE of $c'\beta$.

Proof: By Theorem 4.1, the BLUE of $c'\beta$ is
$$c'(U'U)^-U'z = c'(X'V^{-1}X)^-X'V^{-1}y = c'\hat\beta_{GLS}. \quad \square$$
See also Exercises 4.22 and 4.23.

Suppose the AM holds and $c'\beta$ is estimable. Find $\mathrm{Var}(c'\hat\beta_{GLS})$.

We know $c'\beta$ is estimable under the AM $y = X\beta + \varepsilon$ if and only if $c'\beta$ is estimable under the GMM $z = U\beta + \delta$. Furthermore, we know
$$c'\hat\beta_{GLS} = c'(X'V^{-1}X)^-X'V^{-1}y = c'(U'U)^-U'z.$$
Thus,
$$\mathrm{Var}(c'\hat\beta_{GLS}) = \mathrm{Var}\big(c'(U'U)^-U'z\big) = \sigma^2 c'(U'U)^-c = \sigma^2 c'\big((V^{-1/2}X)'(V^{-1/2}X)\big)^-c = \sigma^2 c'\big(X'(V^{-1/2})'V^{-1/2}X\big)^-c = \sigma^2 c'(X'V^{-1/2}V^{-1/2}X)^-c = \sigma^2 c'(X'V^{-1}X)^-c.$$

Estimation of σ² under the Aitken Model:

An unbiased estimator of $\sigma^2$ is
$$\frac{z'(I - P_U)z}{n - r},$$
based on our previous result for the GMM, where $r = \mathrm{rank}(U) = \mathrm{rank}(X)$ since $V^{-1/2}$ is nonsingular.

Now, note that
$$z'(I - P_U)z = z'(I - P_U)'(I - P_U)z = \|(I - P_U)z\|^2 = \|z - P_Uz\|^2 = \|z - U(U'U)^-U'z\|^2 = \|z - U\hat\beta_{GLS}\|^2 = \|V^{-1/2}y - V^{-1/2}X\hat\beta_{GLS}\|^2 = \|V^{-1/2}(y - X\hat\beta_{GLS})\|^2 = (y - X\hat\beta_{GLS})'V^{-1}(y - X\hat\beta_{GLS}).$$
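The chain of equalities ending in $z'(I - P_U)z = (y - X\hat\beta_{GLS})'V^{-1}(y - X\hat\beta_{GLS})$ can be confirmed numerically. A sketch with simulated data (NumPy assumed; a diagonal $V$ is used so that $V^{-1/2}$ is immediate):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 20, 2
X = rng.normal(size=(n, p))           # hypothetical design matrix
d = rng.uniform(0.5, 3.0, size=n)
V_inv = np.diag(1 / d)                # V = diag(d), so V^{-1/2} = diag(1/sqrt(d))
V_neg_half = np.diag(1 / np.sqrt(d))
y = rng.normal(size=n)

# Transformed-model quantities.
z = V_neg_half @ y
U = V_neg_half @ X
P_U = U @ np.linalg.pinv(U.T @ U) @ U.T  # pinv is one choice of generalized inverse

# GLS fit from the Aitken Equations.
beta_gls = np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ y)
resid = y - X @ beta_gls

lhs = z @ (np.eye(n) - P_U) @ z          # z'(I - P_U)z
rhs = resid @ V_inv @ resid              # (y - Xb)' V^{-1} (y - Xb)
assert np.isclose(lhs, rhs)
```

Dividing either quantity by $n - r$ (here $r = 2$) gives the same unbiased estimator of $\sigma^2$.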
Thus,
$$\hat\sigma^2_{GLS} \equiv \frac{(y - X\hat\beta_{GLS})'V^{-1}(y - X\hat\beta_{GLS})}{n - r}$$
is an unbiased estimator of $\sigma^2$ under the AM.

A Simple Example:

Suppose for $i = 1, \ldots, n$, $y_i = \beta x_i + \varepsilon_i$, where $\varepsilon_1, \ldots, \varepsilon_n$ are uncorrelated, $E(\varepsilon_i) = 0$, and $\mathrm{Var}(\varepsilon_i) = \sigma^2 x_i$ with each $x_i > 0$. Find the BLUE of $\beta$ and an unbiased estimator of $\sigma^2$.

We have $X = x$ and
$$V = \mathrm{diag}(x) = \begin{bmatrix} x_1 & 0 & \cdots & 0 \\ 0 & x_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & x_n \end{bmatrix}.$$
Then
$$X'V^{-1}X = x'\,\mathrm{diag}(1/x)\,x = \mathbf{1}'x = \sum_{i=1}^n x_i \quad\text{and}\quad X'V^{-1}y = x'\,\mathrm{diag}(1/x)\,y = \mathbf{1}'y = \sum_{i=1}^n y_i.$$
Therefore,
$$\hat\beta_{GLS} = (X'V^{-1}X)^-X'V^{-1}y = \frac{\sum_{i=1}^n y_i}{\sum_{i=1}^n x_i}$$
is the BLUE of $\beta$.

Note that to find $\hat\beta_{GLS}$ in this simple example, we solve a weighted least squares problem; i.e., $\hat\beta_{GLS}$ minimizes
$$Q_{GLS}(b) = (y - Xb)'V^{-1}(y - Xb) = (y - xb)'\,\mathrm{diag}(1/x)\,(y - xb) = \sum_{i=1}^n \frac{1}{x_i}(y_i - bx_i)^2.$$
The weights in this case are $1/x_i$ $(i = 1, \ldots, n)$. Thus, the estimator pays more attention to $(y_i - bx_i)^2$ when $x_i$ is small.

Because $\hat\beta_{GLS} = \bar y / \bar x$ and $r = \mathrm{rank}(x) = 1$ here,
$$\hat\sigma^2_{GLS} = \frac{(y - X\hat\beta_{GLS})'V^{-1}(y - X\hat\beta_{GLS})}{n - r} = \frac{(y - X\hat\beta_{GLS})'\,\mathrm{diag}(1/x)\,(y - X\hat\beta_{GLS})}{n - r} = \frac{\sum_{i=1}^n \frac{1}{x_i}\big(y_i - x_i\,\frac{\bar y}{\bar x}\big)^2}{n - 1}.$$

Find $\mathrm{Var}(\hat\beta_{GLS})$ for this example. With $c = 1$,
$$\mathrm{Var}(\hat\beta_{GLS}) = \sigma^2 c'(X'V^{-1}X)^-c = \sigma^2 \cdot 1 \cdot (x'\,\mathrm{diag}(1/x)\,x)^- \cdot 1 = \sigma^2\Big(\sum_{i=1}^n x_i\Big)^{-1} = \frac{\sigma^2}{\sum_{i=1}^n x_i}.$$
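The closed forms in this example are easy to check against the general GLS formulas. A sketch with simulated data (NumPy assumed; the values $\beta = 1.5$ and $\sigma^2 = 2$ are made up):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
x = rng.uniform(1.0, 5.0, size=n)
sigma2 = 2.0

# y_i = beta * x_i + eps_i with Var(eps_i) = sigma^2 * x_i.
y = 1.5 * x + rng.normal(scale=np.sqrt(sigma2 * x))

# General GLS formula with X = x and V = diag(x) ...
V_inv = np.diag(1 / x)
beta_gls = (x @ V_inv @ y) / (x @ V_inv @ x)

# ... collapses to sum(y) / sum(x) = ybar / xbar.
assert np.isclose(beta_gls, y.sum() / x.sum())
assert np.isclose(beta_gls, y.mean() / x.mean())

# Unbiased estimator of sigma^2 (r = rank(x) = 1).
resid = y - beta_gls * x
sigma2_hat = (resid @ V_inv @ resid) / (n - 1)
```

A single simulated `sigma2_hat` will of course scatter around the true $\sigma^2$; unbiasedness is a statement about its average over repeated samples.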
Alternatively,
$$\mathrm{Var}(\hat\beta_{GLS}) = \mathrm{Var}\Big(\frac{\sum_{i=1}^n y_i}{\sum_{i=1}^n x_i}\Big) = \frac{1}{\big(\sum_{i=1}^n x_i\big)^2}\,\mathrm{Var}\Big(\sum_{i=1}^n y_i\Big) = \frac{1}{\big(\sum_{i=1}^n x_i\big)^2}\sum_{i=1}^n \mathrm{Var}(y_i) = \frac{\sum_{i=1}^n \sigma^2 x_i}{\big(\sum_{i=1}^n x_i\big)^2} = \sigma^2\,\frac{\sum_{i=1}^n x_i}{\big(\sum_{i=1}^n x_i\big)^2} = \frac{\sigma^2}{\sum_{i=1}^n x_i}.$$

Find $\hat\beta_{OLS}$ for this example.
$$\hat\beta_{OLS} = (X'X)^{-1}X'y = (x'x)^{-1}x'y = \frac{\sum_{i=1}^n x_iy_i}{\sum_{i=1}^n x_i^2}.$$

Find $\mathrm{Var}(\hat\beta_{OLS})$ in this example.
$$\mathrm{Var}(\hat\beta_{OLS}) = \mathrm{Var}\big((X'X)^{-1}X'y\big) = (X'X)^{-1}X'(\sigma^2V)X(X'X)^{-1} = \sigma^2(x'x)^{-1}x'\,\mathrm{diag}(x)\,x\,(x'x)^{-1} = \sigma^2\,\frac{\sum_{i=1}^n x_i^3}{\big(\sum_{i=1}^n x_i^2\big)^2}.$$

Alternatively,
$$\mathrm{Var}(\hat\beta_{OLS}) = \mathrm{Var}\Big(\frac{\sum_{i=1}^n x_iy_i}{\sum_{i=1}^n x_i^2}\Big) = \frac{1}{\big(\sum_{i=1}^n x_i^2\big)^2}\sum_{i=1}^n x_i^2\,\mathrm{Var}(y_i) = \sigma^2\,\frac{\sum_{i=1}^n x_i^3}{\big(\sum_{i=1}^n x_i^2\big)^2}.$$

Finally,
$$\mathrm{Var}(\hat\beta_{OLS}) = \sigma^2\,\frac{\sum_{i=1}^n x_i^3}{\big(\sum_{i=1}^n x_i^2\big)^2} \ge \frac{\sigma^2}{\sum_{i=1}^n x_i} = \mathrm{Var}(\hat\beta_{GLS}),$$
with equality iff $x_1 = \cdots = x_n$; i.e., iff the GMM holds. (The inequality is Cauchy–Schwarz: $\big(\sum_i x_i^2\big)^2 = \big(\sum_i x_i^{1/2}x_i^{3/2}\big)^2 \le \big(\sum_i x_i\big)\big(\sum_i x_i^3\big)$, with equality iff all $x_i$ are equal.)
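The variance comparison can be spot-checked for any particular $x$. A small sketch (NumPy assumed; the vector $x$ is arbitrary, made up for illustration):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
sigma2 = 1.0

var_gls = sigma2 / x.sum()                         # sigma^2 / sum(x_i)
var_ols = sigma2 * (x**3).sum() / (x**2).sum()**2  # sigma^2 sum(x_i^3) / (sum x_i^2)^2

# OLS is never more efficient than GLS under the Aitken model ...
assert var_ols >= var_gls

# ... with equality exactly when all x_i are equal (the GMM case).
x_eq = np.full(4, 2.5)
var_gls_eq = sigma2 / x_eq.sum()
var_ols_eq = sigma2 * (x_eq**3).sum() / (x_eq**2).sum()**2
assert np.isclose(var_ols_eq, var_gls_eq)
```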