Estimating Estimable Functions of β

Copyright 2012 Dan Nettleton (Iowa State University), Statistics 511


The Response Depends on β Only through Xβ

In the Gauss-Markov or Normal Theory Gauss-Markov Linear Model, the distribution of y depends on β only through Xβ, i.e., y ∼ (Xβ, σ²I) or y ∼ N(Xβ, σ²I).

If X is not of full column rank, there are infinitely many vectors in the set {b : Xb = Xβ} for any fixed value of β. Thus, no matter what the value of E(y), there will be infinitely many vectors b such that Xb = E(y) when X is not of full column rank.

The response vector y can help us learn about E(y) = Xβ, but when X is not of full column rank, there is no hope of learning about β alone unless additional information about β is available.


Treatment Effects Model

Researchers randomly assigned a total of six experimental units to two treatments and measured a response of interest. The model is

y_ij = μ + τ_i + ε_ij,  i = 1, 2;  j = 1, 2, 3,

or, in matrix form,

\[
\begin{bmatrix} y_{11}\\ y_{12}\\ y_{13}\\ y_{21}\\ y_{22}\\ y_{23} \end{bmatrix}
=
\begin{bmatrix} \mu+\tau_1\\ \mu+\tau_1\\ \mu+\tau_1\\ \mu+\tau_2\\ \mu+\tau_2\\ \mu+\tau_2 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_{11}\\ \varepsilon_{12}\\ \varepsilon_{13}\\ \varepsilon_{21}\\ \varepsilon_{22}\\ \varepsilon_{23} \end{bmatrix}
=
\begin{bmatrix} 1 & 1 & 0\\ 1 & 1 & 0\\ 1 & 1 & 0\\ 1 & 0 & 1\\ 1 & 0 & 1\\ 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \mu\\ \tau_1\\ \tau_2 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_{11}\\ \varepsilon_{12}\\ \varepsilon_{13}\\ \varepsilon_{21}\\ \varepsilon_{22}\\ \varepsilon_{23} \end{bmatrix}.
\]


Treatment Effects Model (continued)

In this case, it makes no sense to estimate β = [μ, τ₁, τ₂]′ because there are multiple (infinitely many, in fact) choices of β that define the same mean for y. For example,

\[
\begin{bmatrix} \mu\\ \tau_1\\ \tau_2 \end{bmatrix}
=
\begin{bmatrix} 5\\ -1\\ 1 \end{bmatrix},\quad
\begin{bmatrix} 0\\ 4\\ 6 \end{bmatrix},\quad \text{or}\quad
\begin{bmatrix} 999\\ -995\\ -993 \end{bmatrix}
\]

all yield the same Xβ = E(y). When multiple values for β define the same E(y), we say that β is non-estimable.


Estimable Functions of β

A linear function of β, Cβ, is said to be estimable if there is a linear function of y, Ay, that is an unbiased estimator of Cβ. Otherwise, Cβ is said to be non-estimable.

Note that Ay is an unbiased estimator of Cβ if and only if

E(Ay) = Cβ ∀ β ∈ ℝᵖ ⟺ AXβ = Cβ ∀ β ∈ ℝᵖ ⟺ AX = C.

This says that we can estimate Cβ as long as Cβ = AXβ = AE(y) for some A, i.e., as long as Cβ is a linear function of E(y). The bottom line is that we can always estimate E(y) and all linear functions of E(y); all other linear functions of β are non-estimable.


Treatment Effects Model (continued)

\[
E(y) = X\beta =
\begin{bmatrix} 1 & 1 & 0\\ 1 & 1 & 0\\ 1 & 1 & 0\\ 1 & 0 & 1\\ 1 & 0 & 1\\ 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \mu\\ \tau_1\\ \tau_2 \end{bmatrix}
=
\begin{bmatrix} \mu+\tau_1\\ \mu+\tau_1\\ \mu+\tau_1\\ \mu+\tau_2\\ \mu+\tau_2\\ \mu+\tau_2 \end{bmatrix}
\]

implies that

[1, 0, 0, 0, 0, 0]Xβ = [1, 1, 0]β = μ + τ₁,
[0, 0, 0, 1, 0, 0]Xβ = [1, 0, 1]β = μ + τ₂, and
[1, 0, 0, −1, 0, 0]Xβ = [0, 1, −1]β = τ₁ − τ₂

are estimable functions of β.


Estimating Estimable Functions of β

If Cβ is estimable, then there exists a matrix A such that C = AX and Cβ = AXβ = AE(y) for any β ∈ ℝᵖ. It therefore makes sense to estimate Cβ = AXβ = AE(y) by

AÊ(y) = Aŷ = AP_X y = AX(X′X)⁻X′y = AX(X′X)⁻X′Xβ̂ = AP_X Xβ̂ = AXβ̂ = Cβ̂.

Cβ̂ is called the Ordinary Least Squares (OLS) estimator of Cβ. Note that although the "hat" is on β, it is Cβ that we are estimating.
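To make this recipe concrete, here is a minimal numerical sketch of the treatment effects example in Python with NumPy; the response values in y are hypothetical and chosen only for illustration.

```python
import numpy as np

# Design matrix for y_ij = mu + tau_i + eps_ij, i = 1, 2; j = 1, 2, 3:
# an intercept column plus one indicator column per treatment.
X = np.column_stack([np.ones(6), np.kron(np.eye(2), np.ones((3, 1)))])
y = np.array([4.1, 3.8, 4.3, 6.2, 5.9, 6.1])  # hypothetical responses

C = np.array([[0.0, 1.0, -1.0]])             # C beta = tau_1 - tau_2
A = np.array([[1.0, 0, 0, -1.0, 0, 0]])      # a choice of A with A X = C
print(np.allclose(A @ X, C))                 # True, so C beta is estimable

# Any generalized inverse of X'X works; pinv gives the Moore-Penrose one.
beta_hat = np.linalg.pinv(X.T @ X) @ X.T @ y  # one normal-equation solution

print(C @ beta_hat)                          # OLS estimate of tau_1 - tau_2
print(A @ X @ np.linalg.pinv(X.T @ X) @ X.T @ y)  # same value via A P_X y
```

With these made-up data, both print statements return ȳ₁· − ȳ₂· ≈ −2.0, as the algebra above predicts.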
Invariance of Cβ̂ to the Choice of β̂

Although there are infinitely many solutions to the normal equations when X is not of full column rank, Cβ̂ is the same for all normal equation solutions β̂ whenever Cβ is estimable. To see this, suppose β̂₁ and β̂₂ are any two solutions to the normal equations, so that X′Xβ̂₁ = X′y = X′Xβ̂₂. Then

Cβ̂₁ = AXβ̂₁ = AP_X Xβ̂₁ = AX(X′X)⁻X′Xβ̂₁ = AX(X′X)⁻X′y = AX(X′X)⁻X′Xβ̂₂ = AP_X Xβ̂₂ = AXβ̂₂ = Cβ̂₂.


Treatment Effects Model (continued)

Suppose our aim is to estimate τ₁ − τ₂. As noted before, the form of E(y) = Xβ implies that

[1, 0, 0, −1, 0, 0]Xβ = [0, 1, −1]β = τ₁ − τ₂.

Thus, we can compute the OLS estimator of τ₁ − τ₂ as [1, 0, 0, −1, 0, 0]ŷ = [0, 1, −1]β̂, where ŷ = X(X′X)⁻X′y and β̂ is any solution to the normal equations.


Treatment Effects Model (continued)

The normal equations X′Xb = X′y in this case are

\[
\begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1\\ 1 & 1 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 0\\ 1 & 1 & 0\\ 1 & 1 & 0\\ 1 & 0 & 1\\ 1 & 0 & 1\\ 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} b_1\\ b_2\\ b_3 \end{bmatrix}
=
\begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1\\ 1 & 1 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} y_{11}\\ y_{12}\\ y_{13}\\ y_{21}\\ y_{22}\\ y_{23} \end{bmatrix}
\iff
\begin{bmatrix} 6 & 3 & 3\\ 3 & 3 & 0\\ 3 & 0 & 3 \end{bmatrix}
\begin{bmatrix} b_1\\ b_2\\ b_3 \end{bmatrix}
=
\begin{bmatrix} y_{\cdot\cdot}\\ y_{1\cdot}\\ y_{2\cdot} \end{bmatrix}.
\]


Treatment Effects Model (continued)

\[
\hat\beta_1 \equiv \begin{bmatrix} \bar y_{\cdot\cdot}\\ \bar y_{1\cdot} - \bar y_{\cdot\cdot}\\ \bar y_{2\cdot} - \bar y_{\cdot\cdot} \end{bmatrix}
\quad\text{and}\quad
\hat\beta_2 \equiv \begin{bmatrix} 0\\ \bar y_{1\cdot}\\ \bar y_{2\cdot} \end{bmatrix}
\]

are each solutions to the normal equations because

\[
\begin{bmatrix} 6 & 3 & 3\\ 3 & 3 & 0\\ 3 & 0 & 3 \end{bmatrix}
\begin{bmatrix} \bar y_{\cdot\cdot}\\ \bar y_{1\cdot} - \bar y_{\cdot\cdot}\\ \bar y_{2\cdot} - \bar y_{\cdot\cdot} \end{bmatrix}
=
\begin{bmatrix} y_{\cdot\cdot}\\ y_{1\cdot}\\ y_{2\cdot} \end{bmatrix}
=
\begin{bmatrix} 6 & 3 & 3\\ 3 & 3 & 0\\ 3 & 0 & 3 \end{bmatrix}
\begin{bmatrix} 0\\ \bar y_{1\cdot}\\ \bar y_{2\cdot} \end{bmatrix}.
\]

Thus, the OLS estimator of Cβ = [0, 1, −1]β = τ₁ − τ₂ is

\[
C\hat\beta_1 = [0, 1, -1]\begin{bmatrix} \bar y_{\cdot\cdot}\\ \bar y_{1\cdot} - \bar y_{\cdot\cdot}\\ \bar y_{2\cdot} - \bar y_{\cdot\cdot} \end{bmatrix}
= \bar y_{1\cdot} - \bar y_{2\cdot}
= [0, 1, -1]\begin{bmatrix} 0\\ \bar y_{1\cdot}\\ \bar y_{2\cdot} \end{bmatrix}
= C\hat\beta_2.
\]


Treatment Effects Model (continued)

Let

\[
(X'X)^-_1 = \begin{bmatrix} 1/6 & 0 & 0\\ 0 & 1/6 & -1/6\\ 0 & -1/6 & 1/6 \end{bmatrix}
\quad\text{and}\quad
(X'X)^-_2 = \begin{bmatrix} 0 & 0 & 0\\ 0 & 1/3 & 0\\ 0 & 0 & 1/3 \end{bmatrix}.
\]

It is straightforward to verify that (X′X)⁻₁ and (X′X)⁻₂ are each generalized inverses of X′X. It is also easy to show that β̂₁ = (X′X)⁻₁X′y and β̂₂ = (X′X)⁻₂X′y; both claims are checked numerically in the sketch below.


Treatment Effects Model (continued)

Taking (X′X)⁻ = (X′X)⁻₂, for example (any choice gives the same result),

\[
P_X = X(X'X)^-X' =
\begin{bmatrix} 1 & 1 & 0\\ 1 & 1 & 0\\ 1 & 1 & 0\\ 1 & 0 & 1\\ 1 & 0 & 1\\ 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 0 & 0 & 0\\ 0 & 1/3 & 0\\ 0 & 0 & 1/3 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1\\ 1 & 1 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 1 & 1 \end{bmatrix}
=
\begin{bmatrix}
1/3 & 1/3 & 1/3 & 0 & 0 & 0\\
1/3 & 1/3 & 1/3 & 0 & 0 & 0\\
1/3 & 1/3 & 1/3 & 0 & 0 & 0\\
0 & 0 & 0 & 1/3 & 1/3 & 1/3\\
0 & 0 & 0 & 1/3 & 1/3 & 1/3\\
0 & 0 & 0 & 1/3 & 1/3 & 1/3
\end{bmatrix}.
\]


Treatment Effects Model (continued)

Thus,

\[
\widehat{E(y)} = \hat y = P_X y =
\begin{bmatrix}
1/3 & 1/3 & 1/3 & 0 & 0 & 0\\
1/3 & 1/3 & 1/3 & 0 & 0 & 0\\
1/3 & 1/3 & 1/3 & 0 & 0 & 0\\
0 & 0 & 0 & 1/3 & 1/3 & 1/3\\
0 & 0 & 0 & 1/3 & 1/3 & 1/3\\
0 & 0 & 0 & 1/3 & 1/3 & 1/3
\end{bmatrix}
\begin{bmatrix} y_{11}\\ y_{12}\\ y_{13}\\ y_{21}\\ y_{22}\\ y_{23} \end{bmatrix}
=
\begin{bmatrix} \bar y_{1\cdot}\\ \bar y_{1\cdot}\\ \bar y_{1\cdot}\\ \bar y_{2\cdot}\\ \bar y_{2\cdot}\\ \bar y_{2\cdot} \end{bmatrix}
\]

is our OLS estimator of E(y) = Xβ = [μ + τ₁, μ + τ₁, μ + τ₁, μ + τ₂, μ + τ₂, μ + τ₂]′.
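The claims above, and the invariance result, can all be checked numerically. The following sketch (again with hypothetical responses) verifies the generalized-inverse condition for (X′X)⁻₁ and (X′X)⁻₂, computes the two distinct solutions β̂₁ and β̂₂, and confirms that they agree on Cβ̂ and on ŷ.

```python
import numpy as np

X = np.column_stack([np.ones(6), np.kron(np.eye(2), np.ones((3, 1)))])
y = np.array([4.1, 3.8, 4.3, 6.2, 5.9, 6.1])  # hypothetical responses
XtX = X.T @ X

G1 = np.array([[1/6, 0, 0], [0, 1/6, -1/6], [0, -1/6, 1/6]])
G2 = np.array([[0, 0, 0], [0, 1/3, 0], [0, 0, 1/3]])

# Generalized-inverse condition X'X G X'X = X'X holds for both:
print(np.allclose(XtX @ G1 @ XtX, XtX), np.allclose(XtX @ G2 @ XtX, XtX))

beta1 = G1 @ X.T @ y  # [ybar.., ybar1. - ybar.., ybar2. - ybar..]
beta2 = G2 @ X.T @ y  # [0, ybar1., ybar2.]
print(beta1, beta2)   # two different solutions to the normal equations

C = np.array([0.0, 1.0, -1.0])
print(C @ beta1, C @ beta2)  # identical: ybar1. - ybar2.

# P_X is also invariant to the choice of generalized inverse:
print(np.allclose(X @ G1 @ X.T, X @ G2 @ X.T))
print(X @ G2 @ X.T @ y)  # yhat: ybar1. three times, then ybar2. three times
```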
Treatment Effects Model (continued)

Also, we can see that the OLS estimator of

τ₁ − τ₂ = [0, 1, −1][μ, τ₁, τ₂]′ = [1, 0, 0, −1, 0, 0]Xβ = [1, 0, 0, −1, 0, 0]E(y)

is

[1, 0, 0, −1, 0, 0]Ê(y) = [1, 0, 0, −1, 0, 0]ŷ = [1, 0, 0, −1, 0, 0][ȳ₁·, ȳ₁·, ȳ₁·, ȳ₂·, ȳ₂·, ȳ₂·]′ = ȳ₁· − ȳ₂·.


The Gauss-Markov Theorem

Under the Gauss-Markov Linear Model, the OLS estimator c′β̂ of an estimable linear function c′β is the unique Best Linear Unbiased Estimator (BLUE) in the sense that Var(c′β̂) is strictly less than the variance of any other linear unbiased estimator of c′β for all β ∈ ℝᵖ and all σ² ∈ ℝ⁺.

The Gauss-Markov Theorem says that if we want to estimate an estimable linear function c′β using a linear estimator that is unbiased, we should always use the OLS estimator.

In our simple example of the treatment effects model, we could have used y₁₁ − y₂₁ to estimate τ₁ − τ₂. It is easy to see that y₁₁ − y₂₁ is a linear estimator that is unbiased for τ₁ − τ₂, but its variance, Var(y₁₁ − y₂₁) = 2σ², is clearly larger than the variance 2σ²/3 of the OLS estimator ȳ₁· − ȳ₂· (as guaranteed by the Gauss-Markov Theorem).
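A small Monte Carlo sketch can illustrate this comparison; the parameter values below are arbitrary choices for the simulation and not part of the original example.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, tau1, tau2, sigma = 10.0, 2.0, -1.0, 1.5     # arbitrary illustration values
mean = np.repeat([mu + tau1, mu + tau2], 3)      # E(y) under the model

reps = 100_000
y = mean + sigma * rng.standard_normal((reps, 6))  # reps draws of the 6-vector y

naive = y[:, 0] - y[:, 3]                            # y11 - y21
ols = y[:, :3].mean(axis=1) - y[:, 3:].mean(axis=1)  # ybar1. - ybar2.

print(naive.mean(), ols.mean())     # both near tau1 - tau2 = 3 (unbiased)
print(naive.var(), 2 * sigma**2)    # near 2*sigma^2 = 4.5
print(ols.var(), 2 * sigma**2 / 3)  # near 2*sigma^2/3 = 1.5
```

Both estimators are unbiased, but the simulated variance of the OLS estimator is one third that of the naive estimator, matching the algebraic comparison above.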