Reparameterization in Testing c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 1 / 26 Example: Suppose yi = β1 x1i + β2 x2i + εi (i = 1, . . . , n), where i.i.d. ε1 , . . . , εn ∼ N(0, σ 2 ). Consider testing H0 : β1 = β2 . c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 2 / 26 H0 : β1 = β2 ⇐⇒ H0 : Cβ = 0, where C = [1, −1], c Copyright 2012 Dan Nettleton (Iowa State University) and β = " # β1 β2 . Statistics 611 3 / 26 Let xj1 . . xj = . xjn for j = 1, 2. Then X = [x1 , x2 ]. " Suppose 1 # −1 ∈ C(X0 ) so that H0 : β1 = β2 is testable. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 4 / 26 Provide the RNE for the restriction imposed by the null hypothesis. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 5 / 26 x1 y b1 x01 x1 x01 x2 1 0 0 x x1 x x2 −1 b2 = x2 y 2 2 0 λ 1 −1 0 #" # " # " 0 X X C0 b X0 y = . C 0 λ d c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 6 / 26 One way to find the BLUE of β subject to β1 = β2 is to solve these equations. Can you think of an easier way? c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 7 / 26 If β1 = β2 , we can simply the model to yi = βx1i + βx2i + εi = β(x1i + x2i ) + εi . y = [x1 + x2 ]β + ε. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 8 / 26 The BLUE of β is [(x1 + x2 )0 (x1 + x2 )]−1 (x1 + x2 )0 y x01 y + x02 y = 0 x1 x1 + 2x01 x2 + x02 x2 ≡ β̂. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 9 / 26 Thus, the BLUE of β subject to the constraint β1 = β2 is " # β̂ " # 1 x01 y + x02 y = 0 . 0 0 x1 x1 + 2x1 x2 + x2 x2 1 β̂ c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 10 / 26 It is straightforward to verify that " # β̂ β̂ is the leading subvector of a solution to the RNE. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 11 / 26 This is a special case of a general reparametrization strategy. Suppose H0 : Cβ = d is testable. The set of al β satisfying H0 is Θ0 = {C− d + (I − C− C)γ : γ ∈ Rp }. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 12 / 26 Thus, y = Xβ + ε, where β is constrained to H0 : Cβ = d is equivalent to y = X[C− d + (I − C− C)γ] + ε, γ ∈ Rp ⇐⇒ y − XC− d = X(I − C− C)γ + ε, c Copyright 2012 Dan Nettleton (Iowa State University) γ ∈ Rp . Statistics 611 13 / 26 The model y − XC− d = X(I − C− C)γ + ε, γ ∈ Rp . is unconstrained with response vector y − XC− d and design matrix X(I − C− C). Thus, γ̂ = [(I − C− C)0 X0 X(I − C− C)]− (I − C− C)0 X0 (y − XC− d) solves the unconstrained least squares problem min ky − XC− d − X(I − C− C)γk2 . γ∈Rp c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 14 / 26 Now min ky − XC− d − X(I − C− C)γk2 γ∈Rp ⇐⇒ minp ky − X[C− d + (I − C− C)γ]k2 γ∈R ⇐⇒ min ky − Xβk2 . β∈Θ0 Thus, β̃ = C− d + (I − C− C)γ̂ solves min ky − Xβk2 . β∈Θ0 c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 15 / 26 Show how this works for our simple example. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 16 / 26 In our example, C = [1, −1] and d = 0. A generalized inverse for C is − C = " # 1 . 0 Thus, " − I−C C = # " − 0 1 " = 1 0 0 1 1 −1 0 # 0 # . 0 1 c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 17 / 26 If follows that − X(I − C C) = [x1 , x2 ] " # 0 1 0 1 = [0, x1 + x2 ] and i0 h i h 0 x1 + x2 0 x1 + x2 " # 0 0 = . 0 kx1 + x2 k2 (I − C− C)0 X0 X(I − C− C) = c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 18 / 26 " γ̂ = 0 #− 0 kx1 + x2 k2 " = 0 0 0 0 kx1 + x2 k−2 " # 0 = x0 y+x0 y . 1 h i0 0 x1 + x2 (y − 0) #" 0 # x01 y + x02 y 2 kx1 +x2 k2 c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 19 / 26 Thus, β̃ = 0 + = " #" 0 1 0 1 0 # x01 y+x02 y kx1 +x2 k2 " # x01 y + x02 y 1 , kx1 + x2 k2 1 which matches our previous result. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 20 / 26 In practice, when testing H0 : Cβ = d, we don’t often explicitly solve the RNE. It is more common to reparameterize and carry out an unconstrained maximization for the reparameterized model. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 21 / 26 We then use (SSEReduced − SSEFull ) /(DFR − DFF ) SSEFull /DFF as our test statistics, where SSEReduced is the SSE from the reparameterized model with DFR = n − rank(X(I − C− C)). c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 22 / 26 We have shown that ky − XC− d − X(I − C− C)γk2 = ky − Xβ̃k2 . Thus, SSE for the reparameterized model is Q(β̃), the SSE for the constrained model. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 23 / 26 Furthermore, it can be shown that rank(X(I − C− C)) = rank(X) − rank(C) = r − q. Thus, DF for the SSE in the parameterized model is n−r+q so that DFR − DFF = n − r + q − (n − r) = q. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 24 / 26 It follows that [Q(β̃) − Q(β̂)]/q (SSEReduced − SSEFull ) /(DFR − DFF ) = . SSEFull /DFF Q(β̂)/(n − r) c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 25 / 26 Thus, the reparameterization strategy is yet another way to arrive at the general linear test or, equivalently, the LRT of c Copyright 2012 Dan Nettleton (Iowa State University) H0 : Cβ = d. Statistics 611 26 / 26