Document 10639942

advertisement
Reparameterization in Testing
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
1 / 26
Example:
Suppose
yi = β1 x1i + β2 x2i + εi
(i = 1, . . . , n),
where
i.i.d.
ε1 , . . . , εn ∼ N(0, σ 2 ).
Consider testing
H0 : β1 = β2 .
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
2 / 26
H0 : β1 = β2 ⇐⇒ H0 : Cβ = 0,
where
C = [1, −1],
c
Copyright 2012
Dan Nettleton (Iowa State University)
and β =
" #
β1
β2
.
Statistics 611
3 / 26
Let
 
xj1
.
.
xj = 
.
xjn
for j = 1, 2.
Then
X = [x1 , x2 ].
"
Suppose
1
#
−1
∈ C(X0 ) so that
H0 : β1 = β2
is testable.
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
4 / 26
Provide the RNE for the restriction imposed by the null hypothesis.
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
5 / 26
 
 

x1 y
b1
x01 x1 x01 x2 1
 
 

0
0
x x1 x x2 −1 b2  = x2 y
2
 
 
 2
0
λ
1
−1
0
#" #
" #
" 0
X X C0 b
X0 y
=
.
C
0
λ
d
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
6 / 26
One way to find the BLUE of β subject to β1 = β2 is to solve these
equations.
Can you think of an easier way?
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
7 / 26
If β1 = β2 , we can simply the model to
yi = βx1i + βx2i + εi
= β(x1i + x2i ) + εi .
y = [x1 + x2 ]β + ε.
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
8 / 26
The BLUE of β is
[(x1 + x2 )0 (x1 + x2 )]−1 (x1 + x2 )0 y
x01 y + x02 y
= 0
x1 x1 + 2x01 x2 + x02 x2
≡ β̂.
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
9 / 26
Thus, the BLUE of β subject to the constraint β1 = β2 is
" #
β̂
" #
1
x01 y + x02 y
= 0
.
0
0
x1 x1 + 2x1 x2 + x2 x2 1
β̂
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
10 / 26
It is straightforward to verify that
" #
β̂
β̂
is the leading subvector of a
solution to the RNE.
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
11 / 26
This is a special case of a general reparametrization strategy.
Suppose
H0 : Cβ = d
is testable.
The set of al β satisfying H0 is
Θ0 = {C− d + (I − C− C)γ : γ ∈ Rp }.
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
12 / 26
Thus,
y = Xβ + ε,
where β is constrained to H0 : Cβ = d is equivalent to
y = X[C− d + (I − C− C)γ] + ε,
γ ∈ Rp
⇐⇒
y − XC− d = X(I − C− C)γ + ε,
c
Copyright 2012
Dan Nettleton (Iowa State University)
γ ∈ Rp .
Statistics 611
13 / 26
The model
y − XC− d = X(I − C− C)γ + ε,
γ ∈ Rp .
is unconstrained with response vector y − XC− d and design matrix
X(I − C− C).
Thus,
γ̂ = [(I − C− C)0 X0 X(I − C− C)]− (I − C− C)0 X0 (y − XC− d)
solves the unconstrained least squares problem
min ky − XC− d − X(I − C− C)γk2 .
γ∈Rp
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
14 / 26
Now
min ky − XC− d − X(I − C− C)γk2
γ∈Rp
⇐⇒ minp ky − X[C− d + (I − C− C)γ]k2
γ∈R
⇐⇒ min ky − Xβk2 .
β∈Θ0
Thus,
β̃ = C− d + (I − C− C)γ̂
solves
min ky − Xβk2 .
β∈Θ0
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
15 / 26
Show how this works for our simple example.
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
16 / 26
In our example, C = [1, −1] and d = 0.
A generalized inverse for C is
−
C =
" #
1
.
0
Thus,
"
−
I−C C =
#
"
−
0 1
"
=
1 0
0 1
1 −1
0
#
0
#
.
0 1
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
17 / 26
If follows that
−
X(I − C C) = [x1 , x2 ]
"
#
0 1
0 1
= [0, x1 + x2 ]
and
i0 h
i
h
0 x1 + x2 0 x1 + x2
"
#
0
0
=
.
0 kx1 + x2 k2
(I − C− C)0 X0 X(I − C− C) =
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
18 / 26
"
γ̂ =
0
#−
0 kx1 + x2 k2
"
=
0
0
0
0 kx1 + x2 k−2
"
#
0
=
x0 y+x0 y .
1
h
i0
0 x1 + x2 (y − 0)
#"
0
#
x01 y + x02 y
2
kx1 +x2 k2
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
19 / 26
Thus,
β̃ = 0 +
=
"
#"
0 1
0 1
0
#
x01 y+x02 y
kx1 +x2 k2
" #
x01 y + x02 y 1
,
kx1 + x2 k2 1
which matches our previous result.
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
20 / 26
In practice, when testing
H0 : Cβ = d,
we don’t often explicitly solve the RNE.
It is more common to reparameterize and carry out an unconstrained
maximization for the reparameterized model.
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
21 / 26
We then use
(SSEReduced − SSEFull ) /(DFR − DFF )
SSEFull /DFF
as our test statistics, where SSEReduced is the SSE from the
reparameterized model with
DFR = n − rank(X(I − C− C)).
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
22 / 26
We have shown that
ky − XC− d − X(I − C− C)γk2 = ky − Xβ̃k2 .
Thus, SSE for the reparameterized model is Q(β̃), the SSE for the
constrained model.
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
23 / 26
Furthermore, it can be shown that
rank(X(I − C− C)) = rank(X) − rank(C)
= r − q.
Thus, DF for the SSE in the parameterized model is
n−r+q
so that
DFR − DFF = n − r + q − (n − r) = q.
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
24 / 26
It follows that
[Q(β̃) − Q(β̂)]/q
(SSEReduced − SSEFull ) /(DFR − DFF )
=
.
SSEFull /DFF
Q(β̂)/(n − r)
c
Copyright 2012
Dan Nettleton (Iowa State University)
Statistics 611
25 / 26
Thus, the reparameterization strategy is yet another way to arrive at
the general linear test or, equivalently, the LRT of
c
Copyright 2012
Dan Nettleton (Iowa State University)
H0 : Cβ = d.
Statistics 611
26 / 26
Download