Cochran's Theorem and Analysis of Variance
© Copyright 2012 Dan Nettleton (Iowa State University)
Statistics 611
Theorem 5.1 (Cochran's Theorem):

Suppose Y ∼ N(µ, σ²I) and A_1, …, A_k are symmetric and idempotent n × n matrices with

    rank(A_i) = s_i   ∀ i = 1, …, k.

Then

    ∑_{i=1}^k A_i = I (the n × n identity)

implies that (1/σ²) Y′A_iY (i = 1, …, k) are independently distributed as χ²_{s_i}(φ_i), with

    φ_i = (1/(2σ²)) µ′A_iµ   and   ∑_{i=1}^k s_i = n.
Proof of Theorem 5.1:

By Result 5.15,

    (1/σ²) Y′A_iY ∼ χ²_{s_i}(φ_i)   with   φ_i = (1/(2σ²)) µ′A_iµ   ∀ i = 1, …, k,

because (1/σ²) A_i (σ²I) = A_i is idempotent and has rank s_i.
Furthermore,

    ∑_{i=1}^k s_i = ∑_{i=1}^k rank(A_i)
                  = ∑_{i=1}^k trace(A_i)     (rank equals trace for a symmetric idempotent matrix)
                  = trace(∑_{i=1}^k A_i)
                  = trace(I_{n×n})
                  = n.
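The rank-equals-trace step can be checked numerically. Below is a minimal NumPy sketch (not part of the original notes; the matrix X and its dimensions are arbitrary illustrative choices), using the orthogonal projection onto the column space of a full-column-rank X as the symmetric idempotent matrix:

```python
import numpy as np

# Build an orthogonal projection P onto C(X) for an arbitrary
# full-column-rank X; P is symmetric and idempotent, and its
# rank equals its trace (both equal p, the number of columns of X).
rng = np.random.default_rng(0)
n, p = 6, 3
X = rng.standard_normal((n, p))
P = X @ np.linalg.inv(X.T @ X) @ X.T   # projection onto C(X)

sym_ok = np.allclose(P, P.T)           # symmetric
idem_ok = np.allclose(P @ P, P)        # idempotent
rank_P = np.linalg.matrix_rank(P)
trace_P = np.trace(P)
```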
By Lemma 5.1, for each i = 1, …, k there exists an n × s_i matrix G_i such that

    G_iG_i′ = A_i   and   G_i′G_i = I_{s_i × s_i}.

Now let G = [G_1, …, G_k]. Because G_i is n × s_i for each i = 1, …, k and ∑_{i=1}^k s_i = n, it follows that G is n × n.
Moreover,

    GG′ = [G_1, …, G_k] [G_1, …, G_k]′ = ∑_{i=1}^k G_iG_i′ = ∑_{i=1}^k A_i = I.

Thus, the n × n matrix G has G′ as its inverse; i.e., G⁻¹ = G′. Thus G′G = I as well.
Now we have

    I = G′G =
        [ G_1′G_1   G_1′G_2   ⋯   G_1′G_k ]
        [ G_2′G_1   G_2′G_2   ⋯   G_2′G_k ]
        [    ⋮         ⋮      ⋱      ⋮    ]
        [ G_k′G_1   G_k′G_2   ⋯   G_k′G_k ]
∴ G_i′G_j = 0   ∀ i ≠ j
∴ G_iG_i′G_jG_j′ = 0   ∀ i ≠ j
∴ A_iA_j = 0   ∀ i ≠ j
∴ σ²A_iA_j = 0   ∀ i ≠ j
∴ A_i(σ²I)A_j = 0   ∀ i ≠ j
∴ By Corollary 5.4, independence holds for every pair of quadratic forms Y′A_iY/σ² and Y′A_jY/σ².
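The algebraic fact used here, that symmetric idempotent matrices summing to I must satisfy A_iA_j = 0 for i ≠ j, can be illustrated numerically. A sketch with k = 2, taking A_1 to be an arbitrary orthogonal projection (the data below are illustrative only):

```python
import numpy as np

# With A1 a projection and A2 = I - A1, the pair is symmetric,
# idempotent, and sums to I; their product should be the zero matrix.
rng = np.random.default_rng(1)
n = 5
X = rng.standard_normal((n, 2))
A1 = X @ np.linalg.inv(X.T @ X) @ X.T   # projection onto C(X)
A2 = np.eye(n) - A1

sum_ok = np.allclose(A1 + A2, np.eye(n))
prod_norm = np.linalg.norm(A1 @ A2)     # should be ~0
```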
However, we can prove more than pairwise independence. Note that

    [ (G_1′Y)′, …, (G_k′Y)′ ]′ = [G_1, …, G_k]′ Y = G′Y ∼ N(G′µ, σ²I),

since G′(σ²I)G = σ²G′G = σ²I.

By Result 5.4, G_1′Y, …, G_k′Y are mutually independent.
G_1′Y, …, G_k′Y mutually independent
    ⟹ ‖G_1′Y‖², …, ‖G_k′Y‖² mutually independent
    ⟹ Y′G_1G_1′Y, …, Y′G_kG_k′Y mutually independent
    ⟹ Y′A_1Y/σ², …, Y′A_kY/σ² mutually independent.
Example:

Suppose Y_1, …, Y_n are i.i.d. N(µ, σ²).

Find the joint distribution of nȲ·² and ∑_{i=1}^n (Y_i − Ȳ·)².
Let A_1 = P_1 = 1(1′1)⁻1′ = (1/n)11′.
Let A_2 = I − P_1 = I − (1/n)11′.

Then

    rank(A_1) = 1   and   rank(A_2) = n − 1.

Also, A_1 and A_2 are each symmetric and idempotent matrices such that

    A_1 + A_2 = P_1 + I − P_1 = I.

Let Y = (Y_1, …, Y_n)′, so that E(Y) = µ1 and Var(Y) = σ²I.
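A quick numerical check of these claims about A_1 and A_2 (the value n = 4 is an arbitrary choice):

```python
import numpy as np

# A1 = (1/n) 1 1' and A2 = I - A1: both idempotent,
# with ranks 1 and n - 1 respectively.
n = 4
one = np.ones((n, 1))
A1 = one @ one.T / n           # P_1
A2 = np.eye(n) - A1            # I - P_1

idem_ok = np.allclose(A1 @ A1, A1) and np.allclose(A2 @ A2, A2)
r1 = int(np.linalg.matrix_rank(A1))
r2 = int(np.linalg.matrix_rank(A2))
```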
Cochran's Theorem ⟹

    (1/σ²) Y′A_iY ∼ χ²_{s_i}(φ_i), independently,

where s_i = rank(A_i) and

    φ_i = (1/(2σ²)) µ′A_iµ = (µ²/(2σ²)) 1′A_i1.

For i = 1, we have

    (1/σ²) Y′11′Y/n = (n/σ²) Ȳ·²,   s_1 = 1,

and

    φ_1 = (µ²/(2σ²)) 1′(1/n)11′1 = nµ²/(2σ²).
For i = 2, we have

    (1/σ²) Y′A_2Y = (1/σ²) Y′A_2′A_2Y
                  = (1/σ²) ‖A_2Y‖²
                  = (1/σ²) ‖(I − (1/n)11′)Y‖²
                  = (1/σ²) ‖Y − 1Ȳ·‖²
                  = (1/σ²) ∑_{i=1}^n (Y_i − Ȳ·)².
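Both quadratic-form identities in this example can be verified numerically on simulated data (n = 8 is arbitrary):

```python
import numpy as np

# Check Y'A1Y = n*Ybar^2 and Y'A2Y = sum_i (Y_i - Ybar)^2
# for A1 = (1/n) 1 1' and A2 = I - A1.
rng = np.random.default_rng(2)
n = 8
Y = rng.standard_normal(n)
A1 = np.ones((n, n)) / n
A2 = np.eye(n) - A1

ok1 = np.isclose(Y @ A1 @ Y, n * Y.mean() ** 2)
ok2 = np.isclose(Y @ A2 @ Y, np.sum((Y - Y.mean()) ** 2))
```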
Also, s_2 = n − 1 and

    φ_2 = (µ²/(2σ²)) 1′(I − (1/n)11′)1
        = (µ²/(2σ²)) (1′1 − (1/n)1′11′1)
        = (µ²/(2σ²)) (n − n²/n)
        = 0.
Thus,

    nȲ·² ∼ σ²χ²_1(nµ²/(2σ²))

independent of

    ∑_{i=1}^n (Y_i − Ȳ·)² ∼ σ²χ²_{n−1}.
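A Monte Carlo sanity check of the first distributional claim (not part of the original notes): in the parameterization used here, a noncentral chi-square with k degrees of freedom and noncentrality φ has mean k + 2φ, so nȲ·²/σ² should average about 1 + nµ²/σ². The values of n, µ, σ below are arbitrary:

```python
import numpy as np

# Simulate many samples of size n from N(mu, sigma^2) and compare the
# Monte Carlo mean of n*Ybar^2/sigma^2 to 1 + 2*phi_1 = 1 + n*mu^2/sigma^2.
rng = np.random.default_rng(5)
n, mu, sigma, reps = 5, 1.0, 1.0, 200_000
Y = rng.normal(mu, sigma, size=(reps, n))
stat = n * Y.mean(axis=1) ** 2 / sigma ** 2

phi1 = n * mu ** 2 / (2 * sigma ** 2)
mc_mean = stat.mean()                 # should be close to 1 + 2*phi1
target = 1 + 2 * phi1
```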
It follows that

    (nȲ·²/σ²) / [ (∑_{i=1}^n (Y_i − Ȳ·)² / (n−1)) / σ² ]
        = nȲ·² / [ ∑_{i=1}^n (Y_i − Ȳ·)² / (n−1) ]
        ∼ F_{1, n−1}(nµ²/(2σ²)).
If µ = 0,

    nȲ·² / [ ∑_{i=1}^n (Y_i − Ȳ·)² / (n−1) ] ∼ F_{1, n−1}.

Thus, we can test H_0 : µ = 0 by comparing

    nȲ·² / [ ∑_{i=1}^n (Y_i − Ȳ·)² / (n−1) ]

to the F_{1, n−1} distribution and rejecting H_0 for large values.
Note that

    nȲ·² / [ ∑_{i=1}^n (Y_i − Ȳ·)² / (n−1) ] = t²,

where

    t = Ȳ· / √(s²/n),

the usual t-statistic for testing H_0 : µ = 0.
Example: ANalysis Of VAriance (ANOVA):

Consider Y = Xβ + ε, where

    X = [X_1, X_2, …, X_m],   β = (β_1′, β_2′, …, β_m′)′,   ε ∼ N(0, σ²I),

with β partitioned to conform with the column blocks X_1, …, X_m.
Let P_j = P_{[X_1, …, X_j]} for j = 1, …, m. Let

    A_1 = P_1
    A_2 = P_2 − P_1
    A_3 = P_3 − P_2
      ⋮
    A_m = P_m − P_{m−1}
    A_{m+1} = I − P_m.

Note that ∑_{j=1}^{m+1} A_j = I.
Then A_j is symmetric and idempotent ∀ j = 1, …, m + 1, and

    s_1 = rank(A_1) = rank(P_1)
    s_2 = rank(A_2) = rank(P_2) − rank(P_1)
      ⋮
    s_m = rank(A_m) = rank(P_m) − rank(P_{m−1})
    s_{m+1} = rank(A_{m+1}) = rank(I) − rank(P_m) = n − rank(X).
It follows that

    (1/σ²) Y′A_jY ∼ χ²_{s_j}(φ_j), independently,   ∀ j = 1, …, m + 1,

where

    φ_j = (1/(2σ²)) β′X′A_jXβ.

Note that

    φ_{m+1} = (1/(2σ²)) β′X′(I − P_X)Xβ
            = (1/(2σ²)) β′X′(X − P_XX)β
            = (1/(2σ²)) β′X′(X − X)β = 0.
Thus,

    (1/σ²) Y′A_{m+1}Y ∼ χ²_{n−rank(X)}

and

    F_j = [Y′A_jY / s_j] / [Y′A_{m+1}Y / (n − rank(X))]
        ∼ F_{s_j, n−rank(X)}( (1/(2σ²)) β′X′A_jXβ )   ∀ j = 1, …, m.
We can assemble the ANOVA table as follows:

    Source    Sum of Squares   DF        Mean Square          Expected Mean Square    F
    A_1       Y′A_1Y           s_1       Y′A_1Y/s_1           σ² + β′X′A_1Xβ/s_1      F_1
      ⋮          ⋮              ⋮           ⋮                    ⋮                     ⋮
    A_m       Y′A_mY           s_m       Y′A_mY/s_m           σ² + β′X′A_mXβ/s_m      F_m
    A_{m+1}   Y′A_{m+1}Y       s_{m+1}   Y′A_{m+1}Y/s_{m+1}   σ²
    I         Y′IY             n

This ANOVA table contains sequential (a.k.a. type I) sums of squares. A_{m+1} corresponds to "error."
We can use F_j to test

    H_0j : (1/(2σ²)) β′X′A_jXβ = 0
      ⟺ H_0j : β′X′A_j′A_jXβ = 0
      ⟺ H_0j : A_jXβ = 0.
Now

    A_jXβ = A_j [X_1, …, X_m] (β_1′, …, β_m′)′
          = A_j ∑_{i=1}^m X_iβ_i
          = ∑_{i=1}^m A_jX_iβ_i
          = ∑_{i=1}^m (P_j − P_{j−1}) X_iβ_i   ∀ j = 1, …, m (where P_0 = 0).
Recall P_j = P_{[X_1, …, X_j]}. Thus, P_jX_i = X_i ∀ i ≤ j.

It follows that

    (P_j − P_{j−1}) X_i = 0   whenever i ≤ j − 1.

Therefore,

    A_jXβ = ∑_{i=1}^m (P_j − P_{j−1}) X_iβ_i = ∑_{i=j}^m (P_j − P_{j−1}) X_iβ_i.
For j = m, this simplifies to

    A_mXβ = (P_m − P_{m−1}) X_mβ_m
          = (I − P_{m−1}) P_mX_mβ_m
          = (I − P_{m−1}) X_mβ_m.

Now

    (I − P_{m−1}) X_mβ_m = 0
      ⟺ X_mβ_m ∈ N(I − P_{m−1})
      ⟺ X_mβ_m ∈ C(P_{m−1})
      ⟺ X_mβ_m ∈ C([X_1, …, X_{m−1}]).
Now

    X_mβ_m ∈ C([X_1, …, X_{m−1}])
      ⟺ E(Y) = Xβ = ∑_{i=1}^m X_iβ_i ∈ C([X_1, …, X_{m−1}])
      ⟺ the explanatory variables in X_m are irrelevant in the presence of X_1, …, X_{m−1}.
For the special case where X is of full column rank,

    X_mβ_m ∈ C([X_1, …, X_{m−1}])  ⟺  X_mβ_m = 0  ⟺  β_m = 0.

Thus, in this full-column-rank case, F_m tests H_0m : β_m = 0.
Explain what F_j tests for the special case where

    X_k′X_{k*} = 0   ∀ k ≠ k*.
Now suppose

    X_k′X_{k*} = 0   ∀ k ≠ k*.

Then

    P_jX_i = X_i for i ≤ j   and   P_jX_i = 0 for i > j,

because, for i > j, each block X_k′X_i (k = 1, …, j) is 0, so that

    [X_1, …, X_j]′ X_i = 0.

It follows that

    A_jXβ = ∑_{i=1}^m (P_j − P_{j−1}) X_iβ_i = P_jX_jβ_j = X_jβ_j.
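A numerical check of this orthogonal-case reduction A_jXβ = X_jβ_j, using three hand-built mutually orthogonal one-column blocks (illustrative values only):

```python
import numpy as np

def proj(M):
    # Orthogonal projection onto C(M).
    return M @ np.linalg.pinv(M)

# Three mutually orthogonal one-column design blocks.
n = 6
X1 = np.array([[1., 1, 0, 0, 0, 0]]).T
X2 = np.array([[0., 0, 1, 1, 0, 0]]).T
X3 = np.array([[0., 0, 0, 0, 1, 1]]).T
X = np.hstack([X1, X2, X3])
beta = np.array([2.0, -1.0, 0.5])            # arbitrary coefficients

P1 = proj(X1)
P2 = proj(np.hstack([X1, X2]))
A2 = P2 - P1                                  # A_j with j = 2

ortho_ok = np.allclose(X.T @ X, np.diag(np.diag(X.T @ X)))
reduce_ok = np.allclose(A2 @ (X @ beta), (X2 * beta[1]).ravel())
```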
Thus, for the orthogonal case, F_j can be used to test

    H_0j : X_jβ_j = 0   ∀ j = 1, …, m.

If X has full column rank in addition to the orthogonality condition, F_j tests

    H_0j : β_j = 0   ∀ j = 1, …, m.