Spring 2002
Solutions to Assignment 5
The first two parts of this problem were just exercises in applying the formulas to compute moments of quadratic forms. Here we make use of S-Plus to perform the calculations.
1. (a) By Result 4.6, E(Y'AY) = μ'Aμ + tr(AΣ).
> mu <- c(-1,0,-3)
> sigma <- matrix(c(.75,.00,.25,.00,1.00,.00,.25,.00,.75),3,3)
> A <- matrix(c(1.5,0.0,-0.5,0.0,1.0,0.0,-0.5,0.0,1.5),3,3)
> E <- t(mu)%*%A%*%mu+sum(diag(A%*%sigma))
> E
     [,1]
[1,]   15
(b) Var(Y'AY) = 4μ'AΣAμ + 2tr(AΣAΣ).
> Var <- 4*t(mu)%*%A%*%sigma%*%A%*%mu+2*sum(diag(A%*%sigma%*%A%*%sigma))
> Var
[,1]
[1,] 54
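As a quick cross-check of both moments (not part of the original S-Plus session), the same arithmetic can be sketched in Python with numpy:

```python
import numpy as np

# Same inputs as the S-Plus code above (the matrices there are filled
# column-wise, but both A and sigma are symmetric, so orientation is moot).
mu = np.array([-1.0, 0.0, -3.0])
sigma = np.array([[0.75, 0.00, 0.25],
                  [0.00, 1.00, 0.00],
                  [0.25, 0.00, 0.75]])
A = np.array([[1.5, 0.0, -0.5],
              [0.0, 1.0,  0.0],
              [-0.5, 0.0, 1.5]])

# (a) E(Y'AY) = mu'A mu + tr(A Sigma)
E = mu @ A @ mu + np.trace(A @ sigma)

# (b) Var(Y'AY) = 4 mu'A Sigma A mu + 2 tr(A Sigma A Sigma)
V = 4 * (mu @ A @ sigma @ A @ mu) + 2 * np.trace(A @ sigma @ A @ sigma)

print(E, V)  # 15.0 54.0
```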
(c)
We can see that A is symmetric, and Σ is symmetric and positive definite. Next we need to check if AΣ is idempotent.
> V1 <- A%*%sigma%*%A%*%sigma
> V1
     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1    0
[3,]    0    0    1
> V2 <- A%*%sigma
> V2
     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1    0
[3,]    0    0    1
(d)
It turns out that AΣAΣ = AΣ, i.e., AΣ is idempotent, so we conclude that Y'AY has a noncentral chi-square distribution with 3 degrees of freedom and noncentrality parameter δ² = μ'Aμ = 12.
Now we need to check whether (A/2)(2I) = A is idempotent (the condition for Y'AY/2 to have a chi-square distribution when the covariance matrix is 2I). It turns out AA ≠ A, so we conclude Y'AY/2 does not have a chi-square distribution.
> A%*%A
     [,1] [,2] [,3]
[1,]  2.5    0 -1.5
[2,]  0.0    1  0.0
[3,] -1.5    0  2.5
> A
     [,1] [,2] [,3]
[1,]  1.5    0 -0.5
[2,]  0.0    1  0.0
[3,] -0.5    0  1.5
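The two idempotency checks above can also be verified numerically; a small Python/numpy sketch, assumed equivalent to the S-Plus session:

```python
import numpy as np

sigma = np.array([[0.75, 0.00, 0.25],
                  [0.00, 1.00, 0.00],
                  [0.25, 0.00, 0.75]])
A = np.array([[1.5, 0.0, -0.5],
              [0.0, 1.0,  0.0],
              [-0.5, 0.0, 1.5]])

AS = A @ sigma
# A.Sigma equals the identity, hence is idempotent; A itself is not.
idempotent_AS = bool(np.allclose(AS @ AS, AS))
idempotent_A = bool(np.allclose(A @ A, A))
print(idempotent_AS, idempotent_A)  # True False
```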
2. In this case, Y ~ N(μ1, σ²I), where Y = (Y₁, ..., Yₙ)'.
Since I - P₁ is a projection matrix, it is symmetric and idempotent.
(a) We know that (I - P₁)Y = Y - Ȳ1.
Then,

(1/(n-1)) Y'(I - P₁)Y = (1/(n-1)) Y'(I - P₁)'(I - P₁)Y = (1/(n-1)) ((I - P₁)Y)'((I - P₁)Y)

Consequently,

(1/(n-1)) ((I - P₁)Y)'((I - P₁)Y) = (1/(n-1)) (Y - Ȳ1)'(Y - Ȳ1) = (1/(n-1)) Σᵢ₌₁ⁿ (Yᵢ - Ȳ)² = S²
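This algebraic identity is easy to sanity-check numerically; a Python/numpy sketch on a random vector (hypothetical data, for illustration only):

```python
import numpy as np

# Check that Y'(I - P1)Y / (n-1) equals the usual sample variance S^2.
rng = np.random.default_rng(0)
n = 10
Y = rng.normal(size=n)
P1 = np.full((n, n), 1.0 / n)   # projection onto the span of the ones vector
Q = np.eye(n) - P1              # I - P1
lhs = Y @ Q @ Y / (n - 1)
S2 = Y.var(ddof=1)              # sample variance with divisor n-1
print(np.isclose(lhs, S2))  # True
```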
By Result 4.6, E(Y'AY) = μ'Aμ + tr(AΣ), so

E(S²) = (1/(n-1)) [(μ1)'(I - P₁)(μ1) + tr((I - P₁)σ²I)]
      = (1/(n-1)) [μ²(1'1 - 1'1(1'1)⁻¹1'1) + σ² tr(I - P₁)]
      = (1/(n-1)) [0 + σ² rank(I - P₁)]
      = (1/(n-1)) [σ²(n - 1)]
      = σ²
Similarly, Var(S²) = Var(Y'AY) = 4μ'AΣAμ + 2tr(AΣAΣ), with A = (1/(n-1))(I - P₁) and Σ = σ²I:

Var(S²) = 4(μ1)' (1/(n-1))(I - P₁) σ²I (1/(n-1))(I - P₁) (μ1) + 2tr((1/(n-1))(I - P₁) σ²I (1/(n-1))(I - P₁) σ²I)
        = (4σ²μ²/(n-1)²) 1'(I - P₁)1 + (2σ⁴/(n-1)²) tr(I - P₁)
        = 2σ⁴/(n-1),

since 1'(I - P₁)1 = 0 and tr(I - P₁) = n - 1.
(b) Y ~ N(μ1, σ²I). Let A = (1/σ²)(I - P₁); then AΣ = I - P₁ is idempotent and symmetric from part (a), and σ²I is symmetric and positive definite. Hence, the conditions of Result 4.7 are satisfied and we have Y'AY = (n-1)S²/σ² ~ χ²ₙ₋₁(δ²), where the degrees of freedom are rank(I - P₁) = n - 1. Here, the non-centrality parameter is δ² = μ'Aμ = (1/σ²)(μ1)'(I - P₁)(μ1) = 0, so (n-1)S²/σ² ~ χ²ₙ₋₁. Then, from the moments of a central chi-square random variable, we have

n - 1 = E((n-1)S²/σ²) = ((n-1)/σ²) E(S²)

and it follows that E(S²) = σ².
From the variance of a central chi-square random variable, we have

2(n - 1) = Var((n-1)S²/σ²) = ((n-1)²/σ⁴) Var(S²)

and it follows that Var(S²) = 2σ⁴/(n-1). These match the results from part (a).
(c) Note that nȲ² = (1/n)(Σᵢ₌₁ⁿ Yᵢ)² = Y'P₁Y. To apply Result 4.8, check that

A₁ΣA₂ = P₁ (σ²I) (1/(n-1))(I - P₁) = (σ²/(n-1)) P₁(I - P₁) = 0.

Consequently, since P₁ and I - P₁ are both symmetric, the conditions of Result 4.8 are satisfied and we have established that nȲ² and S² are independent.
(d) From the results above we can write

nȲ² = Y'P₁Y and (n-1)S² = Y'(I - P₁)Y

Since I = P₁ + (I - P₁), n = rank(P₁) + rank(I - P₁), and P₁ and I - P₁ are symmetric, by Cochran's Theorem

(n-1)S²/σ² ~ χ²ₙ₋₁ and (1/σ²)Y'P₁Y ~ χ²₁(δ²), where δ² = (1/σ²)(μ1)'P₁(μ1) = nμ²/σ²,

and these random quantities are distributed independently. It follows that

F = [(1/σ²)Y'P₁Y]/1 ÷ [(1/σ²)Y'(I - P₁)Y]/(n-1) = nȲ²/S² ~ F(1, n-1)(nμ²/σ²).
(e) δ² = nμ²/σ². Since σ² > 0 and n > 0, δ² = 0 if and only if μ = 0. Consequently, we can test H₀: μ = 0 against Hₐ: μ ≠ 0, and H₀ is rejected if F ≥ F(α; 1, n-1).
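The identity F = nȲ²/S² used above can be checked on simulated values; a Python/numpy sketch (hypothetical data, for illustration only):

```python
import numpy as np

# Compare [Y'P1Y / 1] / [Y'(I-P1)Y / (n-1)] with n*Ybar^2 / S^2.
rng = np.random.default_rng(1)
n = 10
Y = rng.normal(loc=2.0, size=n)
P1 = np.full((n, n), 1.0 / n)
num = Y @ P1 @ Y                            # = n * Ybar^2
den = Y @ (np.eye(n) - P1) @ Y / (n - 1)    # = S^2
F1 = num / den
F2 = n * Y.mean() ** 2 / Y.var(ddof=1)
print(np.isclose(F1, F2))  # True
```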
3.(a) To show H₀ is testable, we need to show (1) that the linear combinations of parameters restricted by the null hypothesis (Cβ) are estimable, and (2) that the rank of C is equal to the number of rows in C. To show Cβ is estimable you can use the definition of estimability or Result 3.8 or 3.9 from the course notes. For example, Cβ is estimable if and only if for any d with Xd = 0 we have Cd = 0. In this case, the only solution to Xd = 0 is d' = k[1 -1 -1 -1], where k is a constant, and we have to check whether C[1 -1 -1 -1]' = 0.
The results are summarized below, with β = (μ, α₁, α₂, α₃)':

        H₀                                      C                                  Cβ estimable (Cd = 0)?  rank(C) = # rows?  testable?
(i)     α₁ = α₂                                 [0 1 -1 0]                         Yes                     Yes                Yes
(ii)    α₁ - 2α₂ + 3α₃ = 0                      [0 1 -2 3]                         No                      Yes                No
(iii)   α₃ = 0                                  [0 0 0 1]                          No                      Yes                No
(iv)    μ = 0                                   [1 0 0 0]                          No                      Yes                No
(v)     α₁ = α₃ and α₁ - 2α₂ + α₃ = 0           [0 1 0 -1; 0 1 -2 1]               Yes                     Yes                Yes
(vi)    α₁ = α₂ = α₃ and 2α₁ - α₂ - α₃ = 0      [0 1 -1 0; 0 1 0 -1; 0 2 -1 -1]    Yes                     No                 No

In case (vi), the last row of C is the sum of the first two rows of C, so rank(C) is smaller than the number of rows.
(b) First note that the null hypothesis

H₀: [0 1 0 -1; 0 1 -2 1] (μ, α₁, α₂, α₃)' = 0

is testable.
(i) Note that Var(Y) = σ²I is positive definite, I - P_X is symmetric, (1/σ²)(I - P_X)σ²I = I - P_X is idempotent, and δ² = (1/σ²)(Xβ)'(I - P_X)Xβ = 0. Consequently, the conditions of Result 4.7 are satisfied and SSE/σ² has a central chi-square distribution with 5 d.f. (rank(I - P_X) = rank(I) - rank(X) = 8 - 3 = 5).
(ii) Define SSH₀ = (Cb)'[C(X'X)⁻C']⁻¹Cb, where C = [0 1 0 -1; 0 1 -2 1].
Note that Cov(Cb, (I - P_X)Y) = Cov(C(X'X)⁻X'Y, (I - P_X)Y) = C(X'X)⁻X' σ²I (I - P_X)' = σ² C(X'X)⁻X'(I - P_X)' = 0, since X'(I - P_X)' = 0. Furthermore, the joint distribution of

[ Cb ; (I - P_X)Y ] = [ C(X'X)⁻X' ; (I - P_X) ] Y

is multivariate normal (by Result 4.1) because it is a linear function of a multivariate normal random vector Y. Consequently, C(X'X)⁻X'Y and (I - P_X)Y are independent random vectors. Since SSE is a function only of (I - P_X)Y while SSH₀ is a function only of Cb, it follows that they are independently distributed.
Or we can use Result 4.8 from the notes. First define

SSH₀ = (Cb)'[C(X'X)⁻C']⁻¹Cb = Y'X(X'X)⁻C'[C(X'X)⁻C']⁻¹C(X'X)⁻X'Y = Y'A₀Y.

The condition of Result 4.8 is satisfied because

(I - P_X) σ²I A₀ = σ²(I - P_X)X(X'X)⁻C'[C(X'X)⁻C']⁻¹C(X'X)⁻X' = 0,

since (I - P_X)X = 0. Therefore, SSE and SSH₀ are independently distributed.
(iii) We know that Cb ~ N(Cβ, σ²C(X'X)⁻C') and (1/σ²)[C(X'X)⁻C']⁻¹ Var(Cb) = I is symmetric and idempotent. It follows from Result 4.7 that SSH₀/σ² ~ χ²₂(δ²), where 2 = rank(C) provides the degrees of freedom. The non-centrality parameter is

δ² = (1/σ²) β'C'[C(X'X)⁻C']⁻¹Cβ.
(iv) It follows from the definition of the F-distribution that

F = (SSH₀/σ²)/2 ÷ (SSE/σ²)/5 = (SSH₀/2)/(SSE/5) ~ F(2,5)(δ²),

where δ² = (1/σ²) β'C'[C(X'X)⁻C']⁻¹Cβ = (1/σ²)(1.5α₁² + 2α₂² + 1.5α₃² - 2α₁α₂ - α₁α₃ - 2α₂α₃). This was obtained by using S-Plus to evaluate C'[C(X'X)⁻C']⁻¹C.
> X<-matrix(c(rep(1,10),rep(0,8),rep(1,4),rep(0,8),1,1),ncol=4,byrow=F)
> XTX<-t(X)%*%X
> c<-matrix(c(0,1,0,-1,0,1,-2,1),ncol=4,byrow=T)
> t(c)%*%solve(c%*%ginverse(XTX)%*%t(c))%*%c
     [,1] [,2] [,3] [,4]
[1,]    0  0.0    0  0.0
[2,]    0  1.5   -1 -0.5
[3,]    0 -1.0    2 -1.0
[4,]    0 -0.5   -1  1.5
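The same matrix can be reproduced outside S-Plus. Below is a Python/numpy sketch, reading the design off the matrix() call above (an intercept plus three group indicators with group sizes 2, 4 and 2); since Cβ is estimable, any generalized inverse of X'X, here the Moore-Penrose pseudoinverse, gives the same answer:

```python
import numpy as np

# Design matrix implied by the S-Plus call: 8 observations,
# intercept + indicators for three groups of sizes 2, 4 and 2.
X = np.column_stack([np.ones(8),
                     np.r_[np.ones(2), np.zeros(6)],
                     np.r_[np.zeros(2), np.ones(4), np.zeros(2)],
                     np.r_[np.zeros(6), np.ones(2)]])
C = np.array([[0., 1., 0., -1.],
              [0., 1., -2., 1.]])
G = np.linalg.pinv(X.T @ X)   # one choice of generalized inverse of X'X
M = C.T @ np.linalg.inv(C @ G @ C.T) @ C
print(np.round(M, 4))
```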
(v) Obviously, when the null hypothesis is true we have Cβ = 0, and δ² = (1/σ²)(Cβ)'[C(X'X)⁻C']⁻¹Cβ = 0. Consequently, the test statistic has a central F-distribution.
(c) The null hypothesis is H₀: α₁ = α₂ = α₃. It was incorrectly printed on the first version of the assignment that was posted on the course web page. Here, C̃ = [0 1 0 -1; 0 0 1 -1], whose rows span the same space as the rows of C, i.e., any linear combination of the rows of C̃ is a linear combination of the rows of C:

a₁(0, 1, 0, -1)' + a₂(0, 0, 1, -1)' = (0, a₁, a₂, -a₁-a₂)' = (a₁ + a₂/2)(0, 1, 0, -1)' - (a₂/2)(0, 1, -2, 1)'.

Therefore the null hypothesis is the same in parts (b) and (c), and the F-test in part (c) is the same as the F-test in (b).
4. (a) Define C = [0 1 -1 0; 0 0 0 1]. Then H₀: C(μ, α₁, α₂, β)' = 0 ⇔ α₁ - α₂ = β = 0.
(b) From the previous assignment we know that Cβ is estimable if Cd = 0, where d' = w[1 -1 -1 0]. Here Cd = 0, so Cβ is estimable. Also, it is easy to see that rank(C) = number of rows in C = 2. Therefore, the null hypothesis is testable.
(c) Use SSH₀ = (Cb)'[C(X'X)⁻C']⁻¹Cb. It follows from Result 4.7 in the notes that (1/σ²)SSH₀ ~ χ²₂(δ²), where δ² = (1/σ²)(Cβ)'[C(X'X)⁻C']⁻¹Cβ. Under the null hypothesis Cβ = 0, so δ² = 0 and (1/σ²)SSH₀ ~ χ²₂. See the solution to question 1.(b) for details.
(d) Check the conditions of Result 4.7 in the notes. Note that (1/σ²)(I - P_X)σ²I = (I - P_X) is idempotent and symmetric, δ² = (1/σ²)(Xβ)'(I - P_X)Xβ = 0, and rank(I - P_X) = rank(I) - rank(P_X) = 10 - 3 = 7. Therefore, (1/σ²)SSE ~ χ²₇.
(e) You can check the conditions of Result 4.8 in the notes. See the solution to question 1.(b).(ii).
(f) From the results in parts (c), (d), and (e) we have that

F = (SSH₀/σ²)/2 ÷ (SSE/σ²)/7 = (SSH₀/2)/(SSE/7) ~ F₂,₇(δ²),

where δ² = (1/σ²)[(5/2)(α₁ - α₂)² + 500β²]. Therefore, δ² = 0 if and only if α₁ = α₂ and β = 0 (i.e., H₀ is true). The test is performed by rejecting the null hypothesis if F = (SSH₀/2)/(SSE/7) > F(2,7),α.
(g)
> y <- c(20,24,27,33,38,25,29,32,37,41)
> x <- matrix(c(rep(1,15),rep(0,10),rep(1,5),rep(c(-10,-5,0,5,10),2)),10,4)
> C <- matrix(c(0,0,1,0,-1,0,0,1),2,4)
> SSH0 <- t(C%*%ginverse(t(x)%*%x)%*%t(x)%*%y)%*%solve(C%*%ginverse(t(x)%*%x)%*%t(C))%*%C%*%ginverse(t(x)%*%x)%*%t(x)%*%y
> SSH0
       [,1]
[1,] 409.65
> SSE <- crossprod(y-x%*%ginverse(t(x)%*%x)%*%t(x)%*%y,y-x%*%ginverse(t(x)%*%x)%*%t(x)%*%y)
> SSE
     [,1]
[1,] 4.75
> V <- (SSH0/2)/(SSE/7)
> V
[,1]
[1,] 301.8474
> pvalue <- 1-pf(V,2,7)
> pvalue
[,1]
[1,] 1.612347e-07
Since the p-value is less than 0.0001, we reject the null hypothesis at the 0.0001 significance level. The mean yields do not appear to be the same for all combinations of catalysts and temperatures used in the experiment.
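The numbers SSH₀ = 409.65, SSE = 4.75 and F = 301.8474 can be reproduced with a Python/numpy sketch of the same computation (Cb and the residuals are invariant to the choice of generalized inverse because Cβ is estimable):

```python
import numpy as np

y = np.array([20., 24, 27, 33, 38, 25, 29, 32, 37, 41])
temp = np.array([-10., -5, 0, 5, 10] * 2)
# Intercept, indicators for catalysts A and B, and temperature deviations.
X = np.column_stack([np.ones(10),
                     np.r_[np.ones(5), np.zeros(5)],
                     np.r_[np.zeros(5), np.ones(5)],
                     temp])
C = np.array([[0., 1., -1., 0.],
              [0., 0., 0., 1.]])
G = np.linalg.pinv(X.T @ X)      # a generalized inverse of X'X
b = G @ X.T @ y                  # one least squares solution
SSH0 = (C @ b) @ np.linalg.inv(C @ G @ C.T) @ (C @ b)
resid = y - X @ b
SSE = resid @ resid
F = (SSH0 / 2) / (SSE / 7)
print(round(float(SSH0), 2), round(float(SSE), 2), round(float(F), 4))
# 409.65 4.75 301.8474
```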
5.(a) Let

R = [ 1  0  0          Q = [ 1  .5   .5  0
      0  1  0    and         0  .5  -.5  0
      0 -1  0                0   0    0  1 ]
      0  0  1 ]

Then W = XR and X = WQ. Therefore, these two models are reparameterizations of each other.
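That W = XR and X = WQ can be confirmed numerically; a Python/numpy sketch using the two design matrices from the S-Plus code in problems 4 and 5:

```python
import numpy as np

temp = np.array([-10., -5, 0, 5, 10] * 2)
# X: intercept, catalyst A and B indicators, temperature deviations.
X = np.column_stack([np.ones(10),
                     np.r_[np.ones(5), np.zeros(5)],
                     np.r_[np.zeros(5), np.ones(5)],
                     temp])
# W: intercept, +1/-1 catalyst code, temperature deviations.
W = np.column_stack([np.ones(10),
                     np.r_[np.ones(5), -np.ones(5)],
                     temp])
R = np.array([[1., 0., 0.],
              [0., 1., 0.],
              [0., -1., 0.],
              [0., 0., 1.]])
Q = np.array([[1., 0.5, 0.5, 0.],
              [0., 0.5, -0.5, 0.],
              [0., 0., 0., 1.]])
print(np.allclose(X @ R, W), np.allclose(W @ Q, X))  # True True
```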
(b) Since these parameters are estimable, each has an unambiguous interpretation:
γ₁ is the mean yield of the process, averaging across the results for catalyst A and catalyst B, when the process is run at 100°C.
γ₂ is half the difference between the mean yield of the process using catalyst A and the mean yield of the process using catalyst B when both processes are run at the same temperature.
γ₃ represents the change in the mean yield of the process when the temperature is increased 1°C and the same catalyst (either catalyst A or catalyst B) is used.
(c) Calculations can be done in different ways. S-Plus code is shown for each calculation.
(i) γ̂ = (W'W)⁻¹W'Y.
> w <- matrix(c(rep(1,15),rep(-1,5),-10,-5,0,5,10,-10,-5,0,5,10),ncol=3,byrow=F)
> y <- c(20,24,27,33,38,25,29,32,37,41)
> gamma.hat <- solve(t(w)%*%w)%*%t(w)%*%y
> gamma.hat
      [,1]
[1,] 30.60
[2,] -2.20
[3,]  0.85
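A Python/numpy cross-check of γ̂ (the columns of W are mutually orthogonal here, so W'W is diagonal and solve() is straightforward):

```python
import numpy as np

y = np.array([20., 24, 27, 33, 38, 25, 29, 32, 37, 41])
temp = np.array([-10., -5, 0, 5, 10] * 2)
W = np.column_stack([np.ones(10),
                     np.r_[np.ones(5), -np.ones(5)],
                     temp])
# Least squares: gamma.hat = (W'W)^{-1} W'y
gamma_hat = np.linalg.solve(W.T @ W, W.T @ y)
print(gamma_hat)  # approximately (30.60, -2.20, 0.85)
```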
(ii) ŷ = Wγ̂.
> yhat <- w%*%gamma.hat
> yhat
       [,1]
 [1,] 19.90
 [2,] 24.15
 [3,] 28.40
 [4,] 32.65
 [5,] 36.90
 [6,] 24.30
 [7,] 28.55
 [8,] 32.80
 [9,] 37.05
[10,] 41.30
(iii) e = y - ŷ.
> e <- y-yhat
> e
[,1]
[1,] 0.10
[2,] -0.15
[3,] -1.40
[4,] 0.35
[5,] 1.10
[6,] 0.70
[7,] 0.45
[8,] -0.80
[9,] -0.05
[10,] -0.30
(iv)-(viii)

               SS       d.f.   MS
R(γ₁)         9363.6     1    9363.6
R(γ₂|γ₁)        48.4     1      48.4
R(γ₃|γ₁,γ₂)    361.25    1     361.25
SSE              4.75    7       0.6786
> # function to compute projection matrices
> ject <- function(w){w%*%ginverse(t(w)%*%w)%*%t(w)}
> # Identity matrix
> one <- diag(rep(1,10))
> # w1 and w2 (defined here for completeness) hold the first one
> # and the first two columns of w, respectively
> w1 <- w[,1,drop=F]
> w2 <- w[,1:2]
> SSE <- t(y)%*%(one-ject(w))%*%y
> SSE
     [,1]
[1,] 4.75
> r1 <- t(y)%*%ject(w1)%*%y
> r1
       [,1]
[1,] 9363.6
> r2 <- t(y)%*%(ject(w2)-ject(w1))%*%y
> r2
     [,1]
[1,] 48.4
> r3 <- t(y)%*%(ject(w)-ject(w2))%*%y
> r3
       [,1]
[1,] 361.25

The F statistics and p-values that complete the table in (iv)-(viii), each computed as MS/MSE with MSE = 0.6786, are: F = 13798 (p = 0.00) for R(γ₁), F = 71.32 (p = 0.00) for R(γ₂|γ₁), and F = 532.37 (p = 0.00) for R(γ₃|γ₁,γ₂).
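The whole decomposition can be cross-checked in Python/numpy, mirroring the ject projection function above: the four sums of squares should add to Y'Y.

```python
import numpy as np

y = np.array([20., 24, 27, 33, 38, 25, 29, 32, 37, 41])
temp = np.array([-10., -5, 0, 5, 10] * 2)
W = np.column_stack([np.ones(10),
                     np.r_[np.ones(5), -np.ones(5)],
                     temp])

def proj(M):
    # projection matrix onto the column space of M (pinv as g-inverse)
    return M @ np.linalg.pinv(M.T @ M) @ M.T

P1, P2, P = proj(W[:, :1]), proj(W[:, :2]), proj(W)
r1 = y @ P1 @ y                      # R(gamma1)
r2 = y @ (P2 - P1) @ y               # R(gamma2 | gamma1)
r3 = y @ (P - P2) @ y                # R(gamma3 | gamma1, gamma2)
sse = y @ (np.eye(10) - P) @ y       # SSE
print(round(r1, 2), round(r2, 2), round(r3, 2), round(sse, 2))
# 9363.6 48.4 361.25 4.75
```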
(d) Check the conditions of Cochran's Theorem.
1) (I - P_w) + P_w1 + (P_w2 - P_w1) + (P_w - P_w2) = I
2) rank(I - P_w) + rank(P_w1) + rank(P_w2 - P_w1) + rank(P_w - P_w2) = (10 - 3) + 1 + (2 - 1) + (3 - 2) = 10 = n
3) I - P_w, P_w1, P_w2 - P_w1, and P_w - P_w2 are all symmetric.
Therefore, by Cochran's Theorem, the sums of squares in (vi)-(vii) are independently distributed as chi-square random variables multiplied by σ², with d.f. 7, 1, 1, 1, respectively.
(e)

δ² = (1/σ²) γ'W'(P_w - P_w2)Wγ = (500/σ²) γ₃²

Therefore, δ² = 0 if and only if γ₃ = 0. You can use S-Plus to evaluate W'(P_w - P_w2)W:
> round(t(w)%*%(ject(w)-ject(w2))%*%w,4)
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0    0    0
[3,]    0    0  500
(f) δ² = (1/σ²) γ'W'(P_w2 - P_w1)Wγ = (10/σ²) γ₂². Therefore, δ² = 0 if and only if γ₂ = 0.
(g)(i) Use S-Plus to evaluate the estimated covariance matrix of γ̂, namely σ̂²(W'W)⁻¹:
> a <- SSE/7
> cov <- a*solve(t(w)%*%w)
> cov
           [,1]       [,2]        [,3]
[1,] 0.06785714 0.00000000 0.000000000
[2,] 0.00000000 0.06785714 0.000000000
[3,] 0.00000000 0.00000000 0.001357143
Then, a 95% confidence interval for γ₂ is: γ̂₂ ± t.975,7 √0.06785714 = [-2.816, -1.584].
(ii) The least squares estimator of the mean yield is Ŷ = c'γ̂, where c' = (1, 1, 20). A 95% confidence interval is Ŷ ± t.975,7 √(c' cov c) = [43.45, 47.35].
(h) A 95% CI for the error variance is: (SSE/χ²₇,.975, SSE/χ²₇,.025) = (0.297, 2.811)
> SSE/(qchisq(.975,7))
[1] 0.2966384
> SSE/(qchisq(.025,7))
[1] 2.810868
6. (a) By matching the mean of the reparameterized model, Wγ, with Xβ = X(μ, α₁, α₂, β)', we can get

γ₁ = μ + (α₁ + α₂)/2 + 100β,
γ₂ = (α₁ - α₂)/2,
γ₃ = β.

Therefore,

(γ₁, γ₂, γ₃)' = [ 1  1/2   1/2  100
                  0  1/2  -1/2    0
                  0    0     0    1 ] (μ, α₁, α₂, β)'.

Now consider how the four parameters can be expressed in terms of the three parameters. Then,

(μ, α₁, α₂, β)' = [ 1   0  -100
                    0   1     0
                    0  -1     0
                    0   0     1 ] (γ₁, γ₂, γ₃)'.

Here we see that α₁ = γ₂ and α₂ = -γ₂. Consequently, the restriction that is being imposed is α₁ + α₂ = 0. Note that α₁ + α₂ is not an estimable quantity.
(b) Define α₁ + α₂ = [0 1 1 0](μ, α₁, α₂, β)' = c'β. In this case, c'β is estimable if and only if c'd = 0, where d' = w[1 -1 -1 0] for w ≠ 0. However, c'd = -2w ≠ 0. Therefore α₁ + α₂ is not estimable.
7. This is not a reparameterization of the models in problems 4 and 5. Denote this model matrix as X, and denote the model matrix in problem 4 as W. Each column of X can be written as a linear combination of the columns of W (note they share the same first 3 columns, and the 4th column of X is the sum of the last 2 columns of W). However, not all columns of W can be written as linear combinations of the columns of X because rank(W) is larger than rank(X) (check the 4th and 5th columns of W). Hence, the columns of X do not span the same model space as the columns of W, and these models are not reparameterizations of each other.
8. H₀: Cβ = (0, 1, -1, 0)(μ, α₁, α₂, β)' = 0.
Hₐ: Cβ = 0.5, and the power of the test is:

power = Pr(F₁,₁₀ₙ₋₃(δ²) > F₁,₁₀ₙ₋₃,.₉₅)

where δ² = (1/σ²) 0.5 [C(X'X)⁻C']⁻¹ 0.5.
In this case,

X'X = [ 10n    5n    5n   500n
         5n    5n     0   250n
         5n     0    5n   250n
        500n  250n  250n  Σᵢⱼx²ᵢⱼ ],

C(X'X)⁻C' = 0.4/n and δ² = 0.5 (n/0.4) 0.5 = 5n/8 (with σ² = 1).
Choose n such that

0.90 = power = Pr(F₁,₁₀ₙ₋₃(5n/8) > F₁,₁₀ₙ₋₃,.₉₅),

which gives n = 18.
> n<-1
> powerfun<-function(n){
1-pf(qf(.95,1,10*n-3),1,10*n-3,5*n/8)
}
> while (powerfun(n)<.90){
n<-n+1
powerfun(n)
}
> n
[1] 18
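The power search can be reproduced in Python, assuming scipy is available for the central and noncentral F distributions:

```python
from scipy.stats import f, ncf

def power(n):
    # Power of the size-.05 F test with 1 and 10n-3 d.f. and ncp 5n/8.
    df2 = 10 * n - 3
    crit = f.ppf(0.95, 1, df2)
    return 1 - ncf.cdf(crit, 1, df2, 5 * n / 8)

n = 1
while power(n) < 0.90:
    n += 1
print(n)  # 18
```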