STAT 510 Homework 8 Solutions Spring 2016

1. [40pts]
(a) [12pts] A linear mixed-effects model for the overall quality score is
\[ y_{ijk} = \mu + \alpha_i + u_{ij} + \epsilon_{ijk}, \]
where
• $\alpha_i$ is the fixed effect corresponding to temperature level $i = 1, 2, 3$,
• $u_{ij}$ is the random effect corresponding to cooler $j = 1, 2, 3, 4$ at temperature level $i$,
• $\epsilon_{ijk}$ is the random error for beef cut $k = 1, 2$ in cooler $j$ at temperature level $i$, and
• $u_{ij} \overset{iid}{\sim} N(0, \sigma_u^2)$ independent of $\epsilon_{ijk} \overset{iid}{\sim} N(0, \sigma^2)$.
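In matrix form, this model is
\[ y = X\beta + Zu + \epsilon, \]
where
• $y = (y_{111}, y_{112}, y_{121}, \ldots, y_{142}, y_{211}, \ldots, y_{342})'$,
• $X = (1_{24\times 1}, \; I_{3\times 3} \otimes 1_{8\times 1})$,
• $\beta = (\mu, \alpha_1, \alpha_2, \alpha_3)'$,
• $Z = I_{12\times 12} \otimes 1_{2\times 1}$,
• $u = (u_{11}, u_{12}, \ldots, u_{34})'$,
• $\epsilon = (\epsilon_{111}, \epsilon_{112}, \epsilon_{121}, \ldots, \epsilon_{142}, \epsilon_{211}, \ldots, \epsilon_{342})'$, and
• \[ \begin{pmatrix} u \\ \epsilon \end{pmatrix} \sim N\left( \begin{pmatrix} 0_{12\times 1} \\ 0_{24\times 1} \end{pmatrix}, \begin{pmatrix} \sigma_u^2 I_{12\times 12} & 0_{12\times 24} \\ 0_{24\times 12} & \sigma^2 I_{24\times 24} \end{pmatrix} \right). \]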
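As a quick check of the matrix form above, the design matrices can be built with Kronecker products in R. This is only a minimal sketch: the variance components `sigma_u2` and `sigma_e2` are arbitrary illustrative values, not estimates from the homework data.

```r
# Design matrices for y = X beta + Z u + epsilon
# (3 temperatures, 4 coolers per temperature, 2 cuts per cooler)
one <- function(n) matrix(1, n, 1)                 # helper: n x 1 column of ones

X <- cbind(one(24), kronecker(diag(3), one(8)))    # 24 x 4: intercept + temperature indicators
Z <- kronecker(diag(12), one(2))                   # 24 x 12: one indicator column per cooler

# Hypothetical variance components, chosen only for illustration
sigma_u2 <- 2
sigma_e2 <- 1

# Var(y) = sigma_u^2 Z Z' + sigma^2 I
V <- sigma_u2 * Z %*% t(Z) + sigma_e2 * diag(24)

dim(X)
dim(Z)
V[1:4, 1:4]   # cuts from the same cooler share covariance sigma_u2
```

The block-diagonal pattern of `V` shows that two cuts are correlated only when they come from the same cooler.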
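(b) [12pts]
\[
\begin{array}{lllll}
\text{Source} & \text{DF} & \text{Sums of Squares} & \text{Mean Squares} & \text{Expected Mean Squares} \\ \hline
\text{temperature} & 3-1=2 & \sum_{i=1}^{3}\sum_{j=1}^{4}\sum_{k=1}^{2} (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 & \frac{8}{2}\sum_{i=1}^{3} (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 & \sigma^2 + 2\sigma_u^2 + 4\sum_{i=1}^{3} (\alpha_i - \bar{\alpha}_{\cdot})^2 \\
\text{cooler(temp)} & (4-1)(3)=9 & \sum_{i=1}^{3}\sum_{j=1}^{4}\sum_{k=1}^{2} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot})^2 & \frac{2}{9}\sum_{i=1}^{3}\sum_{j=1}^{4} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot})^2 & \sigma^2 + 2\sigma_u^2 \\
\text{cut(cooler,temp)} & (2-1)(3)(4)=12 & \sum_{i=1}^{3}\sum_{j=1}^{4}\sum_{k=1}^{2} (y_{ijk} - \bar{y}_{ij\cdot})^2 & \frac{1}{12}\sum_{i=1}^{3}\sum_{j=1}^{4}\sum_{k=1}^{2} (y_{ijk} - \bar{y}_{ij\cdot})^2 & \sigma^2 \\
\text{c. total} & 24-1=23 & \sum_{i=1}^{3}\sum_{j=1}^{4}\sum_{k=1}^{2} (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2 & &
\end{array}
\]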
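The sums of squares in the table can be checked numerically against R's sequential ANOVA. The sketch below uses simulated responses (the seed, mean, and standard deviation are arbitrary assumptions, not the homework data) and labels the coolers 1 to 12 so that nesting within temperature is implicit.

```r
set.seed(1)                                  # arbitrary seed; data are simulated
temp   <- factor(rep(1:3, each = 8))         # temperature level i
cooler <- factor(rep(1:12, each = 2))        # coolers 1..12, four per temperature
y      <- rnorm(24, mean = 50, sd = 5)       # hypothetical quality scores

ybar    <- mean(y)                           # grand mean
ybar_i  <- ave(y, temp)                      # temperature means, repeated per observation
ybar_ij <- ave(y, cooler)                    # cooler means, repeated per observation

ss_temp   <- sum((ybar_i - ybar)^2)          # temperature
ss_cooler <- sum((ybar_ij - ybar_i)^2)       # cooler(temp)
ss_cut    <- sum((y - ybar_ij)^2)            # cut(cooler,temp)
ss_total  <- sum((y - ybar)^2)               # corrected total

# Sequential (Type I) ANOVA gives the same decomposition
tab <- anova(lm(y ~ temp + cooler))
tab
c(ss_temp, ss_cooler, ss_cut, ss_total)
```

Because `cooler` is nested in `temp`, the `cooler` row of `tab` should carry 9 degrees of freedom rather than 11.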
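(c) [10pts] A test of $H_0 : \alpha_1 - \alpha_2 = 0$ can be based on
\[
t = \frac{\bar{y}_{1\cdot\cdot} - \bar{y}_{2\cdot\cdot} - 0}{\sqrt{\frac{2 MS_{cooler(temp)}}{4\cdot 2}}}
= \frac{\bar{y}_{1\cdot\cdot} - \bar{y}_{2\cdot\cdot}}{\sqrt{\frac{1}{4}\cdot\frac{2}{9}\sum_{i=1}^{3}\sum_{j=1}^{4} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot})^2}}
= \frac{\bar{y}_{1\cdot\cdot} - \bar{y}_{2\cdot\cdot}}{\sqrt{\frac{1}{18}\sum_{i=1}^{3}\sum_{j=1}^{4} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot})^2}}.
\]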
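The numerator should be obvious, but why use $\frac{2 MS_{cooler(temp)}}{4\cdot 2}$ in the denominator? Notice that since the $u_{ij}$ and $\epsilon_{ijk}$ are all independent,
\begin{align*}
Var(\bar{y}_{1\cdot\cdot} - \bar{y}_{2\cdot\cdot})
&= Var(\bar{y}_{1\cdot\cdot}) + Var(\bar{y}_{2\cdot\cdot}) - 2\,Cov(\bar{y}_{1\cdot\cdot}, \bar{y}_{2\cdot\cdot}) \\
&= Var(\mu + \alpha_1 + \bar{u}_{1\cdot} + \bar{\epsilon}_{1\cdot\cdot}) + Var(\mu + \alpha_2 + \bar{u}_{2\cdot} + \bar{\epsilon}_{2\cdot\cdot}) \\
&\quad - 2\,Cov(\mu + \alpha_1 + \bar{u}_{1\cdot} + \bar{\epsilon}_{1\cdot\cdot}, \; \mu + \alpha_2 + \bar{u}_{2\cdot} + \bar{\epsilon}_{2\cdot\cdot}) \\
&= Var(\bar{u}_{1\cdot} + \bar{\epsilon}_{1\cdot\cdot}) + Var(\bar{u}_{2\cdot} + \bar{\epsilon}_{2\cdot\cdot}) - 2\,Cov(\bar{u}_{1\cdot} + \bar{\epsilon}_{1\cdot\cdot}, \; \bar{u}_{2\cdot} + \bar{\epsilon}_{2\cdot\cdot}) \\
&= Var(\bar{u}_{1\cdot}) + Var(\bar{\epsilon}_{1\cdot\cdot}) + Var(\bar{u}_{2\cdot}) + Var(\bar{\epsilon}_{2\cdot\cdot}) \\
&= \frac{\sigma_u^2}{4} + \frac{\sigma^2}{2\cdot 4} + \frac{\sigma_u^2}{4} + \frac{\sigma^2}{2\cdot 4} \\
&= \frac{2(\sigma^2 + 2\sigma_u^2)}{4\cdot 2} \\
&= \frac{2\,E(MS_{cooler(temp)})}{4\cdot 2}.
\end{align*}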
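The variance derived above can also be checked directly from the matrix form, since the difference of the two temperature means is a linear combination $c'y$ with variance $c' \, Var(y) \, c$. The sketch below assumes illustrative variance components (`sigma_u2`, `sigma_e2`); they are not estimates from the data.

```r
# Hypothetical variance components, for illustration only
sigma_u2 <- 2
sigma_e2 <- 1

Z <- kronecker(diag(12), matrix(1, 2, 1))            # 24 x 12 cooler indicators
V <- sigma_u2 * Z %*% t(Z) + sigma_e2 * diag(24)     # Var(y)

# Contrast vector: mean of the 8 observations at temperature 1
# minus the mean of the 8 observations at temperature 2
cvec <- c(rep(1 / 8, 8), rep(-1 / 8, 8), rep(0, 8))

v_direct  <- drop(t(cvec) %*% V %*% cvec)            # c' V c
v_formula <- 2 * (sigma_e2 + 2 * sigma_u2) / (4 * 2) # closed form from the derivation

c(direct = v_direct, formula = v_formula)
```

The two values should agree for any choice of nonnegative variance components.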
(d) [1pt] The degrees of freedom are 9, since the denominator is based on $MS_{cooler(temp)}$.
(e) [5pts] The noncentrality parameter is
\[
\frac{\alpha_1 - \alpha_2 - 0}{\sqrt{\frac{2(\sigma^2 + 2\sigma_u^2)}{4\cdot 2}}} = \frac{2(\alpha_1 - \alpha_2)}{\sqrt{\sigma^2 + 2\sigma_u^2}}.
\]
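Combined with the 9 degrees of freedom from part (d), the noncentrality parameter determines the power of the two-sided t test. The sketch below evaluates it for hypothetical parameter values (`alpha_diff`, `sigma_e2`, and `sigma_u2` are assumptions made only for illustration) using the noncentral t distribution in `pt()`.

```r
# Hypothetical parameter values, for illustration only
alpha_diff <- 1.5    # alpha_1 - alpha_2
sigma_e2   <- 1      # sigma^2
sigma_u2   <- 2      # sigma_u^2

# Noncentrality parameter from part (e)
ncp <- 2 * alpha_diff / sqrt(sigma_e2 + 2 * sigma_u2)

# Two-sided level-0.05 test with 9 degrees of freedom
crit  <- qt(0.975, df = 9)
power <- pt(-crit, df = 9, ncp = ncp) +
  pt(crit, df = 9, ncp = ncp, lower.tail = FALSE)

c(ncp = ncp, power = power)
```

Setting `alpha_diff <- 0` should return a power equal to the significance level, 0.05.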
2. [25pts]
(a) [5pts] The covariance between the heights of two plants (i.e., genotypes $k = 1, 2$) on the same table (i.e., watering level $j$ and greenhouse $i$) is
\begin{align*}
Cov(y_{ij1}, y_{ij2}) &= Cov(\mu + g_i + \omega_j + t_{ij} + \gamma_1 + \phi_{j1} + e_{ij1}, \; \mu + g_i + \omega_j + t_{ij} + \gamma_2 + \phi_{j2} + e_{ij2}) \\
&= Cov(g_i + t_{ij} + e_{ij1}, \; g_i + t_{ij} + e_{ij2}) && \text{dropping fixed effects} \\
&= Cov(g_i, g_i) + Cov(t_{ij}, t_{ij}) && \text{since } g_i, t_{ij}, e_{ijk} \text{ are all independent} \\
&= \sigma_g^2 + \sigma_t^2.
\end{align*}
The variance of any single observation is
\begin{align*}
Var(y_{ijk}) &= Var(\mu + g_i + \omega_j + t_{ij} + \gamma_k + \phi_{jk} + e_{ijk}) \\
&= Cov(g_i + t_{ij} + e_{ijk}, \; g_i + t_{ij} + e_{ijk}) && \text{dropping fixed effects} \\
&= Cov(g_i, g_i) + Cov(t_{ij}, t_{ij}) + Cov(e_{ijk}, e_{ijk}) && \text{since } g_i, t_{ij}, e_{ijk} \text{ are all independent} \\
&= \sigma_g^2 + \sigma_t^2 + \sigma_e^2.
\end{align*}
Hence, the correlation is
\[
Corr(y_{ij1}, y_{ij2}) = \frac{Cov(y_{ij1}, y_{ij2})}{\sqrt{Var(y_{ij1})\,Var(y_{ij2})}} = \frac{\sigma_g^2 + \sigma_t^2}{\sigma_g^2 + \sigma_t^2 + \sigma_e^2}.
\]
(b) [5pts] If there are no watering level main effects, the fixed effects will be the same for each watering level $j$ when averaged across the other factors (i.e., averaged over $i$ and $k$). Written in terms of the model parameters, $\mu + \omega_j + \bar{\gamma}_{\cdot} + \bar{\phi}_{j\cdot}$ would be equal for all $j$. This happens if and only if $\omega_j + \bar{\phi}_{j\cdot}$ is equal for all $j$, so the null hypothesis of no watering level main effects is
\[ H_0 : \omega_1 + \bar{\phi}_{1\cdot} = \omega_2 + \bar{\phi}_{2\cdot} = \omega_3 + \bar{\phi}_{3\cdot}. \]
Comments: Note that $H_0 : \omega_1 = \omega_2 = \omega_3$ is not the null hypothesis of no watering level main effects. Even if $\omega_1 = \omega_2 = \omega_3$, there could still be main effects from the interaction terms.
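A small numerical example makes the comment concrete. The parameter values below are hypothetical, chosen only so that the three watering-level parameters are equal while the interaction terms are not.

```r
# Hypothetical fixed-effect values (illustration only)
mu    <- 10
omega <- c(0, 0, 0)                       # omega_1 = omega_2 = omega_3
gamma <- c(0, 2)                          # genotype effects
phi   <- matrix(c(0,  0,
                  0,  4,
                  0, -4), nrow = 3, byrow = TRUE)   # phi[j, k]

# Cell means mu + omega_j + gamma_k + phi_jk
# (rows = watering level j, columns = genotype k)
cell <- mu + outer(omega, gamma, "+") + phi

# Watering-level marginal means: mu + omega_j + mean(gamma) + mean of phi[j, ]
rowMeans(cell)
```

The marginal means differ across watering levels even though the entries of `omega` are identical, because the row averages of `phi` differ.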
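(c) [10pts] Let
• $\beta = (\mu, \omega_1, \omega_2, \omega_3, \gamma_1, \gamma_2, \phi_{11}, \phi_{12}, \phi_{21}, \phi_{22}, \phi_{31}, \phi_{32})'$,
• $X = (1_{24\times 1}, \; 1_{4\times 1} \otimes I_{3\times 3} \otimes 1_{2\times 1}, \; 1_{12\times 1} \otimes I_{2\times 2}, \; 1_{4\times 1} \otimes I_{6\times 6})$,
• $u = (g_1, g_2, g_3, g_4, t_{11}, t_{12}, t_{13}, t_{21}, \ldots, t_{43})'$,
• $Z = (I_{4\times 4} \otimes 1_{6\times 1}, \; I_{12\times 12} \otimes 1_{2\times 1})$.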
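These matrices can be assembled in R to confirm their dimensions and to recover the correlation from part (a) numerically. This is a sketch under assumed variance components (`sigma_g2`, `sigma_t2`, `sigma_e2` are illustrative values only); observations are ordered by greenhouse, then watering level, then genotype.

```r
one <- function(n) matrix(1, n, 1)   # helper: n x 1 column of ones

X <- cbind(one(24),
           kronecker(one(4), kronecker(diag(3), one(2))),   # watering level indicators
           kronecker(one(12), diag(2)),                      # genotype indicators
           kronecker(one(4), diag(6)))                       # WL x GENO indicators
Z <- cbind(kronecker(diag(4), one(6)),                       # greenhouse indicators
           kronecker(diag(12), one(2)))                      # table indicators

dim(X)        # 24 x 12
dim(Z)        # 24 x 16
qr(X)$rank    # X is not of full column rank: one dimension per WL x GENO cell mean

# Hypothetical variance components, for illustration only
sigma_g2 <- 3
sigma_t2 <- 2
sigma_e2 <- 1

G <- diag(c(rep(sigma_g2, 4), rep(sigma_t2, 12)))            # Var(u)
V <- Z %*% G %*% t(Z) + sigma_e2 * diag(24)                  # Var(y)

cov2cor(V)[1, 2]   # two plants on the same table
(sigma_g2 + sigma_t2) / (sigma_g2 + sigma_t2 + sigma_e2)
```

Observations 1 and 2 share a greenhouse and a table, so their correlation matches the formula from part (a); observations 1 and 3 share only a greenhouse.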
(d) [5pts] This is a split-plot experiment, where block = GH, whole-plot factor = WL, and split-plot factor = GENO. We can separate the ANOVA table into whole- and split-plot parts, which has the skeleton
\[
\begin{array}{ll}
\text{Source} & \text{DF} \\ \hline
\text{GH} & 3 \\
\text{WL} & 2 \\
\text{WP Error ( = GH:WL)} & 6 \\ \hline
\text{GENO} & 1 \\
\text{WL:GENO} & 2 \\
\text{SP Error ( = GH:GENO + GH:WL:GENO)} & 3 + 6 = 9 \\ \hline
\text{c. total} & (4)(3)(2) - 1 = 23
\end{array}
\]
i. [1pt] The numerator should be based on WL, which is the whole-plot factor. Hence, the denominator should be based on the whole-plot error, GH:WL. Therefore,
\[
F = \frac{SS_{WL}/df_{WL}}{SS_{GH:WL}/df_{GH:WL}} = \frac{321.8/2}{116.4/6} = 8.29.
\]
ii. [2pts] The numerator should be based on GENO, which is the split-plot factor. Hence, the denominator should be based on the split-plot error, GH:GENO + GH:WL:GENO. Therefore,
\[
F = \frac{SS_{GENO}/df_{GENO}}{(SS_{GH:GENO} + SS_{GH:WL:GENO})/(df_{GH:GENO} + df_{GH:WL:GENO})} = \frac{2.5/1}{(11.7 + 14.5)/(3 + 6)} = 0.859.
\]
iii. [2pts] The numerator should be based on WL:GENO, which falls under the split-plot part of the ANOVA table. Hence, the denominator should be based on the split-plot error, GH:GENO + GH:WL:GENO. Therefore,
\[
F = \frac{SS_{WL:GENO}/df_{WL:GENO}}{(SS_{GH:GENO} + SS_{GH:WL:GENO})/(df_{GH:GENO} + df_{GH:WL:GENO})} = \frac{75.1/2}{(11.7 + 14.5)/(3 + 6)} = 12.90.
\]
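The three F statistics can be reproduced, and paired with p-values, from the sums of squares and degrees of freedom quoted in parts i through iii. The sketch below uses only those quoted numbers; the variable names are illustrative.

```r
# Sums of squares and degrees of freedom quoted in parts i-iii
ss <- c(WL = 321.8, GH.WL = 116.4, GENO = 2.5, WL.GENO = 75.1,
        GH.GENO = 11.7, GH.WL.GENO = 14.5)
df <- c(WL = 2, GH.WL = 6, GENO = 1, WL.GENO = 2,
        GH.GENO = 3, GH.WL.GENO = 6)

# Whole-plot error: GH:WL
ms_wp <- ss[["GH.WL"]] / df[["GH.WL"]]

# Split-plot error: GH:GENO pooled with GH:WL:GENO
df_sp <- df[["GH.GENO"]] + df[["GH.WL.GENO"]]
ms_sp <- (ss[["GH.GENO"]] + ss[["GH.WL.GENO"]]) / df_sp

F_wl   <- (ss[["WL"]] / df[["WL"]]) / ms_wp
F_geno <- (ss[["GENO"]] / df[["GENO"]]) / ms_sp
F_int  <- (ss[["WL.GENO"]] / df[["WL.GENO"]]) / ms_sp

# Upper-tail p-values from the corresponding F distributions
p_wl   <- pf(F_wl,   df[["WL"]],      df[["GH.WL"]], lower.tail = FALSE)
p_geno <- pf(F_geno, df[["GENO"]],    df_sp,         lower.tail = FALSE)
p_int  <- pf(F_int,  df[["WL.GENO"]], df_sp,         lower.tail = FALSE)

round(c(WL = F_wl, GENO = F_geno, WL.GENO = F_int), 3)
c(WL = p_wl, GENO = p_geno, WL.GENO = p_int)
```

Note that the whole-plot test uses 2 and 6 degrees of freedom, while both split-plot tests use 9 denominator degrees of freedom.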
3. [35pts]
(a) [5pts] The true mean responses and corresponding levels for genotype and fertilizer are
shown below:
> block=factor(rep(1:4,each=12))
> geno=factor(rep(rep(1:3,each=4),4))
> x=rep(seq(0,150,by=50),12)
> fert=factor(x)
> X=model.matrix(~geno+x+I(x^2)+geno:x)
> beta=c(125,15,-10,.4,-0.0015,0,.2)
> d <- data.frame(fert = x, geno, mean = X %*% beta)
> mu <- xtabs(mean ~ geno + fert, data = unique(d))
> mu
    fert
geno      0     50    100    150
   1 125.00 141.25 150.00 151.25
   2 140.00 156.25 165.00 166.25
   3 115.00 141.25 160.00 171.25
(b) [5pts] No, the null hypothesis of no genotype main effects is not true since $\bar{\mu}_{i\cdot}$ is not the same for all $i$:
> rowMeans(mu)
      1       2       3
141.875 156.875 146.875
(c) [5pts] No, the null hypothesis of no fertilizer main effects is not true since $\bar{\mu}_{\cdot j}$ is not the same for all $j$:
> colMeans(mu)
       0       50      100      150
126.6667 146.2500 158.3333 162.9167
(d) [5pts] No, the null hypothesis of no genotype × fertilizer interactions is not true, since
\[ (\mu_{11} - \mu_{13}) - (\mu_{31} - \mu_{33}) \neq 0 \quad \text{and} \quad (\mu_{22} - \mu_{23}) - (\mu_{32} - \mu_{33}) \neq 0. \]
> mu[1,1] - mu[1,3] - mu[3,1] + mu[3,3]
[1] 20
> mu[2,2] - mu[2,3] - mu[3,2] + mu[3,3]
[1] 10
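Rather than checking individual contrasts, one can also compute every interaction effect $\mu_{ij} - \bar{\mu}_{i\cdot} - \bar{\mu}_{\cdot j} + \bar{\mu}_{\cdot\cdot}$ at once; there is no interaction if and only if all of them are zero. The sketch below re-enters the table of true means from part (a) so that it runs on its own; the name `inter` is illustrative.

```r
# True cell means from part (a): rows = genotype, columns = fertilizer level
mu <- matrix(c(125, 141.25, 150, 151.25,
               140, 156.25, 165, 166.25,
               115, 141.25, 160, 171.25),
             nrow = 3, byrow = TRUE,
             dimnames = list(geno = 1:3, fert = c(0, 50, 100, 150)))

# Interaction effects: mu_ij - mubar_i. - mubar_.j + mubar_..
inter <- mu - outer(rowMeans(mu), colMeans(mu), "+") + mean(mu)
round(inter, 4)

# TRUE means at least one interaction effect is nonzero
any(abs(inter) > 1e-8)
```

Any interaction contrast of the cell means equals the same contrast applied to `inter`, so the two contrasts computed above can be recovered from this matrix as well.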
(e) [5pts]
Genotype 1: $f(x) = 125 + 0.4x - 0.0015x^2$
Genotype 2: $f(x) = 125 + 15 + 0.4x - 0.0015x^2 + 0x = 140 + 0.4x - 0.0015x^2$
Genotype 3: $f(x) = 125 - 10 + 0.4x - 0.0015x^2 + 0.2x = 115 + 0.6x - 0.0015x^2$
The plot below was produced by the R code that follows:
[Figure: True Mean Response (vertical axis, 100 to 180) versus Fertilizer Level (horizontal axis, 0 to 150), with one curve each for Genotype 1, Genotype 2, and Genotype 3.]
g1 <- function(x) 125 + 0.4*x - 0.0015*x^2
g2 <- function(x) 140 + 0.4*x - 0.0015*x^2
g3 <- function(x) 115 + 0.6*x - 0.0015*x^2
curve(g1(x), xlim = c(-10, 160), ylim = c(100, 180), lwd = 2,
      xlab = 'Fertilizer Amount', ylab = 'True Mean Response')
curve(g2(x), add = TRUE, col = 'blue', lwd = 2)
curve(g3(x), add = TRUE, col = 'orange', lwd = 2)
legend(100, 120, c('Genotype 1', 'Genotype 2', 'Genotype 3'),
       col = c('black', 'blue', 'orange'), lwd = rep(2,3))
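(f) [5pts] By slide 41 of set 15, an approximate 95% confidence interval for $\mu_{11} - \mu_{21}$ is
\[ \bar{y}_{11\cdot} - \bar{y}_{21\cdot} \pm t_{d,0.975} \sqrt{\widehat{Var}(\bar{y}_{11\cdot} - \bar{y}_{21\cdot})}, \]
where $t_{d,0.975}$ denotes the 0.975 quantile of a t distribution with $d$ degrees of freedom computed by Cochran-Satterthwaite and
\[ \widehat{Var}(\bar{y}_{11\cdot} - \bar{y}_{21\cdot}) = \frac{2}{4\cdot 4} MS_{Blk\times Geno} + \frac{2(4-1)}{4\cdot 4} MS_{Error} = \frac{1}{8} MS_{Blk\times Geno} + \frac{3}{8} MS_{Error}. \]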
Using the R code below,
\[ \bar{y}_{11\cdot} - \bar{y}_{21\cdot} = -22.5, \qquad \widehat{Var}(\bar{y}_{11\cdot} - \bar{y}_{21\cdot}) = 53.50, \qquad d = 11.15, \]
and an approximate 95% confidence interval for $\mu_{11} - \mu_{21}$ is
\[ (-38.57, -6.43). \]
This agrees with the interval computed by SAS on page 8 of slide set 17 (titled 'geno 1 - geno 2 with no fertilizer').
> Z1 <- model.matrix(~0+block)
> Z2 <- model.matrix(~0+geno:block)
> Z <- cbind(Z1,Z2)
> set.seed(532)
> u <- c(rnorm(4,0,6),rnorm(12,0,7))
> e <- rnorm(48,0,6)
> y <- round(X%*%beta+Z%*%u+e,1)
> dat <- data.frame(block,geno,fert,y)
> est <- mean(subset(dat, geno == '1' & fert == '0')$y) -
+     mean(subset(dat, geno == '2' & fert == '0')$y)
> est
[1] -22.5
> o <- lm(y~block+geno+block:geno+fert+geno:fert, data = dat)
> MS <- anova(o)$'Mean Sq'
> df <- anova(o)$Df
> # MS[4] is the block:geno mean square; MS[6] is the residual (error) mean square
> var <- MS[4] / 8 + 3 * MS[6] / 8
> var
[1] 53.50212
> d <- var^2 / ( (MS[4]/8)^2/df[4] + (3 * MS[6]/8)^2/df[6] )
> d
[1] 11.15121
> est + c(-1,1) * qt(0.975, d) * sqrt(var)
[1] -38.572543  -6.427457
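The Cochran-Satterthwaite calculation used for `d` can be wrapped in a small reusable function. This is a sketch, not code from the course: the function name `satterthwaite_df` and the mean squares in the usage example are hypothetical and are not the values from the fit above.

```r
# Cochran-Satterthwaite approximate degrees of freedom for the
# linear combination sum(a * ms), where ms[i] has df[i] degrees of freedom
satterthwaite_df <- function(a, ms, df) {
  stopifnot(length(a) == length(ms), length(ms) == length(df))
  sum(a * ms)^2 / sum((a * ms)^2 / df)
}

# Hypothetical mean squares, for illustration only
a  <- c(1 / 8, 3 / 8)    # coefficients of MS_Blk x Geno and MS_Error
ms <- c(200, 40)         # assumed mean squares
df <- c(6, 27)           # their degrees of freedom in this design

d_hat <- satterthwaite_df(a, ms, df)
d_hat
```

With a single mean square the function returns that mean square's own degrees of freedom, and with several it returns a value between the smallest component df and their sum.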
(g) [5pts] The true value is −15, which is contained within the interval computed in part
(f).
> mu[1,1] - mu[2,1]
[1] -15