Stat 401 A – HW 8 answers.

1) Fatty Acids, 1 pt each part
a) 3 fatty acids have unadjusted p-values less than 5%
b) There are no discoveries using an fdr of 10%
c) There is evidence of a difference in Stigmasterol concentration between the wild type and knockout
mutant (p < 0.05).
d) No evidence of a difference in Stigmasterol at an fdr of 10%.
Note: The exact wording of parts c and d is not relevant. The key is that c) should be answered by
evaluating the unadjusted p-value, because Stigmasterol was identified prior to collecting the data as
the primary response. d) should be answered by evaluating the fdr-adjusted value, because in the
scenario for d, Stigmasterol is just one of 115 fatty acids.
Note from PMD: Notice the tremendous effect of adjusting for multiplicity in this problem, and that is
with a relatively small number of compounds under study. In lecture, I said that fdr is not as stringent as
something that maintains an experiment-wise error rate, like a Bonferroni correction. The unadjusted
p-value for Stigmasterol is 0.0202. Using a Bonferroni correction gives you an adjusted p-value of 2.3,
which is a meaningless probability. The fdr adjusted p-value is something like 0.6.
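The arithmetic in this note can be checked with a short sketch. Python is used only for illustration (the course software is not stated here), the function names are hypothetical, and the four p-values in the last line are made up, not the homework data. The sketch assumes the fdr adjustment is the Benjamini-Hochberg step-up procedure.

```python
def bonferroni_raw(p, m):
    """Uncapped Bonferroni product p * m; this can exceed 1."""
    return p * m


def bonferroni(p, m):
    """Bonferroni-adjusted p-value, capped at 1 as most software reports it."""
    return min(1.0, p * m)


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


print(bonferroni_raw(0.0202, 115))  # the "2.3" quoted in the note
print(bonferroni(0.0202, 115))      # capped version
print(bh_adjust([0.01, 0.04, 0.03, 0.20]))  # hypothetical p-values
```

The Benjamini-Hochberg value for a given compound depends on all 115 p-values, so it cannot be reproduced from the Stigmasterol p-value alone.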
2) Lettuce yield (continued)
a) (4 pts, 1 for assumptions, 1 for each evaluation) The assumptions are:
independence: appropriate, because there is one observation (plot yield) per experimental unit
equal variance: suspect, larger spreads for larger N (or larger mean yield), see residual plot
normality: multiple interpretations possible. If suspect, should explain why (perhaps, one especially
large obs in each of the two largest N treatments).
Note: I find it easier to look at a residual vs predicted value plot, which looks like the plot on the left
below. A boxplot of the observations (plot on the right below) requires less work to produce but may be
harder to interpret because you’re comparing spreads of groups with different means and don’t see
individual observations.
*NOTE: Most students forgot to check for normality and independence and got marked off 2 pts.
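A minimal sketch of the quantities behind a residual vs predicted value plot, assuming a one-way ANOVA in which each observation's predicted value is its treatment mean. The yields below are hypothetical numbers chosen for illustration, not the lettuce data, and the function name is an assumption.

```python
from statistics import mean, stdev

# Hypothetical plot yields keyed by fertilizer level (NOT the homework data).
yields = {
    0: [100, 104, 108, 112],
    100: [115, 125, 135, 145],
    200: [120, 140, 160, 200],
}


def residuals_by_group(groups):
    """Return {level: (predicted value, residuals)} for a one-way ANOVA.

    The predicted value for every observation in a group is the group mean.
    """
    out = {}
    for level, obs in groups.items():
        m = mean(obs)
        out[level] = (m, [y - m for y in obs])
    return out


# Print predicted value and residual spread for each treatment.
for level, (pred, res) in residuals_by_group(yields).items():
    print(level, round(pred, 1), round(stdev(res), 2))
```

If the spread of the residuals grows with the predicted value, as it does in these made-up numbers, the equal variance assumption is suspect.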
[Figure: residuals vs. predicted yield (left); boxplots of yield by fertilizer level (right)]
b) 1 pt. The equal variance assumption is still suspect. This is clearest in the residual plot. It is
acceptable to look at the boxplot and conclude there is no problem. Here are the plots:
[Figure: residuals vs. predicted log yield (left); log yield by fertilizer level (right)]
c) 1 pt. Yes, the variances are approximately equal after the reciprocal transformation. Here are my plots:
[Figure: residuals vs. predicted reciprocal yield (left); reciprocal yield by fertilizer level (right)]
d) 1 pt. No pairs are significantly different after Tukey adjustment for all pairwise comparisons.
e) 1 pt. 0 – 150 (adj p = 0.038) and 0 – 200 (adj p = 0.033) are significantly different when analyzing
reciprocal yield with Tukey adjustment.
Notes: The point of this question is to demonstrate the role of equal variances in “after the ANOVA”
comparisons. I rarely use the reciprocal transformation because the estimated differences are very hard
to interpret. It is best used only for testing.
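One rough numeric companion to the plots in parts a) to c) is the ratio of the largest to the smallest group standard deviation under each candidate transformation. The sketch below is an assumption-laden illustration: the helper name is hypothetical and the data are made up so that spread is proportional to the mean, which means the log scale (not the reciprocal scale, as in the homework data) equalizes the spreads here.

```python
import math
from statistics import stdev

# Hypothetical yields with spread proportional to the mean (NOT the homework data).
groups = {
    "low": [80.0, 100.0, 125.0],
    "high": [160.0, 200.0, 250.0],
}


def sd_ratio(data, transform=lambda y: y):
    """Largest group SD divided by smallest group SD after a transformation."""
    sds = [stdev([transform(y) for y in obs]) for obs in data.values()]
    return max(sds) / min(sds)


print("raw:       ", round(sd_ratio(groups), 3))
print("log:       ", round(sd_ratio(groups, math.log), 3))
print("reciprocal:", round(sd_ratio(groups, lambda y: 1.0 / y), 3))
```

A ratio near 1 suggests similar spreads on that scale. This is only a screening summary and does not replace looking at the residual plot.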
3) Music and brain activity (1 pt per part)
a) 1 pt. Yes, a linear regression will be an appropriate summary of the relationship between years of
playing and nai. The relationship between the two variables looks reasonably linear.
[Figure: nai vs. years of playing]
b) 1 pt. The estimated intercept is 8.39; the estimated slope is 0.997.
c) 1 pt. Each additional year of playing is associated with 0.997 additional units of the neuronal activity
index.
Note: This study is observational. Can’t make causal conclusions. That means “Each additional year of
playing increases the neuronal activity index by 0.997 units” is wrong. “X increases Y” means that X
causes the increase in Y.
*NOTE: Almost all students missed this pt because of wording.
d) 1 pt. The estimated mean activity for people who have never played is 8.38.
Note: The quantity you want here is the intercept. You could predict for X=0, but you don’t need to do
that.
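The point that the intercept is the prediction at X=0 can be seen in a small least-squares sketch. The data are hypothetical (not the study data), and the function names are assumptions made for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope


def predict(intercept, slope, x0):
    """Estimated mean response at x0."""
    return intercept + slope * x0


# Hypothetical years of playing and activity index values (NOT the study data).
years = [0, 5, 10, 15]
nai = [8.0, 14.0, 18.0, 24.0]

b0, b1 = fit_line(years, nai)
print(b0, b1)
print(predict(b0, b1, 0))  # identical to the intercept
```

The slope is read as the difference in estimated mean response associated with a one-unit difference in X. With observational data it does not describe a causal effect.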