Stat 401 A – HW 8 answers.

1) Fatty Acids, 1 pt each part
a) 3 fatty acids have unadjusted p-values less than 5%.
b) There are no discoveries using an fdr of 10%.
c) There is evidence of a difference in Stigmasterol concentration between the wild type and knockout mutant (p < 0.05).
d) No evidence of a difference in Stigmasterol at an fdr of 10%.
Note: The exact wording of parts c and d is not relevant. The key is that c) should be answered by evaluating the unadjusted p-value, because Stigmasterol was identified prior to collecting the data as the primary response. d) should be answered by evaluating the fdr-adjusted value, because in the scenario for d, Stigmasterol is just one of 115 fatty acids.
Note from PMD: Notice the tremendous effect of adjusting for multiplicity in this problem, and that is with a relatively small number of compounds under study. In lecture, I said that fdr is not as stringent as something that maintains an experiment-wise error rate, like a Bonferroni correction. The unadjusted p-value for Stigmasterol is 0.0202. Using a Bonferroni correction gives you an adjusted p-value of 2.3 (0.0202 x 115), which is a meaningless probability. The fdr-adjusted p-value is something like 0.6.

2) Lettuce yield (continued)
a) (4 pts, 1 for assumptions, 1 for each evaluation) The assumptions are:
independence: appropriate, because there is one observation (plot yield) per experimental unit
equal variance: suspect, larger spreads for larger N (or larger mean yield), see residual plot
normality: multiple interpretations possible. If suspect, you should explain why (perhaps, one especially large obs in each of the two largest N treatments).
Note: I find it easier to look at a residual vs predicted value plot, which looks like the plot on the left below.
A boxplot of the observations (plot on the right below) requires less work to produce but may be harder to interpret because you’re comparing spreads of groups with different means and don’t see individual observations.
*NOTE: Most students forgot to check for normality and independence and got marked off 2 pts.
[Plots: residuals vs. predicted yield (left); yield by fertilizer (right)]
b) 1 pt. Equal variances is still suspect. This is clearest to see if you look at the residual plot. It is acceptable to look at the boxplot and conclude no problem. Here are the plots:
[Plots: residuals vs. predicted log yield (left); log yield by fertilizer (right)]
c) 1 pt. Yes, approximately equal variances after reciprocal transformation. Here are my plots:
[Plots: residuals vs. predicted reciprocal yield (left); reciprocal yield by fertilizer (right)]
d) 1 pt. No pairs are significantly different after Tukey adjustment for all pairwise comparisons.
e) 1 pt. 0 – 150 (adj p = 0.038) and 0 – 200 (adj p = 0.033) are significantly different when analyzing reciprocal yield with Tukey adjustment.
Notes: The point of this question is to demonstrate the role of equal variances in “after the ANOVA” comparisons. I rarely use the reciprocal transformation because it is very hard to interpret the estimated differences. It is best used only for testing.

3) Music and brain activity (1 pt per part)
a) 1 pt. Yes, a linear regression will be an appropriate summary of the relationship between years of playing and nai. The relationship between the two variables looks reasonably linear.
[Plot: nai vs. years]
b) 1 pt. The estimated intercept is 8.39; the estimated slope is 0.997.
c) 1 pt. Each additional year of playing is associated with 0.997 additional units of the neuronal activity index.
Note: This study is observational.
You can’t make causal conclusions. That means “Each additional year of playing increases the neuronal activity index by 0.997 units” is wrong. “X increases Y” means that X causes the increase in Y.
*NOTE: Almost all students missed this pt because of wording.
d) 1 pt. The estimated mean activity for people who have never played is 8.39.
Note: The quantity you want here is the intercept. You could predict for X=0, but you don’t need to do that.
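
The note on 3d says that predicting at X=0 just returns the intercept. A minimal sketch of that check in Python, using the estimates reported in 3b (the names INTERCEPT, SLOPE and predicted_nai are made up for this illustration and are not from the course software):

```python
# Estimates reported in part 3b of this answer key.
INTERCEPT = 8.39   # estimated mean nai at 0 years of playing
SLOPE = 0.997      # estimated change in mean nai per additional year


def predicted_nai(years):
    """Estimated mean neuronal activity index for a given number of years."""
    return INTERCEPT + SLOPE * years


# Predicting at X = 0 returns the intercept itself.
print(predicted_nai(0))

# The difference between any two consecutive years equals the slope.
print(predicted_nai(6) - predicted_nai(5))
```

The second printed value is the "associated with" quantity from 3c. It describes a difference in estimated means between groups one year apart, not the effect of adding a year of playing.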
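
The Bonferroni arithmetic in the note from PMD on problem 1 can be checked with a short Python sketch. It assumes the fdr adjustment is the Benjamini-Hochberg procedure, which this answer key does not state explicitly. The function names are made up for this illustration, and the four p-values in the second example are hypothetical, not the fatty acid data.

```python
def bonferroni(p, m):
    """Return (raw, capped) Bonferroni-adjusted p-values for one of m tests."""
    raw = p * m
    return raw, min(1.0, raw)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Stigmasterol: unadjusted p = 0.0202, one of 115 fatty acids.
raw, capped = bonferroni(0.0202, 115)
print(round(raw, 3), capped)  # raw product is about 2.323; capped value is 1.0

# Hypothetical p-values, for illustration only (NOT the homework data).
demo = [0.01, 0.04, 0.03, 0.50]
print(benjamini_hochberg(demo))
```

The raw product is the 2.3 quoted in the note. Software usually reports the capped value, 1, instead. The fdr-adjusted value for Stigmasterol cannot be reproduced here because it depends on all 115 p-values.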