Stat 401G Lab 2 Solution Last week's lab looked at the data on nonstructural carbohydrates (NSC) concentrations (mg/g) in sapling stems in moist semi-evergreen and dry deciduous tropical forests in Bolivia. The data are available on the course Web site: http://www.public.iastate.edu/~wrstephe/stat401.html Use JMP to look at the comparison of moist to dry forests. It is not sufficient to simply hand in JMP output. You must answer the questions completely. Turn in the JMP output attached at the end of your written (word processed) solutions. 1. Two-sample problem. a) Summarize the NSC concentrations graphically and numerically for each type of forest. How do dry forests compare to moist forests in terms of NSC concentrations? Distributions Forest=Moist Distributions Forest=Dry 14 12 12 6 0 25 50 75 100 125 8 Count 8 6 4 4 2 2 150 0 25 NSC maximum quartile median quartile minimum Mean Std Dev N 75 100 125 150 NSC Dry Forests 100.0% 75.0% 50.0% 25.0% 0.0% 50 Count 10 10 Moist Forests 92.000 62.000 52.000 42.000 19.000 52.965517 16.242543 29 100.0% 75.0% 50.0% 25.0% 0.0% maximum quartile median quartile minimum Mean Std Dev N 137.00 94.75 69.00 46.00 22.00 71.375 31.235817 40 1 Dry forests tend to have lower NSC concentrations than moist forests. This is shown in the lower means (dry: 53.0 mg/g, moist: 71.4 mg/g) and lower medians (dry: 52 mg/g, moist: 69 mg/g). The distribution of dry forest NSC concentrations is fairly symmetric and mounded in the middle while the distribution of moist forest NSC concentrations is skewed toward higher values (skewed right). b) Do the dry forests differ significantly from the moist forests in terms of average NSC concentration? Perform the appropriate test of hypothesis. Use α=0.05. H 0 : dry moist H A : dry moist t 2.898 P value 0.0051 * Because the P-value is so small, we reject the null hypothesis. The difference between the sample mean NSC concentration for the dry forest trees/shrubs and that for the moist forest trees/shrubs is statistically significant. c) Report the value of a 98% confidence interval for the difference between the mean NSC concentration for dry and moist forests. Give an interpretation of this interval use the interval to test to see if the difference in mean NSC concentrations between dry and moist forests is statistically significant. The 98% confidence interval for the difference between the mean NSC concentration for dry and moist forests goes from 3.27 mg/g to 33.55 mg/g. On average, trees/shrubs in moist forests have from 3.27 mg/g to 33.55 mg/g higher concentrations of NSC than those in dry forests. Because the confidence interval contains only positive values, it does not contain zero, this indicates that there is a statistically significant difference between the mean NSC concentrations for the two types of forest. 2 3 .99 2 .95 .90 1 .75 .50 0 Normal Quantile Plot d) Use JMP to create the residuals for the two types of forest, dry and moist. Use Analyze + Distribution to create JMP output that looks at the distribution (histogram, box plot and Normal Quantile Plot) of the resulting 69 residuals. .25 .10 .05 -1 -2 .01 -3 20 10 Count 15 5 -60 -40 -20 0 20 40 60 80 e) Refer to the histogram of residuals and indicate what you see and what this tells you about the condition that the random errors are normally distributed. The histogram of residuals is mounded between –20 and 0. There is a slight skew toward larger values (skewed right). The skew is not as pronounced as what was seen in the one-sample problem (Lab 1). With such a slight skew, the random errors might still be normally distributed. 3 f) Refer to the box plot of residuals. Are there any possible outliers? If so, what are the corresponding NSC concentrations, species and type of forest? What does the box plot tell you about the random errors being identically distributed? There are no apparent outliers displayed in the box plot. This indicates that the condition that random errors should be identically distributed is most likely met. g) Refer to the Normal Quantile Plot of residuals and indicate what you see and what this tells you about the condition that the random errors are normally distributed? Although there is a slight curve, the residuals basically follow the diagonal line representing a normal model. This indicates that random errors could be normally distributed. h) Comment on whether or not the data are consistent with the condition that both populations (dry and moist) have the same standard deviation. This will require further analysis with JMP 80 60 Residual (two-sample) 40 20 0 -20 -40 -60 Dry Moist Forest The plot of residuals for the two types of forest shows that the dry forest residuals are less spread out than those for the moist forest. The values for sample standard deviations support this but the difference is probably not great enough for us to worry. Dry Forest Std Dev = 16.243 Moist Forest Std Dev = 31.236 4 i) Summarize what you have learned about the conditions necessary for statistical analysis of data from your analysis of residuals. The residuals for the two-sample problem are a little bit better behaved than for the one-sample problem (Lab 1). The histogram and normal quantile plot indicate that random errors could be normally distributed. Additionally, because there are no outliers the condition of identically distributed errors is most likely met. Although the sample standard deviations are different (16.2 mg/g for dry forests and 31.2 mg/g for moist forests) this is probably not a big enough difference to make us change our conclusion that moist forests have a population mean NSC different from that of dry forests. 5 2. On a separate piece of paper write a brief summary of what you have learned about NSC concentrations in tree/shrubs in moist and dry tropical forests. Dry forests tend to have lower NSC concentrations than moist forests. This is shown in the lower sample mean for dry forests, 53.0 mg/g, compared to that of moist forests, 71.4 mg/g. There is also more variation in moist forests compared to dry forests. This is supported by the larger interquartile range of 48.75 mg/g for moist forests compared to 20 mg/g for dry forests. Also the sample standard deviation for moist forests is 31.2 mg/g compared to 16.2 mg/g for dry forests. The difference in sample mean NSC concentrations, 18.4 mg/g, is statistically significant (P-value = 0.0051). With 98% confidence the population mean NSC concentration for moist forests is from 3.27 mg/g to 33.55 mg/g higher than the population mean NSC concentration for dry forests. Although the conditions for statistical inference may not be met perfectly, slight right skew in the shape of the distribution of residuals and the difference in sample standard deviations, the statement that the population mean NSC for moist forests is different from that of dry forests is still supported by the data. 6