Statistics 101 - Take Home Lab #11

Due Tuesday, April 27
In Lab #11, you studied the importance of randomization in determining the statistical significance of the difference between mean yields of two varieties of corn. In lab, you were given “THE
TRUTH” and asked to randomly assign two varieties of corn to 36 plots of land and then conduct
a two-sample t-test to determine if the mean yields of the two varieties of corn were equal. This
take-home laboratory assignment extends what you did in lab by using the computer to replicate
many (100 to be exact) trials of this randomized experiment. Using the computer in this manner
allows us to look more closely at hypothesis testing.
To begin, go to the course webpage at
http://www.public.iastate.edu/~wrstephe/stat101L.html
Under Lab Material, right click on the link JMP Script Difference=12. Select Save Link
Target As (or Save Target As if you are using Internet Explorer). Make sure the name of the
file is yieldsmean12.jsl. Save this file to either the computer’s hard drive or a disk.
Start JMP. Open the yieldsmean12.jsl file by clicking on the Open Script button in the JMP
Starter window. Find the location of the file, select the file and then click Open. You should now
have a window open in JMP called yieldsmean12.
This file (yieldsmean12) is a JMP script. The code in the file instructs JMP to do the same
randomized experiment you completed in lab on the yields of corn varieties A and B. For all 36
plots, the difference between yields A and B in this script is 12 bushels, the same as the difference
in “THE TRUTH” you were given in class. To make JMP run the randomized experiment 100
times, select Edit → Run Script. The script will take a few moments to run.
Once the script is finished, you will have a window open with data from the 100 trials of this
experiment. Each row in the data table is one of the 100 trials. The columns in the data table are
• Mean Difference = the difference between the sample means of yields A and B.
• Standard Error = the standard error of the sample mean difference.
• Lower CI = the lower endpoint of a 95% confidence interval for the difference between the
mean yields of varieties A and B.
• Upper CI = the upper endpoint of a 95% confidence interval for the difference between the
mean yields of varieties A and B.
• t value = the two sample t-test statistic for testing whether the mean yields of A and B are
equal.
• d.f. = The degrees of freedom for the two-sample t-test. This value is calculated using the
formula found on page 536 of your textbook.
• p-value = The p-value for testing H0: µA = µB vs. Ha: µA ≠ µB.
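The columns above can be reproduced for a single trial with a short Python sketch. This is not the JMP script itself. It assumes 18 plots per variety, made-up baseline yields (the values from "THE TRUTH" are not reproduced here), and that the textbook's degrees-of-freedom formula is the usual Welch-Satterthwaite approximation. The function names are hypothetical, and the t distribution is integrated numerically so that only the standard library is needed.

```python
import math
import random
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def central_prob(t, df, steps=1000):
    """P(-|t| <= T <= |t|), integrated numerically with Simpson's rule."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return min(1.0, 2 * total * h / 3)


def critical_value(df, level=0.95):
    """Find t* with P(|T| <= t*) = level by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if central_prob(mid, df) < level:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def one_trial(base_yields, difference, rng):
    """Randomly assign varieties to plots, then run a two-sample t-test."""
    plots = list(range(len(base_yields)))
    rng.shuffle(plots)                      # the randomization step
    half = len(plots) // 2                  # assumption: equal split, 18 and 18
    a = [base_yields[i] + difference for i in plots[:half]]  # variety A plots
    b = [base_yields[i] for i in plots[half:]]               # variety B plots
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    mean_diff = statistics.mean(a) - statistics.mean(b)
    se = math.sqrt(va + vb)
    # Welch-Satterthwaite degrees of freedom (assumed to match the textbook).
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    t_value = mean_diff / se
    margin = critical_value(df) * se
    return {
        "Mean Difference": mean_diff,
        "Standard Error": se,
        "Lower CI": mean_diff - margin,
        "Upper CI": mean_diff + margin,
        "t value": t_value,
        "d.f.": df,
        "p-value": 1.0 - central_prob(t_value, df),
    }


rng = random.Random(101)
# Hypothetical baseline yields (bushels) for 36 plots; NOT the lab's data.
base_yields = [rng.gauss(120, 15) for _ in range(36)]
trial = one_trial(base_yields, difference=12, rng=rng)
for name, value in trial.items():
    print(f"{name:>15}: {value:.4f}")
```

Each call to one_trial corresponds to one row of the JMP data table. Calling it 100 times would mimic the full table.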
Use JMP to get histograms of the Mean Difference, t value, and p-value columns (100 values each) from this data table.
For the Mean Difference values, add the Normal Quantile Plot to the output. For the p-values, add
the Stem and Leaf Plot to the output. Use this information to answer the following questions.
1. Describe the shape, center and spread of the Mean Difference histogram.
2. According to “THE TRUTH”, what value should your histogram be centered around? Is the
center of your histogram close to this value?
3. Describe the normal quantile plot for the Mean Difference. Does the distribution of the
difference between the two sample means appear to be normal? Explain your answer.
4. If the null hypothesis of equal means were really true, what value should the t value histogram
be centered around? Do any of the t values come close to this value?
5. What is the maximum p-value of the 100 trials? Of the 100 trials, how many have p-values
of less than 0.05?
In a hypothesis test, if the null hypothesis is really true, we want to reject the null hypothesis a
small percentage of the time. However, if the null hypothesis is false, we want to reject the null
hypothesis a large percentage of the time. In this example, the null hypothesis is false; the true
difference between (mean) yield A and B is 12 bushels. We therefore want the two-sample t-test
to detect this difference and reject the null hypothesis of equal mean yields a large percentage of
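the time. The probability that a test rejects a false null hypothesis is called the power of the test.
The proportion of the 100 trials with p-values less than 0.05 (from problem #5 above) is an
estimate of the power of the test when the difference between yields is 12 bushels. The power of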
this particular test depends on the difference between the two yields. If you change the difference
between yields A and B, the power of the two-sample t-test to detect this difference will also change.
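How the estimated power changes with the difference can be sketched in Python by repeating the randomized experiment for several differences and counting how often the p-value falls below 0.05. This is an illustration only: the baseline yields are made up, the 18/18 split and the Welch-Satterthwaite degrees of freedom are assumptions, and the function names are hypothetical rather than part of the JMP script.

```python
import math
import random
import statistics


def two_sided_p(t, df, steps=400):
    """Two-sided p-value for a t statistic, using Simpson's rule on the t density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    b = abs(t)
    h = b / steps
    total = pdf(0.0) + pdf(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1.0 - 2 * total * h / 3)


def welch_p_value(a, b):
    """p-value of the two-sample t-test with Welch-Satterthwaite d.f."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return two_sided_p(t, df)


def estimate_power(base_yields, difference, trials, rng, alpha=0.05):
    """Fraction of randomized trials in which equal means is rejected."""
    plots = list(range(len(base_yields)))
    half = len(plots) // 2                  # assumption: equal split, 18 and 18
    rejections = 0
    for _ in range(trials):
        rng.shuffle(plots)                  # new random assignment each trial
        a = [base_yields[i] + difference for i in plots[:half]]
        b = [base_yields[i] for i in plots[half:]]
        if welch_p_value(a, b) < alpha:
            rejections += 1
    return rejections / trials


rng = random.Random(101)
# Hypothetical baseline yields (bushels) for 36 plots; NOT the lab's data.
base_yields = [rng.gauss(120, 15) for _ in range(36)]
power_by_difference = {
    d: estimate_power(base_yields, d, trials=100, rng=rng)
    for d in (0, 6, 12, 18, 24)
}
for d, power in power_by_difference.items():
    print(f"difference = {d:2d} bushels: estimated power = {power:.2f}")
```

With a difference of 0 the null hypothesis is true, so the rejection rate estimates the chance of a false rejection rather than power. For larger differences the rejection rate should generally rise.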