Fathom Demo: Errors in Hypothesis Testing
A correct decision is made when you favor the null hypothesis and the null hypothesis is true, or when you favor
the alternative hypothesis (reject the null) and the alternative hypothesis is true. This is a difficult concept to
understand because there are two means to think about: the null hypothesis mean and the true mean. If these
are the same, the correct decision is to favor the null hypothesis. If they are not the same, the question
becomes whether the sample still leads you to accept the null hypothesis mean. If the true mean is far enough
from the null hypothesis mean, the correct decision is to favor the alternative hypothesis. Because the decision
is based on a sample, an incorrect decision can be made in either case.
This demo will simulate 100 samples, all of the same size, and plot the sample means. This will allow
us to observe the number of correct and incorrect decisions made from the possible samples gathered.
Let’s look at the following scenario:
Suppose you own the “Sod Store” and sell rolls of turf to the public. The weight of a roll of turf
varies normally with mean μ = 8 lbs and standard deviation σ = 2 lbs. It is important to maintain the mean
weight, because when the rolls of turf become too heavy, the consumer cannot handle them easily. When the
truckload arrives, you take a random sample of 4 rolls and measure their weights. You reject the shipment of
turf when the sample mean weight is significantly greater than 8 lbs at the 5% level of significance. You have
not rejected any shipments lately, but the customers are beginning to complain that their turf is hard to
handle. You wonder if you are doing something wrong.
Ho: μ = 8
Ha: μ > 8
Part I.
Assume the mean weight of all the rolls of sod in the truck equals 8 lbs.
Press the Collect More Measures button in the open collection.
Fathom will generate 100 possible sample means, each from a sample of 4
rolls of turf. Notice the shape, center and spread of the sampling distribution.
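For readers without Fathom, the step above can be approximated with a short Python sketch. It is only an illustration under the scenario's stated assumptions (normal weights, μ = 8, σ = 2, samples of size 4); the function name simulate_sample_means and the seed are hypothetical choices, not part of the Fathom demo.

```python
import random
import statistics

# Scenario values from the worksheet: mean 8 lbs, standard deviation 2 lbs,
# samples of 4 rolls, 100 samples per "Collect More Measures" press.
MU, SIGMA, N, NUM_SAMPLES = 8.0, 2.0, 4, 100


def simulate_sample_means(mu, sigma, n, num_samples, seed=None):
    """Return num_samples sample means, each from n normally distributed weights."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]


sample_means = simulate_sample_means(MU, SIGMA, N, NUM_SAMPLES, seed=1)

# Center and spread of the simulated sampling distribution.
print("center of sample means:", round(statistics.fmean(sample_means), 3))
print("spread of sample means:", round(statistics.stdev(sample_means), 3))
# Theoretical spread of the sample mean: sigma / sqrt(n).
print("theoretical spread:", SIGMA / N ** 0.5)
```

The simulated center should land near 8 and the spread near σ/√n = 1, although any single run of 100 samples will vary.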
Solve the expression below for x̄ to find the mean
weight of a random sample of size 4 at which we reject
the truckload (reject the null hypothesis) at the 0.05
significance level. Any sample of size 4 that has a mean
weight of 9.6449 lbs or more will be rejected.
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
$$1.645 = \frac{\bar{x} - 8}{2 / \sqrt{4}}$$
$$\bar{x} = 9.6449$$
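The critical value can also be checked numerically. The sketch below uses Python's standard-library statistics.NormalDist and assumes the same values as the worksheet (μ = 8, σ = 2, n = 4, α = 0.05); the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 8.0, 2.0, 4, 0.05

# Standard deviation of the sample mean: sigma / sqrt(n).
standard_error = sigma / sqrt(n)

# z value that leaves 5% in the upper tail of the standard normal curve.
z_critical = NormalDist().inv_cdf(1 - alpha)

# Solve z = (x_bar - mu0) / standard_error for x_bar.
x_critical = mu0 + z_critical * standard_error

# Same cutoff taken directly from the sampling distribution of the mean.
x_critical_direct = NormalDist(mu0, standard_error).inv_cdf(1 - alpha)

print(round(z_critical, 3))   # 1.645
print(round(x_critical, 4))   # 9.6449
```

Both routes give the same cutoff, which matches the 9.6449 lbs used in the rest of the demo.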
To create a line at the critical point, select the dot plot, choose Plot Value from the Graph menu, enter
normalQuantile(0.95, 8, 2/√4), and click Apply, then OK.
[Figure: Dot Plot of “Measures from sample turf”; horizontal axis samplemeanweight, from 4 to 11; plotted value normalQuantile = 9.64485]
To count the number of samples with a mean greater than 9.6449, generate a new summary table by selecting
Summary Table from the Insert menu. Drag “samplemeanweight” from the “Measures from sample
turf” box to the down arrow in the new summary table. Double-click on S1 = mean( ) and type
count(>9.6449), click Apply, and then OK.
[Figure: case table for “Measures from sample turf” showing the first five values of samplemeanweight (8.59443, 9.10778, 6.02558, 7.88041, 9.14423), and a Summary Table for samplemeanweight with S1 = count]
Press Collect More Measures and observe the number of samples that would be rejected. If one of these samples
were taken, a wrong decision would be made: the truck would be rejected even though the true mean of the rolls
of turf is 8 lbs. These samples represent Type I errors.
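The counting step above can be imitated outside Fathom. The following Python sketch is only an approximation of the demo, assuming the worksheet's values (μ = 8, σ = 2, n = 4, 5% level); count_rejections, the seed, and the choice of 200 repeated batches are hypothetical.

```python
import random
import statistics
from statistics import NormalDist

MU0, SIGMA, N = 8.0, 2.0, 4

# Cutoff for the sample mean at the 5% significance level (about 9.6449).
CRITICAL = NormalDist(MU0, SIGMA / N ** 0.5).inv_cdf(0.95)


def count_rejections(true_mu, num_samples=100, rng=random):
    """Count how many of num_samples sample means exceed the cutoff."""
    count = 0
    for _ in range(num_samples):
        sample_mean = statistics.fmean(rng.gauss(true_mu, SIGMA) for _ in range(N))
        if sample_mean > CRITICAL:
            count += 1
    return count


rng = random.Random(2024)

# Each batch imitates one press of Collect More Measures with the true mean at 8,
# so every rejection counted here is a Type I error.
batches = [count_rejections(MU0, rng=rng) for _ in range(200)]
type_one_rate = statistics.fmean(batches) / 100

print("Type I errors in the first batch of 100:", batches[0])
print("average Type I error rate over all batches:", round(type_one_rate, 4))
```

Any single batch of 100 can differ noticeably from the next, which is why the sketch averages over many batches before comparing the rate with the 0.05 significance level.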
Questions.
A.) If the mean weight of the sod in the shipment was really 8 lbs and you rejected the shipment, have you
made an incorrect or correct decision?
B.) If a sample of size 4 had a mean of 9.1 lbs. when the true mean equals 8 lbs., would you have
evidence to reject the null hypothesis that μ = 8? Why?
C.) If a sample of size 4 had a mean of 10 lbs. when the true mean equals 8 lbs., would you have
evidence to reject the null hypothesis that μ = 8? Why?
D.) On average, in what percent of the 100 samples generated did a Type I error occur?
Part II.
Now change μ to 8.5 by moving the slider from 8 to 8.5. The null hypothesis is still Ho: μ = 8, but the true
mean of the rolls of turf in the truck now equals 8.5. To make a correct decision you should reject the null
hypothesis. Generate 100 samples of size 4 from the truck with a mean weight of 8.5 lbs. by
clicking Collect More Measures. The samples to the right of the red line reject the null hypothesis, which is the
right decision: do not accept the truck. These samples represent the POWER of the hypothesis test.
From your 100 samples, observe the number of samples that represent the power of the test.
Change μ to 9 lbs., generate 100 more samples, and observe the percent of samples that would make the
correct decision to reject the null hypothesis.
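The simulated percentages can be compared with the theoretical power of the test. The sketch below is one way to compute it in Python, assuming the worksheet's values (Ho: μ = 8, σ = 2, n = 4, 5% level); the function name power is a hypothetical helper, not a Fathom function.

```python
from math import sqrt
from statistics import NormalDist

MU0, SIGMA, N, ALPHA = 8.0, 2.0, 4, 0.05

# Standard deviation of the sample mean and the rejection cutoff under Ho.
SE = SIGMA / sqrt(N)
CRITICAL = NormalDist(MU0, SE).inv_cdf(1 - ALPHA)


def power(true_mu):
    """Probability that a sample mean exceeds the cutoff when the true mean is true_mu."""
    return 1 - NormalDist(true_mu, SE).cdf(CRITICAL)


# Slider settings used in Part I and Part II.
for true_mu in (8.0, 8.5, 9.0):
    print(f"true mean {true_mu}: probability of rejecting Ho = {power(true_mu):.3f}")
```

When the true mean equals 8, the value returned is the Type I error rate rather than power. Other slider settings can be passed to power in the same way and compared with the counts from your 100 samples.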
Questions.
E.) Did the percent of the samples that would reject the null hypothesis increase or decrease as the true
population mean moved farther from the hypothesis mean?
F.) If the true population mean weight of the rolls was in fact 10 lbs. (of course you do not know this)
and you sample 4 rolls of turf and reject the shipment, did you make a correct decision?
G.) If the true population mean weight of the rolls, μ, were in fact 10 lbs. rather than the expected 8
lbs., would you reject every shipment?