Fathom Demo: Errors in Hypothesis Testing

A correct decision is made when you favor the null hypothesis and the null hypothesis is true, or when you favor the alternative hypothesis (reject the null) and the alternative hypothesis is true. This is a difficult concept to understand because there are two means to think about. One is the null hypothesis mean and the other is the true mean. If these are the same, then the correct decision is certainly to favor the null hypothesis. If they are not the same, then the question becomes whether to still favor the null hypothesis. If the true mean is too far from the null hypothesis mean, a correct decision would be to favor the alternative hypothesis. Thus when the true mean does not equal the hypothesized mean, an incorrect decision can be made by continuing to favor the null hypothesis.

This demo will simulate 100 samples, all of the same sample size, and plot the sample means. This will allow us to observe the number of correct and incorrect decisions made from the possible samples gathered.

Let's look at the following scenario: Suppose you own the "Sod Store" and sell rolls of turf to the public. The weight of a roll of turf varies normally with μ = 8 lbs and σ = 2 lbs. It is important to maintain the mean weight, because when the rolls of turf become too heavy, the consumer cannot handle them easily. When a truckload arrives, you take a random sample of 4 rolls and measure their weights. You reject the shipment of turf when the sample mean weight is significantly greater than 8 lbs at the 5% level of significance. You have not rejected any shipments lately, but the customers are beginning to complain that their turf is hard to handle. You wonder if you are doing something wrong.

Ho: μ = 8
Ha: μ > 8

Part I. Assume the mean weight of all the rolls of sod in the truck equals 8 lbs. Press the Collect More Measures button in the open collection. Fathom will generate 100 possible sample means, each from a sample of 4 rolls of turf. Notice the shape, center and spread of the sampling distribution.
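For readers without Fathom, the Collect More Measures step in Part I can be approximated with a short Python sketch. This is only an illustration under the scenario's assumptions (normal weights, μ = 8, σ = 2, samples of 4 rolls); the function name simulate_sample_means and the seed value are hypothetical and are not part of the Fathom document.

```python
import random
import statistics


def simulate_sample_means(true_mean=8.0, sigma=2.0, n=4, num_samples=100, seed=None):
    """Return num_samples sample means, each computed from n normal draws."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        # One "truck sample": n roll weights drawn from Normal(true_mean, sigma).
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


# Hypothetical usage: one batch of 100 sample means, as in Part I.
means = simulate_sample_means(seed=1)
print("center of sample means:", round(statistics.fmean(means), 3))
print("spread of sample means:", round(statistics.stdev(means), 3))
```

The center should land near 8 and the spread near σ/√n = 2/√4 = 1, although any single batch of 100 samples will vary.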
Solve the expression below for $\bar{x}$ to find the mean weight of a random sample of size 4 at which we reject the truckload (reject the null hypothesis) at the 0.05 significance level.

$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

$1.645 = \dfrac{\bar{x} - 8}{2 / \sqrt{4}}$

$\bar{x} \approx 9.6449$ (using the unrounded critical value $z = 1.64485$)

Any sample of size 4 that has a mean weight of 9.6449 lbs or more will be rejected.

To create a line at the critical point, select the dot plot. Choose Plot Value from the Graph menu, enter normalQuantile(0.95, 8, 2/√4), then click Apply and OK.

[Figure: dot plot of Measures from sample turf; horizontal axis samplemeanweight, from 4 to 11; plotted value normalQuantile = 9.64485]

To count the number of samples that are greater than 9.6449, generate a new summary table by selecting Summary Table from the Insert menu. Drag "sample mean weight" from the "Measures from sample turf" box to the down arrow in the new summary table. Double click on S1 = mean( ) and type count(>9.6449), click Apply, and then OK.

[Figure: case table of Measures from sample turf listing samplemeanweight values 8.59443, 9.10778, 6.02558, 7.88041 and 9.14423, next to a summary table of samplemeanweight showing a count of 5 for S1 = count( )]

Press Collect More Measures and observe the number of samples that would be rejected. If one of these samples were taken, a wrong decision would be made: the truck would be rejected, but the true mean of the rolls of turf is 8 lbs. These samples represent type I error.

Questions.
A.) If the mean weight of the sod in the shipment was really 8 lbs and you rejected the shipment, have you made an incorrect or correct decision?
B.) If a sample of size 4 had a mean of 9.1 lbs. when the true mean equals 8 lbs., would you have evidence to reject the null hypothesis that μ = 8? Why?
C.) If a sample of size 4 had a mean of 10 lbs. when the true mean equals 8 lbs., would you have evidence to reject the null hypothesis that μ = 8? Why?
D.) On average, what percent of the time did type I error occur from the 100 samples generated?

Part II. Now change μ to 8.5 by moving the slider from 8 to 8.5.
The null hypothesis is still Ho: μ = 8, but the true mean of the rolls of turf in the truck now equals 8.5. To make a correct decision you should reject the null hypothesis. Generate 100 samples of size four from the truck with a mean weight of 8.5 lbs. by clicking Collect More Measures. The samples to the right of the red line reject the null hypothesis, which is the right decision: do not accept the truck. These samples represent the POWER of the hypothesis test, the probability of correctly rejecting a false null hypothesis. From your 100 samples, observe the number of samples that represent the power of the test.

Change μ to 9 lbs., generate 100 more samples, and observe the percent of samples that would make a correct decision to reject the null hypothesis.

Questions.
E.) Did the percent of the samples that would reject the null hypothesis increase or decrease as the true population mean moved farther from the hypothesized mean?
F.) If the true population mean weight of the rolls was in fact 10 lbs. (of course you do not know this) and you sample 4 rolls of turf and reject the shipment, did you make a correct decision?
G.) If the true population mean weight of the rolls, μ, was in fact 10 lbs. rather than the expected 8 lbs., would you reject every shipment?
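As a way to check your observations outside Fathom, the critical value and the rejection rates from Parts I and II can be sketched in Python. This is a minimal sketch, assuming the scenario's values (Ho: μ = 8, σ = 2, n = 4, 5% significance level); the names theoretical_rejection_rate and simulated_rejection_rate and the seed are hypothetical, and NormalDist from Python's standard library stands in for Fathom's normalQuantile.

```python
import random
from statistics import NormalDist, fmean

NULL_MEAN, SIGMA, N, ALPHA = 8.0, 2.0, 4, 0.05
STD_ERROR = SIGMA / N ** 0.5  # sigma / sqrt(n) = 1 lb

# Critical sample mean: the 0.95 quantile of the sampling distribution under Ho.
CRITICAL = NormalDist(NULL_MEAN, STD_ERROR).inv_cdf(1 - ALPHA)


def theoretical_rejection_rate(true_mean):
    """Probability that a sample mean is at or above CRITICAL for a given true mean."""
    return 1 - NormalDist(true_mean, STD_ERROR).cdf(CRITICAL)


def simulated_rejection_rate(true_mean, num_samples=100, seed=None):
    """Fraction of simulated samples of size N whose mean is at or above CRITICAL."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(num_samples):
        sample = [rng.gauss(true_mean, SIGMA) for _ in range(N)]
        if fmean(sample) >= CRITICAL:
            rejected += 1
    return rejected / num_samples


print(f"critical sample mean: {CRITICAL:.4f}")
for mu in (8.0, 8.5, 9.0, 10.0):
    # At mu = 8 the rate is the type I error rate; otherwise it is the power.
    theory = theoretical_rejection_rate(mu)
    observed = simulated_rejection_rate(mu, seed=0)
    print(f"true mean {mu}: theoretical {theory:.3f}, simulated {observed:.2f}")
```

With only 100 samples per run, the simulated rate can differ noticeably from the theoretical one, which is the same run-to-run variation you see each time you press Collect More Measures.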