252grass2-051 2/28/05

Solution to Graded Assignment 2

1. Which of the following could be null hypotheses? Which could be an alternate hypothesis? Which could be neither? Why? If it is $H_0$, what is $H_1$? If it is $H_1$, what is $H_0$?
(i) $\bar p$ […] 1.2, (ii) $p$ […] $-0.35$, (iii) $p$ […] .25, (iv) $\bar x$ […] 5.023, (v) […] 5.023, (vi) $s$ […] 5.023, (vii) $\sigma$ […] $-5.023$, (viii) […] 5.023, (ix) […] 5.023, (x) $p$ […] 0.37.

Solution: Remember the following:
α) Only numbers like $\mu$, $p$, $\sigma^2$, $\sigma$ and the population median (the population mean, proportion, variance, standard deviation and median) that are parameters of the population can be in a hypothesis; $\bar x$, $\bar p$, $s^2$, $s$ and $x_{.50}$ (the sample mean, proportion, variance, standard deviation and median) are statistics computed from sample data and cannot be in a hypothesis, because a hypothesis is a statement about a population.
β) The null hypothesis must contain an equality.
γ) $p$ must be between zero and one.
δ) A variance or standard deviation cannot be negative.

(i) $\bar p$ […] 1.2 can't be either $H_0$ or $H_1$ since the sample proportion is not a parameter and a proportion can't be above 1.
(ii) $p$ […] $-0.35$ can't be either since a proportion can't be negative.
(iii) $p$ […] .25 could be $H_1$ since it contains a parameter and an inequality. $H_0$ would be $p$ […] .25.
(iv) $\bar x$ […] 5.023 can't be either $H_0$ or $H_1$ since the sample mean is not a parameter.
(v) […] 5.023 could be $H_0$ since it contains a parameter and an equality. $H_1$ would be […] 5.023.
(vi) $s$ […] 5.023 can't be either since the sample standard deviation is not a parameter.
(vii) $\sigma$ […] $-5.023$ can't be $H_0$ or $H_1$ because the population standard deviation can't be negative.
(viii) […] 5.023 can be $H_1$ because it contains a parameter and an inequality. $H_0$ would be […] 5.023.
(ix) […] 5.023 could be $H_0$ since it contains a parameter and an equality. $H_1$ would be […] 5.023.
(x) $p$ […] 0.37 can be $H_1$ because it contains a parameter and an inequality. $H_0$ would be $p$ […] 0.37.
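Rules α) through δ) can be written as a small decision procedure. The following Python sketch is only an illustration: the symbol labels (`mu`, `pbar`, and so on), the function name `classify` and the example statements are hypothetical and are not the assignment's items.

```python
# Hypothetical labels for population parameters and sample statistics.
PARAMETERS = {"mu", "p", "sigma2", "sigma", "median"}
STATISTICS = {"xbar", "pbar", "s2", "s", "sample_median"}

# Relations that contain an equality, and strict inequalities.
EQUALITIES = {"=", "<=", ">="}
INEQUALITIES = {"<", ">", "!="}

# The complementary relation gives the other hypothesis of the pair.
COMPLEMENT = {"=": "!=", "!=": "=", "<=": ">", ">": "<=", ">=": "<", "<": ">="}


def classify(symbol, relation, value):
    """Return 'H0', 'H1' or 'neither' for a statement like 'p > 0.25'."""
    if symbol not in PARAMETERS:
        return "neither"              # rule alpha: statistics are not allowed
    if symbol == "p" and not 0 <= value <= 1:
        return "neither"              # rule gamma: a proportion lies in [0, 1]
    if symbol in {"sigma", "sigma2"} and value < 0:
        return "neither"              # rule delta: no negative spread
    if relation in EQUALITIES:
        return "H0"                   # rule beta: the null contains an equality
    if relation in INEQUALITIES:
        return "H1"
    raise ValueError(f"unknown relation: {relation!r}")


# Illustrative statements only (the relations are made up for the example).
for stmt in [("pbar", ">=", 1.2), ("p", ">", 0.25), ("mu", "=", 5.023),
             ("sigma", "=", -5.023)]:
    print(stmt, "->", classify(*stmt))
```

For a statement classified as $H_1$, `COMPLEMENT[relation]` gives the relation to use in the matching $H_0$, and vice versa.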
Before we start hypothesis testing, here are four important points:
α) You have not done a hypothesis test unless you have stated your hypotheses, run the numbers and stated your conclusion.
β) It is not enough to know what a critical value is. You should be able to explain on what side of the critical value the 'reject' zone lies.
γ) The rule on p-value says that if the p-value is less than the significance level ($\alpha$), reject the null hypothesis; if the p-value is greater than or equal to the significance level, do not reject the null hypothesis.
δ) Make sure that I know what formulas you are using.

2. Seymour Butz has invented a new golf ball. He claims that it will increase your distance off the tee by more than 20 feet. 40 golfers test the ball and the additional distances that the balls go are given on the next page. Use a 10% significance level. Is the ball a success?
a) State your null and alternative hypotheses.
b) Find critical values for the sample mean and test the hypothesis.
c) Find a confidence interval for the population mean and test the hypothesis.
d) Use a test ratio for a test of the mean.
e) Find an approximate p-value for the test ratio using the t table and use the p-value to test the hypothesis.

Solution: We were given the following data:

Data for Problem 2.
Row       x        x²      Row       x        x²
  1     13.37    178.757    21     14.01    196.280
  2     28.72    824.838    22     28.86    832.900
  3     12.60    158.760    23     23.34    544.756
  4     30.77    946.793    24     25.00    625.000
  5     24.00    576.000    25     17.01    289.340
  6     20.67    427.249    26     30.50    930.250
  7     24.12    581.774    27     25.16    633.026
  8     16.95    287.303    28     13.40    179.560
  9     23.73    563.113    29     22.03    485.321
 10     20.92    437.646    30     18.99    360.620
 11     17.51    306.600    31     21.61    466.992
 12     30.44    926.594    32     27.01    729.540
 13     20.15    406.023    33     26.59    707.028
 14     24.60    605.160    34     30.22    913.248
 15     24.11    581.292    35     17.06    291.044
 16     29.91    894.608    36     26.16    684.346
 17     12.27    150.553    37     24.13    582.257
 18     27.75    770.063    38     16.45    270.603
 19     23.25    540.563    39     18.25    333.063
 20     26.40    696.960    40     21.99    483.560
Total (all 40 rows):  Σx = 900.01,  Σx² = 21399.380

Given the nature of the data, it seems that we need to test the mean, so we look at our formula table under means.

Interval for: Mean ($\sigma$ known)
  Confidence interval: $\mu = \bar x \pm z_{\alpha/2}\,\sigma_{\bar x}$
  Hypotheses: $H_0: \mu = \mu_0$, $H_1: \mu \ne \mu_0$
  Test ratio: $z = \dfrac{\bar x - \mu_0}{\sigma_{\bar x}}$
  Critical value: $\bar x_{cv} = \mu_0 \pm z_{\alpha/2}\,\sigma_{\bar x}$

Interval for: Mean ($\sigma$ unknown)
  Confidence interval: $\mu = \bar x \pm t_{\alpha/2}\,s_{\bar x}$, $DF = n - 1$
  Hypotheses: $H_0: \mu = \mu_0$, $H_1: \mu \ne \mu_0$
  Test ratio: $t = \dfrac{\bar x - \mu_0}{s_{\bar x}}$
  Critical value: $\bar x_{cv} = \mu_0 \pm t_{\alpha/2}\,s_{\bar x}$

a) State your null and alternative hypotheses:
What the problem says: 'Seymour Butz has invented a new golf ball. He claims that it will increase your distance off the tee by more than 20 feet. 40 golfers test the ball and the additional distances that the balls go are given on the next page. Use a 10% significance level. Is the ball a success?' This definitely implies that "$\mu > 20$" goes somewhere, but it does not contain an equality, so it must be an alternate hypothesis. The other clue is that the alternate hypothesis is often an 'action' hypothesis, while the null hypothesis generally says 'keep things as they are.' So the null hypothesis is $H_0: \mu \le 20$ and the alternate hypothesis is $H_1: \mu > 20$.

What else the problem says: A sample of $n = 40$ golfers is taken. $\alpha = .10$. The following information was computed for you: $\sum x = 900.01$ and $\sum x^2 = 21399.380$. So $\mu_0 = 20.00$, $n = 40$ and $\alpha = .10$.
This implies that $\bar x = \dfrac{\sum x}{n} = \dfrac{900.01}{40} = 22.50025$, $s^2 = \dfrac{\sum x^2 - n\bar x^2}{n-1} = \dfrac{21399.380 - 40(22.50025)^2}{39} = \dfrac{1148.929998}{39} = 29.45974353$, $s = \sqrt{29.45974353} = 5.42768$ and $s_{\bar x} = \sqrt{\dfrac{s^2}{n}} = \sqrt{\dfrac{5.42768^2}{40}} = \sqrt{0.73649} = 0.8582$. (It makes no sense to put 22.50 in the hypotheses when you know $\bar x = 22.50$ is true.)

b) Find critical values for the sample mean and test the hypothesis:
Degrees of freedom are $n - 1 = 40 - 1 = 39$. We must use $t$ because we do not know $\sigma$. Because the alternate hypothesis is $\mu > 20$, we need a critical value for $\bar x$ above 20. Since this means a one-sided test, we use $t^{(n-1)}_{\alpha} = t^{(39)}_{.10} = 1.304$. The two-sided formula, $\bar x_{cv} = \mu_0 \pm t^{(n-1)}_{\alpha/2}\,s_{\bar x}$, becomes $\bar x_{cv} = \mu_0 + t^{(n-1)}_{\alpha}\,s_{\bar x} = 20.000 + 1.304(0.8582) = 20.000 + 1.119 = 21.119$. Make a diagram showing an almost Normal curve with a mean at 20 and a shaded 'reject' zone above 21.119. Since $\bar x = 22.50025$ is above 21.119, we reject $H_0$.

c) Find a confidence interval for the population mean and test the hypothesis:
Because the alternate hypothesis is $\mu > 20$ we need a '$\ge$' confidence interval. $\mu = \bar x \pm t_{\alpha/2}\,s_{\bar x}$ becomes $\mu \ge \bar x - t_{\alpha}\,s_{\bar x} = 22.50025 - 1.304(0.8582)$ or $\mu \ge 21.381$. Make a diagram showing an almost Normal curve with a mean at $\bar x = 22.50025$ and the confidence interval above 21.381 shaded. Since $\mu_0 = 20$ is below 21.381 and thus not in the confidence interval, we reject $H_0$.

d) Use a test ratio for a test of the mean:
The test ratio is $t_{calc} = \dfrac{\bar x - \mu_0}{s_{\bar x}} = \dfrac{22.50025 - 20.000}{0.8582} = 2.913$. Since this is a one-sided, right-tail test, pick $t^{(n-1)}_{\alpha} = t^{(39)}_{.10} = 1.304$ from the t table. The 'reject' zone is the area above 1.304. Since 2.913 is in the 'reject' zone, reject the null hypothesis.

e) Find an approximate p-value for the test ratio and use the p-value to test the hypothesis:
On the 39 df line of the t-table, note that 2.913 is between 2.708 and 3.313. Because 2.708 is in the .005 column and 3.313 is in the .001 column, the table is telling us that $P(t \ge 2.708) = .005$ and $P(t \ge 3.313) = .001$, or $t_{.005} < t_{calc} < t_{.001}$, so $.001 < P(t \ge 2.913) < .005$, which means $.001 < \text{p-value} < .005$.
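The arithmetic in parts b) through d) can be cross-checked with a few lines of Python. This is a minimal sketch that starts from the quoted totals ($\sum x = 900.01$, $\sum x^2 = 21399.380$) and takes the table value $t^{(39)}_{.10} = 1.304$ as given rather than computing it; the variable names are illustrative only.

```python
import math

# Totals and settings quoted in the solution to Problem 2.
n = 40
sum_x = 900.01
sum_x2 = 21399.380
mu0 = 20.0
t_crit = 1.304  # one-sided t, alpha = .10, 39 df (taken from the t table)

# Sample statistics via the computational formula for the variance.
x_bar = sum_x / n
s2 = (sum_x2 - n * x_bar ** 2) / (n - 1)
s = math.sqrt(s2)
se = math.sqrt(s2 / n)  # standard error of the mean

# b) critical value above mu0 (right-tail test).
x_cv = mu0 + t_crit * se
# c) one-sided lower confidence bound for mu.
ci_lower = x_bar - t_crit * se
# d) test ratio.
t_calc = (x_bar - mu0) / se

reject_by_cv = x_bar > x_cv
reject_by_ci = mu0 < ci_lower
reject_by_ratio = t_calc > t_crit

print(f"x_bar = {x_bar:.5f}, s^2 = {s2:.5f}, s = {s:.5f}, se = {se:.4f}")
print(f"critical value = {x_cv:.3f}, CI lower bound = {ci_lower:.3f}, t = {t_calc:.3f}")
print("reject H0:", reject_by_cv, reject_by_ci, reject_by_ratio)
```

All three methods use the same $t$ value and the same standard error, which is why they must agree on the decision.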
Since these p-values are below the 10% significance level, reject $H_0$. Since we rejected the null hypothesis, the ball is a success.

Note: Many people incorrectly said that the null hypothesis was $H_0: \mu \ge 20$, so that the alternative was $H_1: \mu < 20$. If this had been correct, the critical value would be below 20. Nevertheless, some concluded that they should reject their null hypothesis. I don't think that you need to know any statistics at all to see that $\bar x = 22.50$ cannot possibly be evidence against $H_0: \mu \ge 20$.

3. Assume, in problem 2, that the population standard deviation was 10 ft. State the test ratio and find its p-value. Using this p-value, would we reject the null hypothesis a) if the significance level is 1%, b) if the significance level is 5% and c) if the significance level is 10%?

Solution: Note!!! The only thing that has changed from problem 2 is that the sample standard deviation has been replaced by the population standard deviation of 10. The sample mean has not changed, and neither have the hypotheses. The only reason for this problem is that some find e) above difficult, and this is easier since you can use the Normal table. We still have the null hypothesis $H_0: \mu \le 20$ and the alternate hypothesis $H_1: \mu > 20$.

What else the problem says: Assume, in problem 2, that $\sigma = 10$ was a population standard deviation. Everything else is unchanged, except that there is no significance level at first. So $\mu_0 = 20.000$, $n = 40$ and $\bar x = 22.50025$. But $\sigma = 10$ and $\sigma_{\bar x} = \sqrt{\dfrac{\sigma^2}{n}} = \sqrt{\dfrac{10^2}{40}} = \sqrt{2.500} = 1.5811$. We still have a right-tailed test. The table says that the test ratio is $z = \dfrac{\bar x - \mu_0}{\sigma_{\bar x}} = \dfrac{22.50025 - 20.000}{1.5811} = 1.5813$. So if we use the Normal table, we find $\text{p-value} = P(\bar x \ge 22.50025) = P(z \ge 1.58) = .5 - .4429 = .0571$. To do the last part of this problem, make a diagram of the Normal distribution with a mean of zero and shade the area above 1.58.

a) If $\alpha = .01$, since the p-value of .0571 is above the significance level, we do not reject $H_0$.
b) If $\alpha = .05$, since the p-value of .0571 is above the significance level, we do not reject $H_0$.
c) If $\alpha = .10$, since the p-value of .0571 is below the significance level, we reject $H_0$.
You did not answer this question if you did the problem 3 different ways and never found the p-value!

4. Seymour is elected the mayor of the Borough of Pineapplnham and boasts to a reporter that at least 35% of the town has internet access. The reporter takes a fast survey and finds that 13 out of a sample of 53 have internet access.
a) State your null and alternative hypotheses.
b) Find a test ratio for a test of the proportion.
c) Find a p-value for the test ratio and use the p-value to test the hypothesis at the 5% significance level.
d) Restate the problem as a two sided hypothesis, find a p-value for the null hypothesis and use it to test your hypothesis at a 5% significance level.

Solution: Since 35% is a proportion, we look up the test for a proportion in the formula table.

Interval for: Proportion
  Confidence interval: $p = \bar p \pm z_{\alpha/2}\,s_{\bar p}$, where $s_{\bar p} = \sqrt{\dfrac{\bar p\,\bar q}{n}}$ and $\bar q = 1 - \bar p$
  Hypotheses: $H_0: p = p_0$, $H_1: p \ne p_0$
  Test ratio: $z = \dfrac{\bar p - p_0}{\sigma_{\bar p}}$
  Critical value: $\bar p_{cv} = p_0 \pm z_{\alpha/2}\,\sigma_{\bar p}$, where $\sigma_{\bar p} = \sqrt{\dfrac{p_0 q_0}{n}}$ and $q_0 = 1 - p_0$

a) What the problem says: 'Seymour … boasts to a reporter that at least 35% of the town has internet access.' 35% is a proportion. There is nothing here about the mean. The problem implies that the reporter is skeptical. Since 'at least 35%' is $p \ge .35$, there is an equality in this statement, so it must be a null hypothesis. So the null hypothesis is $H_0: p \ge .35$ and the alternate hypothesis is $H_1: p < .35$.

What else the problem says: A sample of 53 households found that 13 had internet access. So $p_0 = .35$, $n = 53$ and $x = 13$. This implies that $\bar p = \dfrac{x}{n} = \dfrac{13}{53} = .2453$ and $q_0 = 1 - p_0 = 1 - .35 = .65$.

b) Find a test ratio for a test of the proportion:
The table says $z = \dfrac{\bar p - p_0}{\sigma_{\bar p}}$ and $\sigma_{\bar p} = \sqrt{\dfrac{p_0 q_0}{n}} = \sqrt{\dfrac{.35(.65)}{53}} = \sqrt{0.0042925} = 0.06552$. This means that $z = \dfrac{.2453 - .35}{.06552} = -1.60$. (It makes no sense to use the test ratio $z = \dfrac{\bar p - p_0}{\sigma_{\bar p}}$ if you have already said the null hypothesis is about the mean.)

c) Find a p-value for the test ratio and use the p-value to test the hypothesis at the 5% significance level:
Note that the alternate hypothesis says $H_1: p < .35$. We are worried about the proportion being too low. We use the Normal table to find that the $\text{p-value} = P(\bar p \le .2453) = P(z \le -1.60) = .5 - .4452 = .0548$. So, if $\alpha = .05$, since the p-value is above the significance level, we do not reject $H_0$ and cannot call Seymour a liar.

d) Restate the problem as a two sided hypothesis, find a p-value for the null hypothesis and use it to test your hypothesis at a 5% significance level:
The hypotheses are $H_0: p = .35$ and $H_1: p \ne .35$. Since our value of $z$ remains the same, $\text{p-value} = 2P(\bar p \le .2453) = 2P(z \le -1.60) = 2(.5 - .4452) = 2(.0548) = .1096$. So, if $\alpha = .05$, since the p-value is above the significance level, we do not reject $H_0$.

5. (Extra credit)
a) Finish 4 by testing the original hypothesis using a critical value for the proportion and an appropriate confidence interval.
b) Were we reasonable assuming that $\sigma = 10$ in question 3? Use the sample standard deviation that you found in question 3 to test this using a test ratio and a confidence interval.

Solution: Note: All of you should be able to do this problem using a critical value for $\bar p$ or a confidence interval.

a) Critical Value: $\sigma_{\bar p} = \sqrt{\dfrac{p_0 q_0}{n}} = \sqrt{\dfrac{.35(.65)}{53}} = \sqrt{0.0042925} = 0.06552$. The table says $\bar p_{cv} = p_0 \pm z_{\alpha/2}\,\sigma_{\bar p}$, but this is a one-sided test and we want one critical value below $p_0 = .35$, so we use $\bar p_{cv} = p_0 - z_{\alpha}\,\sigma_{\bar p} = .35 - 1.645(.06552) = .2422$. Make a diagram showing a Normal curve centered at $p_0 = .35$ and a shaded 'reject' zone below .2422. Since $\bar p = .2453$ is not in the 'reject' zone, we cannot reject $H_0$.

Confidence Interval: The table says $p = \bar p \pm z_{\alpha/2}\,s_{\bar p}$, where, because $n = 53$, $\bar q = 1 - \bar p = 1 - .2453 = .7547$ and $s_{\bar p} = \sqrt{\dfrac{\bar p\,\bar q}{n}} = \sqrt{\dfrac{.2453(.7547)}{53}} = \sqrt{.00349} \approx .059$. Since $H_1: p < .35$ implies a 1-sided test, we want a 1-sided confidence interval in the same direction as the alternate hypothesis. This would be $p \le \bar p + z_{\alpha}\,s_{\bar p} = .2453 + 1.645(.059) = .3424$.
Make a diagram showing a Normal curve centered at $\bar p = .2453$ with the confidence interval below .3424 shaded. Since $p_0 = .35$ is above .3424 and thus not in the confidence interval, we reject $H_0$. This is another one of those rare cases where the confidence interval contradicts the test ratio and critical value results.

b) Were we reasonable assuming that $\sigma = 10$ in question 3? Use the sample standard deviation that you found in question 3 to test this using a test ratio and a confidence interval:
Go back to the formula table.

Interval for: Variance (small sample)
  Confidence interval: $\dfrac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$
  Hypotheses: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 \ne \sigma_0^2$
  Test ratio: $\chi^2 = \dfrac{(n-1)s^2}{\sigma_0^2}$
  Critical value: $s^2_{cv} = \dfrac{\chi^2\,\sigma_0^2}{n-1}$, with $\chi^2$ taken at $\alpha/2$ and $1-\alpha/2$

Interval for: Variance (large sample)
  Confidence interval: $\sigma = \dfrac{s\sqrt{2DF}}{\pm z_{\alpha/2} + \sqrt{2DF}}$
  Hypotheses: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 \ne \sigma_0^2$
  Test ratio: $z = \sqrt{2\chi^2} - \sqrt{2DF - 1}$
  Critical value: $s_{cv} = \dfrac{\sigma_0\sqrt{2DF}}{\pm z_{\alpha/2} + \sqrt{2DF}}$

Test Ratio: Remember that $n = 40$, $s^2 = \dfrac{\sum x^2 - n\bar x^2}{n-1} = 29.45974353$, $s = \sqrt{29.45974353} = 5.42768$ and $\alpha = .10$. Degrees of freedom are $DF = n - 1 = 40 - 1 = 39$. This is a 2-sided test of $H_0: \sigma = 10$ and $H_1: \sigma \ne 10$. You cannot find $\chi^2$ for 39 degrees of freedom on your table, so use $z = \sqrt{2\chi^2} - \sqrt{2DF - 1}$. Here $\sigma_0^2 = 10^2 = 100$, so $\chi^2 = \dfrac{(n-1)s^2}{\sigma_0^2} = \dfrac{39(29.45974353)}{100} = 11.4893$ and $z = \sqrt{2(11.4893)} - \sqrt{2(39) - 1} = \sqrt{22.9786} - \sqrt{77} = 4.7936 - 8.7750 = -3.9814$. Your diagram would show a Normal curve with a center at zero and 'reject' zones above $z_{\alpha/2} = z_{.05} = 1.645$ and below $-z_{\alpha/2} = -z_{.05} = -1.645$. Since $-3.9814$ is in the lower 'reject' zone, reject $H_0$.

Confidence Interval: $s = 5.42768$ and $DF = n - 1 = 40 - 1 = 39$. The table says $\sigma = \dfrac{s\sqrt{2DF}}{\pm z_{\alpha/2} + \sqrt{2DF}}$. So $\sqrt{2DF} = \sqrt{2(39)} = \sqrt{78} = 8.8318$. This means $\sigma = \dfrac{5.42768(8.8318)}{\pm 1.645 + 8.8318}$, that is, $\dfrac{47.9362}{10.4768}$ or $\dfrac{47.9362}{7.1868}$. This gives us $4.575 \le \sigma \le 6.670$, which contradicts $H_0: \sigma = 10$, so assuming that the population standard deviation was 10 was not a good idea.

252grass2-051 2/09/05: Data for Problem 2 (the same 40 observations tabulated in the solution to Problem 2).
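The Normal-table steps in Problems 3, 4 and 5 can be reproduced with the standard library alone. The sketch below is an illustration under stated assumptions: the Normal CDF is computed with `math.erf` instead of being read from a table, so results differ from the table-based figures in the third or fourth decimal; the variance check uses $\sigma_0^2 = 10^2 = 100$; and all variable names are made up for the example.

```python
import math


def phi(z):
    """Standard Normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


z_05 = 1.645  # Normal table value with 5% in one tail

# Problem 3: right-tail z test of the mean with sigma = 10 assumed known.
z3 = (22.50025 - 20.0) / (10.0 / math.sqrt(40))
p3 = 1.0 - phi(z3)
decisions3 = {alpha: p3 < alpha for alpha in (0.01, 0.05, 0.10)}

# Problem 4: left-tail z test of a proportion, then the two-sided version.
n, x, p0 = 53, 13, 0.35
p_bar = x / n
sigma_p = math.sqrt(p0 * (1.0 - p0) / n)
z4 = (p_bar - p0) / sigma_p
p4_one_sided = phi(z4)
p4_two_sided = 2.0 * p4_one_sided

# Problem 5a: critical value below p0 and one-sided upper confidence bound.
p_cv = p0 - z_05 * sigma_p
s_p = math.sqrt(p_bar * (1.0 - p_bar) / n)
p_upper = p_bar + z_05 * s_p

# Problem 5b: large-sample test and interval for sigma (H0: sigma = 10).
df, s2, sigma0 = 39, 29.45974353, 10.0
chi2 = df * s2 / sigma0 ** 2
z5 = math.sqrt(2.0 * chi2) - math.sqrt(2.0 * df - 1.0)
root = math.sqrt(2.0 * df)
sigma_low = math.sqrt(s2) * root / (z_05 + root)
sigma_high = math.sqrt(s2) * root / (-z_05 + root)

print(f"Problem 3: z = {z3:.4f}, p-value = {p3:.4f}, reject? {decisions3}")
print(f"Problem 4: z = {z4:.3f}, one-sided p = {p4_one_sided:.4f}, "
      f"two-sided p = {p4_two_sided:.4f}")
print(f"Problem 5a: p_cv = {p_cv:.4f}, upper bound = {p_upper:.4f}")
print(f"Problem 5b: chi2 = {chi2:.4f}, z = {z5:.4f}, "
      f"{sigma_low:.3f} <= sigma <= {sigma_high:.3f}")
```

In Problem 5a the critical-value method uses $\sigma_{\bar p}$ (built from $p_0$) while the confidence interval uses $s_{\bar p}$ (built from $\bar p$), which is how the two can disagree when $\bar p$ falls close to the critical value.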