252solngr2-071 2/19/07
ECO 252 Second Graded Assignment
R. E. Bove
Name:
Class days and time:
Student Number:

Please include class and student number on what you hand in! Papers should be stapled. Your writeup should state clearly what you did and concluded.

Problem 1: Which of the following could be a null hypothesis? Which could be an alternative hypothesis? Which could be neither? Why?
(i) p 3, (ii) p .3, (iii) p .3, (iv) 3, (v) 3, (vi) 3, (vii) s 3, (viii) 3, (ix) 5, (x) p .3, (xi) 3, (xii) 0, (xiii) p .5.

Problem 2: A rental firm believes that the average time that a backhoe is rented is 4.2 days. In order to verify this statement, a sample of rental records is taken, with the following results.
4, 2, 4, 3, 6, 2, 3, 2, 5, 3, 2, 4, 2, 2
You presumably know that $n = 14$, $\bar x = 3.1429$, and $\sigma_x = 1 + .1x$. ($x$ is the last digit of your student number.) Test the hypothesis that the mean is 4.2. Assume that the confidence level is 95%.
a) State your null and alternative hypotheses.
b) Find critical values for the sample mean and test the hypothesis. Show your reject region on a diagram.
c) Find a confidence interval for the sample mean and test the hypothesis. Show your results on a diagram.
d) Use a test ratio for a test of the sample mean. Show your reject region on a diagram.
e) Find a p-value for the null hypothesis using the Normal table and use the p-value to test the hypothesis.

Problem 3: (Keller & Warrack) A random sample of 11 young adult men was asked how many minutes of sports they watched daily. Results are below.
50 48 65 74 66 30 40 60 60 60 50
Personalize the data as follows: add the digits of your student number to the last six numbers. Example: Ima Badrisk has the student number 123456; so the last six numbers become {31, 42, 63, 64, 65, 56}. Compute the sample standard deviation using the computational formula (if you don’t know what that means, find out!). (If you did this correctly on the last assignment, just copy your work.)
Test the hypothesis that average time watching sports is below 70 minutes.
a) Test the validity of the hypothesis using a confidence level of 95% and a critical value for the sample mean. (You cannot test the validity of a hypothesis that you haven’t stated!)
b) Find an approximate p-value for the statement.
c) Will we reject the hypothesis at a significance level of (i) .001? (ii) .01? (iii) .10? Using the p-value explain why.

Note that none of the problems beyond this point involve sample means.

Problem 4: (Ken Black) It is generally believed that 79% of US companies offer flexible scheduling. A survey of 415 accounting firms finds that $300 + x$ have flexible scheduling. ($x$ is the 2nd to last digit of your student number.) Test the hypothesis that the proportion of accounting firms that offer flexible hours is below that for other US firms.
a) State your null and alternative hypotheses.
b) Find a test ratio for a test of the proportion.
c) Find a p-value for this ratio and use it to test the hypothesis at a 5% significance level.

Extra Credit Problem 5:
a) Finish problem 4 by finding an appropriate confidence interval for the proportion and showing whether it contradicts the null hypothesis.
b) Use the data in problem 3 to test the hypothesis $\sigma = 10$.
c) Use Minitab to check your answer to problem 4. Do this three ways.
First: Enter Minitab. Use the Editor pull-down menu to enable commands. Then enter the commands below.
    Pone 415 300+x;    (Replace 300+x with the number you used.)
    Test 0.79;
    Alter -1;          (Makes H1 ‘less than.’)
    useZ.              (Uses normal approx. to binomial)
Second: Use the Stat pull-down menu. Choose ‘Basic Stat’ and then ‘1 proportion.’ Check ‘summarized data’ and enter your n and 300 + x. Press Options. Set ‘test proportion’ as 0.79, alternative hypothesis as ‘less than’ and check ‘Normal distribution.’ Go.
Third: Use the pull-down menu again. But before you start put 10 + x yesses and 50 − x noes in column 1.
Uncheck ‘summarized data’ and let Minitab know that the data are in column 1 (C1). Other options are unchanged.

Problem 1: Which of the following could be a null hypothesis? Which could be an alternative hypothesis? Which could be neither? Why?
(i) p 3, (ii) p .3, (iii) p .3, (iv) 3, (v) 3, (vi) 3, (vii) s 3, (viii) 3, (ix) 5, (x) p .3, (xi) 3, (xii) 0, (xiii) p .5.

Solution: Remember the following:
α) Only parameters of the population, such as the population mean $\mu$, proportion $p$, variance $\sigma^2$, standard deviation $\sigma$ and median, can be in a hypothesis; $\bar x$, $\bar p$, $s^2$, $s$ and $x_{.50}$ (the sample mean, proportion, variance, standard deviation and median) are statistics computed from sample data and cannot be in a hypothesis because a hypothesis is a statement about a population;
β) The null hypothesis must contain an equality;
γ) $p$ must be between zero and one;
δ) A variance or standard deviation cannot be negative.

(i) p 3 could not be $H_0$ or $H_1$ since it contains an unreasonable value for a parameter.
(ii) p .3 could not be $H_0$ or $H_1$ because $\bar p$, the sample proportion, is a sample statistic, not a parameter.
(iii) p .3 could be $H_0$ since it contains a parameter and an equality. $H_1$ would be p .3.
(iv) 3 could be $H_0$ since it contains a parameter and an equality. $H_1$ would be 3.
(v) 3 can’t be either $H_0$ or $H_1$ since a population standard deviation cannot be negative.
(vi) 3 can be $H_0$ since it contains a parameter and an equality. $H_1$ would be 3.
(vii) s 3 can’t be either $H_0$ or $H_1$ since the sample standard deviation is not a parameter.
(viii) 3 can be $H_0$ since it contains a parameter and an equality. $H_1$ would be 3.
(ix) 5 can be $H_1$ since it contains a parameter and a strict inequality. $H_0$ would be 5.
(x) p .3 could not be $H_0$ or $H_1$ since $p$, a parameter, cannot take values below zero or above 1.
(xi) 3 could not be $H_0$ or $H_1$ since a standard deviation, a parameter, cannot take values below zero.
(xii) 0 could be $H_0$ since it contains a parameter and an equality. $H_1$ would be 0. However, note that this $H_0$ says that x is a constant.
(xiii) p .5 can’t be either $H_0$ or $H_1$ since the sample proportion is not a parameter.

Learn to make $\mu$ and call it ‘mu.’ It’s not a ‘u’ and you are too young to be unable to adjust to using a Greek letter!

Problem 2: A rental firm believes that the average time that a backhoe is rented is 4.2 days. In order to verify this statement, a sample of rental records is taken, with the following results.
4, 2, 4, 3, 6, 2, 3, 2, 5, 3, 2, 4, 2, 2
You presumably know that $n = 14$, $\bar x = 3.1429$, and $\sigma_x = 1 + .1x$. ($x$ is the last digit of your student number.) Test the hypothesis that the mean is 4.2. Assume that the confidence level is 95%.
a) State your null and alternative hypotheses.
b) Find critical values for the sample mean and test the hypothesis. Show your reject region on a diagram.
c) Find a confidence interval for the sample mean and test the hypothesis. Show your results on a diagram.
d) Use a test ratio for a test of the sample mean. Show your reject region on a diagram.
e) Find a p-value for the null hypothesis using the Normal table and use the p-value to test the hypothesis.

Solution: a) $H_0: \mu = 4.2$, $H_1: \mu \ne 4.2$. From the problem statement $\mu_0 = 4.2$, $\bar x = 3.1429$, $\sigma = 1$ to $1.9$, $n = 14$ and $\alpha = .05$. According to Table 3 or the outline, if we wish to test $H_0: \mu = \mu_0$ against $H_1: \mu \ne \mu_0$ and $\sigma$ is known, use
Test Ratio $z = \frac{\bar x - \mu_0}{\sigma_{\bar x}}$,
Critical Value $\bar x_{cv} = \mu_0 \pm z_{\alpha/2}\,\sigma_{\bar x}$ or
Confidence Interval $\mu = \bar x \pm z_{\alpha/2}\,\sigma_{\bar x}$,
where $\sigma_{\bar x} = \frac{\sigma}{\sqrt{n}}$ and $z_{\alpha/2} = z_{.025} = 1.960$. The standard error is (i) $\sigma_{\bar x} = \frac{1}{\sqrt{14}} = \sqrt{\frac{1}{14}} = 0.2673$ to (ii) $\sigma_{\bar x} = \frac{1.9}{\sqrt{14}} = \sqrt{\frac{3.61}{14}} = 0.5078$.

b) Critical Values: (i) $\bar x_{cv} = \mu_0 \pm z_{\alpha/2}\,\sigma_{\bar x} = 4.2 \pm 1.960(0.2673) = 4.2 \pm 0.5239$ or 3.6761 to 4.7239. Make a diagram! Show a normal curve with a mean at 4.2 and shaded rejection regions below 3.6761 and above 4.7239. (ii) $\bar x_{cv} = \mu_0 \pm z_{\alpha/2}\,\sigma_{\bar x} = 4.2 \pm 1.960(0.5078) = 4.2 \pm 0.9953$ or 3.2047 to 5.1953. Make a diagram!
Show a normal curve with a mean at 4.2 and shaded rejection regions below 3.2047 and above 5.1953. Since $\bar x = 3.1429$ is in the lower rejection region in both cases, reject $H_0$.

c) Confidence intervals: (i) $\mu = \bar x \pm z_{\alpha/2}\,\sigma_{\bar x} = 3.1429 \pm 1.960(0.2673) = 3.1429 \pm 0.5239$ or 2.6190 to 3.6668. Make a diagram! Show a normal curve with a mean at 4.2 and shade the confidence interval between 2.6190 and 3.6668. (ii) $\mu = \bar x \pm z_{\alpha/2}\,\sigma_{\bar x} = 3.1429 \pm 1.960(0.5078) = 3.1429 \pm 0.9953$ or 2.1476 to 4.1382. Make a diagram! Show a normal curve with a mean at 4.2 and shade the confidence interval between 2.1476 and 4.1382. Since $\mu_0 = 4.2$ is not on these intervals, reject $H_0$.

d) (i) $z = \frac{\bar x - \mu_0}{\sigma_{\bar x}} = \frac{3.1429 - 4.2}{0.2673} = -3.95$. (ii) $z = \frac{\bar x - \mu_0}{\sigma_{\bar x}} = \frac{3.1429 - 4.2}{0.5078} = -2.08$. Make a diagram! Show a normal curve with a mean at 0 and shaded rejection regions below -1.960 and above 1.960. Since our value of $z$ is in the lower rejection region, reject $H_0$.

e) Since this is a two-sided test, (i) $p\text{-value} = 2P(z \le -3.95) = 2[.5 - P(-3.95 \le z \le 0)] = 2[.5 - .5000] = .0000$. (ii) $p\text{-value} = 2P(z \le -2.08) = 2[.5 - P(-2.08 \le z \le 0)] = 2[.5 - .4812] = .0376$. Since $p\text{-value} < .05$, reject $H_0$.

Problem 3: (Keller & Warrack) A random sample of 11 young adult men was asked how many minutes of sports they watched daily. Results are below.
50 48 65 74 66 30 40 60 60 60 50
Personalize the data as follows: add the digits of your student number to the last six numbers. Example: Ima Badrisk has the student number 123456; so the last six numbers become {31, 42, 63, 64, 65, 56}. Compute the sample standard deviation using the computational formula (if you don’t know what that means, find out!). (If you did this correctly on the last assignment, just copy your work.) Test the hypothesis that average time watching sports is below 70 minutes.
a) Test the validity of the hypothesis using a confidence level of 95% and a critical value for the sample mean. (You cannot test the validity of a hypothesis that you haven’t stated!)
b) Find an approximate p-value for the statement.
c) Will we reject the hypothesis at a significance level of (i) .001?
(ii) .01? (iii) .10? Using the p-value explain why.

Note that none of the problems beyond this point involve sample means.

Solution: Two data sets for computations of the variance are shown here. They will be referred to as solution 1 and solution 2. The first represents a student number of 000000 and the second 999999.

    index     x1     x1^2      x2     x2^2
      1       50     2500      50     2500
      2       48     2304      48     2304
      3       65     4225      65     4225
      4       74     5476      74     5476
      5       66     4356      66     4356
      6       30      900      39     1521
      7       40     1600      49     2401
      8       60     3600      69     4761
      9       60     3600      69     4761
     10       60     3600      69     4761
     11       50     2500      59     3481
    Sum      603    34661     657    40547

$n_1 = n_2 = 11$, $\sum x_1 = 603$, $\sum x_1^2 = 34661$, $\sum x_2 = 657$ and $\sum x_2^2 = 40547$. The means are $\bar x_1 = \frac{\sum x_1}{n} = \frac{603}{11} = 54.8182$ and $\bar x_2 = \frac{\sum x_2}{n} = \frac{657}{11} = 59.7273$.

$s_1^2 = \frac{\sum x_1^2 - n\bar x_1^2}{n-1} = \frac{34661 - 11(54.8182)^2}{10} = \frac{1605.61}{10} = 160.56$, $s_1 = \sqrt{160.56} = 12.671$, $s_{\bar x_1} = \sqrt{\frac{s_1^2}{n}} = \sqrt{\frac{160.56}{11}} = \sqrt{14.5965} = 3.821$.

$s_2^2 = \frac{\sum x_2^2 - n\bar x_2^2}{n-1} = \frac{40547 - 11(59.7273)^2}{10} = \frac{1306.15}{10} = 130.61$, $s_2 = \sqrt{130.61} = 11.429$, $s_{\bar x_2} = \sqrt{\frac{s_2^2}{n}} = \sqrt{\frac{130.61}{11}} = \sqrt{11.874} = 3.446$.

$H_0: \mu \ge 70$, $H_1: \mu < 70$.

a) From Table 3, $\bar x_{cv} = \mu_0 \pm t^{(n-1)}_{\alpha/2}\, s_{\bar x}$ is the formula for a two sided critical value when the population standard deviation is unknown. $\alpha = 1 - .95 = .05$, $n - 1 = 10$ and $t^{(n-1)}_{\alpha/2} = t^{(10)}_{.025} = 2.228$, but we want a critical value below 70, so use $t^{(10)}_{.05} = 1.812$.
Solution 1: $\bar x_{cv} = \mu_0 - t^{(n-1)}_{\alpha}\, s_{\bar x} = 70 - 1.812(3.821) = 70 - 6.9237 = 63.0763$. Make a diagram. Show an almost-Normal curve with a mean at 70 and a ‘reject’ region below 63.0763. Since $\bar x_1 = 54.8182$ falls in the ‘reject’ region, reject the null hypothesis.
Solution 2: $\bar x_{cv} = \mu_0 - t^{(n-1)}_{\alpha}\, s_{\bar x} = 70 - 1.812(3.446) = 70 - 6.2442 = 63.7558$. Make a diagram. Show an almost-Normal curve with a mean at 70 and a ‘reject’ region below 63.7558. Since $\bar x_2 = 59.7273$ falls in the ‘reject’ region, reject the null hypothesis.

b) From Table 3, $t_{calc} = \frac{\bar x - \mu_0}{s_{\bar x}}$. Since this is a left-sided test we want $P(t \le t_{calc})$. Solution 1: $t_{calc} = \frac{54.8182 - 70}{3.821} = -3.973$. According to the t-table with 10 degrees of freedom, $P(t \le -4.144) = P(t \ge 4.144) = .001$, $P(t \le -3.169) = P(t \ge 3.169) = .005$ and $P(t \le -2.764) = P(t \ge 2.764) = .01$.

c) Since our calculated $t$ is between the first two values, we can say $.001 < p\text{-value} < .005$. Since this is below 10% and below 1%, we reject the null hypothesis for significance levels of 10% and 1%, but not for 0.1%.

Problem 4: (Ken Black) It is generally believed that 79% of US companies offer flexible scheduling. A survey of 415 accounting firms finds that $300 + x$ have flexible scheduling. ($x$ is the 2nd to last digit of your student number.) Test the hypothesis that the proportion of accounting firms that offer flexible hours is below that for other US firms.
a) State your null and alternative hypotheses.
b) Find a test ratio for a test of the proportion.
c) Find a p-value for this ratio and use it to test the hypothesis at a 5% significance level.

Solution: a) $H_0: p \ge .79$, $H_1: p < .79$.
b) $z = \frac{\bar p - p_0}{\sigma_{\bar p}}$, where $\sigma_{\bar p} = \sqrt{\frac{p_0 q_0}{n}} = \sqrt{\frac{.79(.21)}{415}} = \sqrt{.00039976} = .01999$. The sample proportion runs from $\bar p = \frac{300}{415} = .7229$ to $\bar p = \frac{309}{415} = .7446$. Thus $z = \frac{.7229 - .79}{.01999} = -3.36$ to $z = \frac{.7446 - .79}{.01999} = -2.27$.
c) $P(z \le -3.36) = .5 - .4996 = .0004$ and $P(z \le -2.27) = .5 - .4884 = .0116$. Since these are both below .05, they would result in a rejection of the null hypothesis.

Extra Credit Problem 5: a) Finish problem 4 by finding an appropriate confidence interval for the proportion and showing whether it contradicts the null hypothesis.

Solution: The two sided interval for a proportion is $p = \bar p \pm z_{\alpha/2}\, s_{\bar p}$, where $s_{\bar p} = \sqrt{\frac{\bar p \bar q}{n}}$ and $\bar q = 1 - \bar p$. We know that $\bar p = .7229$ to $\bar p = .7446$. For a one-sided test, $p \le \bar p + z_{\alpha}\, s_{\bar p}$, where $z_{\alpha} = z_{.05} = 1.645$. For the first value $\bar q = 1 - \bar p = 1 - .7229 = .2771$, so that $s_{\bar p} = \sqrt{\frac{.7229(.2771)}{415}} = \sqrt{.0004827} = .02197$. The interval is thus $p \le .7229 + 1.645(.02197) = .7590$. For the second value $\bar q = 1 - \bar p = 1 - .7446 = .2554$, so that $s_{\bar p} = \sqrt{\frac{.7446(.2554)}{415}} = \sqrt{.0004582} = .02141$. The interval is thus $p \le .7446 + 1.645(.02141) = .7798$. No matter which of these is true, saying $p \le .7590$ or $p \le .7798$ contradicts the null hypothesis $H_0: p \ge .79$.

b) Use the data in problem 3 to test the hypothesis $\sigma = 10$.
Solution: The test ratio for this problem is $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$, and for 10 degrees of freedom and a two sided 5% test, $\chi^{2(10)}_{.025} = 20.4832$ and $\chi^{2(10)}_{.975} = 3.2470$. The two rejection zones are below 3.2470 and above 20.4832. We had two solutions.

$s_1^2 = \frac{\sum x_1^2 - n\bar x_1^2}{n-1} = \frac{34661 - 11(54.8182)^2}{10} = \frac{1605.61}{10} = 160.56$ or $s_1 = \sqrt{160.56} = 12.671$. This means that our calculated test ratio is $\chi^2_{calc} = \frac{(n-1)s^2}{\sigma_0^2} = \frac{10(160.56)}{100} = 16.06$.

For $s_2^2 = \frac{\sum x_2^2 - n\bar x_2^2}{n-1} = \frac{40547 - 11(59.7273)^2}{10} = \frac{1306.15}{10} = 130.61$ or $s_2 = \sqrt{130.61} = 11.429$, $\chi^2_{calc} = \frac{(n-1)s^2}{\sigma_0^2} = \frac{10(130.61)}{100} = 13.06$. Since neither of these is in the ‘reject’ zone described above, do not reject the null hypothesis.

We, of course, can do this as a confidence interval or a critical value. Since this is a 2-sided test, the confidence interval has the form $\frac{(n-1)s^2}{\chi^2_{.5\alpha}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-.5\alpha}}$. For the first solution, this would give $\frac{10(160.56)}{20.4832} \le \sigma^2 \le \frac{10(160.56)}{3.2470}$ or, taking square roots, $8.85 \le \sigma \le 22.24$, an interval that includes 10.

If we want critical values, use $s^2_{cv} = \frac{\chi^2 \sigma_0^2}{n-1}$ with $\chi^2_{.5\alpha}$ and $\chi^2_{1-.5\alpha}$. The two critical values would be $s^2_{cv} = \frac{20.4832(10)^2}{10} = 204.83$ and $s^2_{cv} = \frac{3.2470(10)^2}{10} = 32.47$. Since both values of $s^2$ fall within these critical values, we cannot reject the null hypothesis.

c) Use Minitab to check your answer to problem 4. Do this three ways.
First: Enter Minitab. Use the Editor pull-down menu to enable commands. Then enter the commands below.
    Pone 415 300+x;    (Replace 300+x with the number you used.)
    Test 0.79;
    Alter -1;          (Makes H1 ‘less than.’)
    useZ.              (Uses normal approx. to binomial)
Second: Use the Stat pull-down menu. Choose ‘Basic Stat’ and then ‘1 proportion.’ Check ‘summarized data’ and enter your n and 300 + x. Press Options. Set ‘test proportion’ as 0.79, alternative hypothesis as ‘less than’ and check ‘Normal distribution.’ Go.
Third: Use the pull-down menu again. But before you start put 10 + x yesses and 50 − x noes in column 1. Uncheck ‘summarized data’ and let Minitab know that the data are in column 1 (C1).
Other options are unchanged.

————— 2/16/2007 6:30:04 PM ————————————————————
Welcome to Minitab, press F1 for help.

MTB > pone 415 300;
SUBC> test 0.79;
SUBC> alter -1;
SUBC> usez.                 #Done using commands

Test and CI for One Proportion
Test of p = 0.79 vs p < 0.79

                                  95% Upper
Sample    X    N  Sample p        Bound  Z-Value  P-Value
1       300  415  0.722892     0.759030    -3.36    0.000
#Same pvalue as I got for x=300.

MTB > POne 415 309;
SUBC> Test 0.79;
SUBC> Alternative -1;
SUBC> UseZ.                 #Done using menu

Test and CI for One Proportion
Test of p = 0.79 vs p < 0.79

                                  95% Upper
Sample    X    N  Sample p        Bound  Z-Value  P-Value
1       309  415  0.744578     0.779790    -2.27    0.012
#Same pvalue as I got for x=309.

MTB > print c1

Data Display
C1
y n n n y n n n y n n n y n n y n n y n
n y n n y n n y n n y n n y n n n n n n
n n n n n n n n n n n n n n n n n n n n

MTB > POne C1;
SUBC> Test 0.79;
SUBC> Alternative -1;
SUBC> UseZ.

Test and CI for One Proportion: C1
Test of p = 0.79 vs p < 0.79
Event = y

                                  95% Upper
Variable   X   N  Sample p        Bound  Z-Value  P-Value
C1        11  60  0.183333     0.265500   -11.54    0.000
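
The Minitab results above can also be checked by hand or with a few lines of code. What follows is only a minimal sketch in Python, not part of the original assignment. It assumes the same normal approximation that UseZ requests, with the standard error computed from $p_0$ for the test ratio and from the sample proportion for the one-sided upper bound. The function name one_proportion_z_test is hypothetical and was chosen for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(x, n, p0, alpha=0.05):
    """Left-tailed z test of H0: p >= p0 against H1: p < p0 (illustrative helper)."""
    p_bar = x / n                              # sample proportion
    sigma_p = sqrt(p0 * (1 - p0) / n)          # standard error assuming H0 is true
    z = (p_bar - p0) / sigma_p                 # test ratio
    p_value = NormalDist().cdf(z)              # P(Z <= z) for a 'less than' alternative
    s_p = sqrt(p_bar * (1 - p_bar) / n)        # estimated standard error for the interval
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # about 1.645 when alpha = .05
    upper = p_bar + z_alpha * s_p              # one-sided upper confidence bound
    return p_bar, z, p_value, upper


if __name__ == "__main__":
    # The three cases run in the Minitab session: x = 300, x = 309, and 11 yesses out of 60.
    for x, n in ((300, 415), (309, 415), (11, 60)):
        p_bar, z, p_value, upper = one_proportion_z_test(x, n, 0.79)
        print(f"x={x} n={n} p_bar={p_bar:.6f} upper={upper:.6f} z={z:.2f} p-value={p_value:.4f}")
```

Rejecting $H_0$ whenever the printed p-value is below .05, or whenever the upper bound falls below .79, reproduces the two ways of testing used in Problems 4 and 5a.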