252grass1 4/04/00
ECO 252 R. E. Bove

Graded Assignment 1

Problem 1: Using the computational formula, find the sample variance of the following data and use it to compute a 95% confidence interval for the mean.

x: 5, -7, 11, 13, 15, 12, 11, 3

Solution:

Row     x    x^2
1       5     25
2      -7     49
3      11    121
4      13    169
5      15    225
6      12    144
7      11    121
8       3      9
Sum    63    863

So \(n = 8\), \(\sum x = 63\) and \(\sum x^2 = 863\).

\[ \bar{x} = \frac{\sum x}{n} = \frac{63}{8} = 7.875 \]

\[ s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{863 - 8(7.875)^2}{7} = \frac{366.875}{7} = 52.4107 \]

\[ s = \sqrt{52.4107} = 7.2395, \qquad s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{7.2395}{\sqrt{8}} = 2.5596 \]

The population variance is unknown, so we must use t instead of z. The general formula for a confidence interval is \(\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\, s_{\bar{x}}\). Here the degrees of freedom are \(n - 1 = 8 - 1 = 7\) and, since the confidence level is 95%, \(1 - \alpha = .95\) or \(\alpha = .05\). Also \(t^{(n-1)}_{\alpha/2} = t^{(7)}_{.025} = 2.365\). Putting this all together (using the formula from Table 3 of the syllabus supplement or from the outline), we find

\[ \mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\, s_{\bar{x}} = 7.875 \pm 2.365(2.5596) = 7.875 \pm 6.0535 \]

or 1.8215 to 13.9285. More formally, \(P(1.8215 \le \mu \le 13.9285) = .95\).

Problem 2: Find a symmetrical interval about the mean for the distribution \(x \sim N(5, 6)\) with a probability of 84%.

Solution: We first solve this problem for z. From the diagram we can see that we want the point \(z_{.08}\), since each tail outside the interval has probability \((1 - .84)/2 = .08\). This is the point with a probability of .08 above it. Also from the diagram \(P(0 \le z \le z_{.08}) = .5 - .08 = .4200\). The closest we can come to this probability using the Normal table is \(P(0 \le z \le 1.41) = .4207\). But \(P(0 \le z \le 1.40) = .4192\) is almost as close. Either 1.40 or 1.41 would be acceptable, so \(z_{.08} = 1.405\) might be a good compromise. Thus

\[ x = \mu \pm z_{.08}\,\sigma = 5 \pm 1.405(6) = 5 \pm 8.43 \]

or -3.43 to 13.43.

Problem 3: Using the value of z that you used in Problem 2, find an 84% confidence interval for the mean if \(\bar{x} = 7.035\), \(n = 16\) and \(\sigma = 1.7\).

Solution: The confidence level is 84%, so the significance level is 16%. This time \(\sigma = 1.7\) is known, so we can use \(z_{\alpha/2} = z_{.08} = 1.405\). The standard error is

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{1.7}{\sqrt{16}} = 0.425 \]

and (using the formula from Table 3 of the syllabus supplement or from the outline)

\[ \mu = \bar{x} \pm z_{\alpha/2}\, \sigma_{\bar{x}} = 7.035 \pm 1.405(0.425) = 7.035 \pm 0.5971 \]

or 6.4379 to 7.6321.
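The arithmetic in Problems 1 and 3 of Graded Assignment 1 can be cross-checked with a short script. This is only a minimal sketch: the helper name `interval` is hypothetical, and the critical values 2.365 and 1.405 are copied from the solutions (that is, from the printed t and Normal tables) rather than computed.

```python
import math

def interval(center, critical_value, standard_error):
    """Return (lower, upper) for center +/- critical_value * standard_error."""
    margin = critical_value * standard_error
    return center - margin, center + margin

# Problem 1: sample variance by the computational formula.
x = [5, -7, 11, 13, 15, 12, 11, 3]
n = len(x)
sum_x = sum(x)                    # sum of the observations
sum_x2 = sum(v * v for v in x)    # sum of the squared observations
mean = sum_x / n

# s^2 = (sum of x^2 - n * mean^2) / (n - 1)
variance = (sum_x2 - n * mean ** 2) / (n - 1)
std_err = math.sqrt(variance) / math.sqrt(n)

# Assumption: t for 7 degrees of freedom and alpha/2 = .025, taken from the table.
T_CRIT = 2.365
ci_t = interval(mean, T_CRIT, std_err)

# Problem 3: sigma is known, so z is used.
# Assumption: z_.08 = 1.405 is the compromise table value chosen in Problem 2.
Z_CRIT = 1.405
ci_z = interval(7.035, Z_CRIT, 1.7 / math.sqrt(16))

print(f"mean = {mean}, s^2 = {variance:.4f}, s_xbar = {std_err:.4f}")
print(f"Problem 1 interval: {ci_t[0]:.4f} to {ci_t[1]:.4f}")
print(f"Problem 3 interval: {ci_z[0]:.4f} to {ci_z[1]:.4f}")
```

Because the script carries full precision instead of rounding at each step, its endpoints may differ from the hand-computed ones in the fourth decimal place.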
252grass2 4/04/00

Graded Assignment 2

Problem 1: Which of the following could be a null hypothesis?
\(\mu < 3\), \(\mu \le 3\), \(\mu > 3\), \(\mu \ge 3\), \(\mu = 3\), \(\mu \ne 3\), \(\bar{x} < 3\), \(\bar{x} \le 3\), \(\bar{x} > 3\), \(\bar{x} \ge 3\), \(\bar{x} = 3\), \(\bar{x} \ne 3\)

Solution: Only \(\mu \le 3\), \(\mu \ge 3\) and \(\mu = 3\), because (i) \(H_0\) only concerns parameters of the population and (ii) \(H_0\) must contain an equality.

Problem 2: A man walks into a bar. He drinks 15 bottles of beer. These bottles are supposed to contain 12 ounces of beer with a population standard deviation of 0.2 ounces. On the basis of the man's condition when he leaves the bar, we conclude that the sample mean for the bottles was 11.80 ounces. Test the hypothesis that the population mean for these bottles was 12 ounces. Assume that the confidence level is 95%.
a) State your null and alternative hypotheses.
b) Find critical values for the sample mean and test the hypothesis.
c) Find a confidence interval for the population mean and test the hypothesis.
d) Use a test ratio for a test of the sample mean, find its p-value and use the p-value to test the hypothesis.

Solution:
a) \(H_0: \mu = 12\), \(H_1: \mu \ne 12\). From the problem statement \(\mu_0 = 12\), \(\bar{x} = 11.80\), \(\sigma = 0.2\), \(n = 15\) and \(\alpha = .05\). According to Table 3 or the outline, if we wish to test \(H_0: \mu = \mu_0\) against \(H_1: \mu \ne \mu_0\) and \(\sigma\) is known, use

Test Ratio: \(z = \dfrac{\bar{x} - \mu_0}{\sigma_{\bar{x}}}\)
Critical Value: \(\bar{x}_{cv} = \mu_0 \pm z_{\alpha/2}\, \sigma_{\bar{x}}\)
Confidence Interval: \(\mu = \bar{x} \pm z_{\alpha/2}\, \sigma_{\bar{x}}\)

where \(\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{0.2}{\sqrt{15}} = 0.05164\) and \(z_{\alpha/2} = z_{.025} = 1.960\).

b) \(\bar{x}_{cv} = \mu_0 \pm z_{\alpha/2}\, \sigma_{\bar{x}} = 12 \pm 1.960(0.05164) = 12 \pm 0.101\) or 11.899 to 12.101. Since \(\bar{x} = 11.80\) is not on this interval, reject \(H_0\).

c) \(\mu = \bar{x} \pm z_{\alpha/2}\, \sigma_{\bar{x}} = 11.80 \pm 1.960(0.05164) = 11.80 \pm 0.101\) or 11.699 to 11.901. Since \(\mu_0 = 12\) is not on this interval, reject \(H_0\).

d) \(z = \dfrac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \dfrac{11.80 - 12}{0.05164} = -3.87\). Since this is a two-sided test,

\[ p\text{-value} = 2P(z \le -3.87) = 2[.5 - P(-3.87 \le z \le 0)] = 2[.5 - .4999] = .0002 . \]

Since p-value \(< \alpha = .05\), reject \(H_0\).

Problem 3: (Dummeldinger) According to the Chronicle of Higher Education Almanac, the average tuition (plus fees) for a private college in 1992 was $9083. A survey of 40 schools gave a sample mean of $9750 and a sample standard deviation of $1750.
a) Test the validity of the Almanac's statement using a confidence level of 95%.
b) Find an approximate p-value for the statement.

Solution:
a) \(H_0: \mu = 9083\), \(H_1: \mu \ne 9083\). From the problem statement \(\mu_0 = 9083\), \(\bar{x} = 9750\), \(s = 1750\), \(n = 40\) and \(\alpha = .05\). According to Table 3 or the outline, if we wish to test \(H_0: \mu = \mu_0\) against \(H_1: \mu \ne \mu_0\) and \(\sigma\) is unknown, use

Test Ratio: \(t = \dfrac{\bar{x} - \mu_0}{s_{\bar{x}}}\)
Critical Value: \(\bar{x}_{cv} = \mu_0 \pm t^{(n-1)}_{\alpha/2}\, s_{\bar{x}}\)
Confidence Interval: \(\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\, s_{\bar{x}}\)

where \(s_{\bar{x}} = \dfrac{s}{\sqrt{n}} = \dfrac{1750}{\sqrt{40}} = 276.70\) and \(t^{(n-1)}_{\alpha/2} = t^{(39)}_{.025} = 2.023\).

Test Ratio: \(t = \dfrac{\bar{x} - \mu_0}{s_{\bar{x}}} = \dfrac{9750 - 9083}{276.70} = 2.411\). Since this is not between \(\pm t^{(n-1)}_{\alpha/2} = \pm 2.023\), reject \(H_0\).

Or Critical Value: \(\bar{x}_{cv} = \mu_0 \pm t^{(n-1)}_{\alpha/2}\, s_{\bar{x}} = 9083 \pm 2.023(276.70) = 9083 \pm 559.8\) or 8523.2 to 9642.8. Since \(\bar{x} = 9750\) is not on this interval, reject \(H_0\).

Or Confidence Interval: \(\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\, s_{\bar{x}} = 9750 \pm 559.8\) or 9190.2 to 10309.8. Since \(\mu_0 = 9083\) is not on this interval, reject \(H_0\).

b) Since this is a two-sided test, p-value \(= 2P(t \ge 2.411)\) (the value of t that we found in the previous part of the problem) for t with \(n - 1 = 39\) degrees of freedom. From the t table \(t^{(39)}_{.025} = 2.023\) and \(t^{(39)}_{.01} = 2.426\). Since 2.411 is between them, \(.01 < P(t \ge 2.411) < .025\) and the p-value satisfies \(.02 < 2P(t \ge 2.411) < .05\). (Since this p-value is below \(\alpha = .05\), reject \(H_0\).)

252grass3 4/14/00

Graded Assignment 3

1) Do Problem 15.59 in Excel as follows.
Use columns A, B, C, and E on the Excel spreadsheet for data.
In the first row of A, B, C and D put in T1, T2, T3, T4.
Now put in the data in columns A, B, C and E, skipping column D.
To fill column D, in cell D2 write =E2. After you press 'enter', this cell should read '12'.
Use the 'edit' pull-down menu and 'copy' cell D2.
Use the 'edit' pull-down menu and 'paste' in cells D3 through D6. Now column D will be identical to E except for the heading.
Save your data as data59.xls.
Use the 'tools' pull-down menu and pick 'data analysis'.
Pick 'ANOVA: Single Factor'.
Set input range to $A$1:$D$6, select 'New worksheet ply' and 'columns', check 'labels in first row', hit 'OK' and save your results as result59.xls.
Take the last digit of your social security number (if it's zero, use 1).
Go back to your original data or use the 'file' pull-down menu to open data59.xls.
To fill column D this time, in cell D2 write =E2-x, replacing x with the last digit of your social security number.
Use the 'edit' pull-down menu and 'copy' cell D2.
Use the 'edit' pull-down menu and 'paste' in cells D3 through D6. Now column D will be less than E by the amount of your value of x.
Save your data as data59a.xls.
Run the ANOVA again and save your results as result59a.xls.
Submit the data and results with your social security number. Indicate what hypotheses were tested, what the p-value was, and whether, using the p-value, you would reject the null if (i) the significance level was 5% and (ii) the significance level was 10%, explaining why. You will have two answers for each of your two problems.
For your second ANOVA do a normal and a Scheffe confidence interval for \(\mu_1 - \mu_4\), using the data in your ANOVA output.

Solution:

Original data

T1    T2    T3    T4
 8     6     9    12
10     9    10    13
 9     8     8    10
10     8    11    11
11     7    12    11

Results

Anova: Single Factor

SUMMARY
Groups   Count   Sum   Average   Variance
T1       5       48     9.6      1.3
T2       5       38     7.6      1.3
T3       5       50    10        2.5
T4       5       57    11.4      1.3

ANOVA
Source of Variation   SS      df   MS         F          P-value    F crit
Between Groups        36.95    3   12.31667   7.697917   0.002095   3.238867
Within Groups         25.6    16    1.6
Total                 62.55   19

So the p-value was .002095 (0.2%), and, since this was less than both 5% and 10%, we would reject the null hypothesis for both significance levels.

One version of modified data. I subtracted 7.
T1    T2    T3    T4
 8     6     9     5
10     9    10     6
 9     8     8     3
10     8    11     4
11     7    12     4

Results

Anova: Single Factor

SUMMARY
Groups   Count   Sum   Average   Variance
T1       5       48     9.6      1.3
T2       5       38     7.6      1.3
T3       5       50    10        2.5
T4       5       22     4.4      1.3

ANOVA
Source of Variation   SS      df   MS         F          P-value    F crit
Between Groups        98.2     3   32.73333   20.45833   1.18E-05   3.238867
Within Groups         25.6    16    1.6
Total                 123.8   19

So the p-value was 0.0000118 (0.001%), and, since this was less than both 5% and 10%, we would reject the null hypothesis for both significance levels.

i. A Single Confidence Interval
If we desire a single interval, we use the formula for the difference between two means when the variance is known. For example, if we want the difference between the means of column 1 and column 2,

\[ \mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t^{(n-m)}_{\alpha/2}\, s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}, \quad \text{where } s = \sqrt{MSW}. \]

From the Excel output, \(\bar{x}_1 = 9.6\), \(\bar{x}_4 = 4.4\), \(n - m = 16\), \(n_1 = n_4 = 5\), \(MSW = 1.6\). If \(\alpha = 0.05\),

\[ \mu_1 - \mu_4 = (9.6 - 4.4) \pm t^{(16)}_{.025} \sqrt{1.6}\sqrt{\frac{1}{5} + \frac{1}{5}} = 5.2 \pm 2.120\sqrt{1.6\left(\frac{1}{5} + \frac{1}{5}\right)} = 5.2 \pm 2.120\sqrt{0.64} = 5.2 \pm 2.120(0.8) = 5.2 \pm 1.70 . \]

ii. Scheffe Confidence Interval
If we desire intervals that will simultaneously be valid for a given confidence level for all possible differences between column means, use

\[ \mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm \sqrt{(m-1)F^{(m-1,\,n-m)}_{\alpha}}\; s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} . \]

From the Excel output, \(\bar{x}_1 = 9.6\), \(\bar{x}_4 = 4.4\), \(m = 4\), \(n - m = 16\), \(n_1 = n_4 = 5\), \(MSW = 1.6\). If \(\alpha = 0.05\),

\[ \mu_1 - \mu_4 = (9.6 - 4.4) \pm \sqrt{3F^{(3,16)}_{.05}}\,\sqrt{1.6}\sqrt{\frac{1}{5} + \frac{1}{5}} = 5.2 \pm \sqrt{3(3.24)(1.6)\left(\frac{1}{5} + \frac{1}{5}\right)} = 5.2 \pm \sqrt{6.2208} = 5.2 \pm 2.49 . \]

2) Do the following problems. Remember: A null hypothesis must contain (i) an equality and (ii) parameters, such as \(\mu\), \(\sigma\), \(p\), \(\Delta\mu\) or \(\Delta p\).

a) If \(\mu_1\) was the average amount spent by pharmaceutical firms in 1998 and \(\mu_2\) was the average amount spent by pharmaceutical firms in 1999 (\(\Delta\mu = \mu_1 - \mu_2\)), was the average amount spent in 1999 above the average amount spent in 1998? Express \(H_0\) and \(H_1\) two ways, in terms of \(\mu_1\) and \(\mu_2\), and in terms of \(\Delta\mu\).

Solution: The problem asks if \(\mu_2\) was greater than \(\mu_1\) (\(\mu_1 < \mu_2\)). Since this is not an equality, it must be the alternative hypothesis. So we have \(H_0: \mu_1 \ge \mu_2\), \(H_1: \mu_1 < \mu_2\). Since \(\Delta\mu = \mu_1 - \mu_2\), it will be negative when
\(\mu_2\) is greater than \(\mu_1\), so we have \(H_0: \Delta\mu \ge 0\), \(H_1: \Delta\mu < 0\).

b) If \(\mu_1\) was the average grade of traditional university students and \(\mu_2\) was the average grade of nontraditional university students (\(\Delta\mu = \mu_1 - \mu_2\)), do these differ? Express \(H_0\) and \(H_1\) two ways, in terms of \(\mu_1\) and \(\mu_2\), and in terms of \(\Delta\mu\).

Solution: The problem asks if \(\mu_2\) was different from \(\mu_1\) (\(\mu_1 \ne \mu_2\)). Since this is not an equality, it must be the alternative hypothesis. So we have \(H_0: \mu_1 = \mu_2\), \(H_1: \mu_1 \ne \mu_2\) or \(H_0: \Delta\mu = 0\), \(H_1: \Delta\mu \ne 0\).

c) If \(p_1\) is the proportion of people who will buy your product after seeing your commercial and \(p_2\) is the proportion that bought your product before seeing the commercial (\(\Delta p = p_1 - p_2\)), does the commercial increase the proportion who will buy your product? Express \(H_0\) and \(H_1\) two ways, in terms of \(p_1\) and \(p_2\), and in terms of \(\Delta p\).

Solution: The problem asks if \(p_1\) was greater than \(p_2\) (\(p_1 > p_2\)). Since this is not an equality, it must be the alternative hypothesis. So we have \(H_0: p_1 \le p_2\), \(H_1: p_1 > p_2\). Since \(\Delta p = p_1 - p_2\), it will be positive when \(p_1\) is greater than \(p_2\), so we have \(H_0: \Delta p \le 0\), \(H_1: \Delta p > 0\).

d) Have new procedures decreased the variability of delivery times? Two samples have been taken and you know two sample standard deviations, \(s_1\), taken before the new procedures were instituted, and \(s_2\), taken afterwards. Express \(H_0\) and \(H_1\).

Solution: The problem asks if \(\sigma_2\) was less than \(\sigma_1\) (\(\sigma_1 > \sigma_2\)). Since this is not an equality, it must be the alternative hypothesis. So we have \(H_0: \sigma_1 \le \sigma_2\), \(H_1: \sigma_1 > \sigma_2\). (We would test \(\dfrac{s_1^2}{s_2^2}\) to see if it is larger than \(F^{(DF_1,\,DF_2)}_{\alpha}\).)

e) If \(DF = 6\) and \(t = 1\), find a range for the p-value in a 2-sided test.

Solution: We must find where 1 would fall on our t table. Our table says that \(t^{(6)}_{.15} = 1.134\) and that \(t^{(6)}_{.20} = 0.906\). So for a 1-sided test, either \(.15 < \text{p-value} < .20\) or \(.80 < \text{p-value} < .85\). But this is a 2-sided test, so we double the smaller of these p-value pairs and say \(.30 < \text{p-value} < .40\).
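The table-based range in part e) can be checked numerically. The sketch below is not part of the original solution: it integrates the Student t density with Simpson's rule, using only the standard library. The helper names `t_pdf` and `central_prob` and the step count are assumptions made for illustration.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def central_prob(t, df, steps=2000):
    """Approximate P(-|t| <= T <= |t|) with composite Simpson's rule.

    The density is symmetric, so the integral over [0, |t|] is doubled.
    steps must be even for Simpson's rule.
    """
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    return 2 * total * h / 3

# Two-sided p-value for t = 1 with 6 degrees of freedom.
p_two_sided = 1 - central_prob(1.0, 6)
print(f"two-sided p-value: {p_two_sided:.4f}")
```

The printed value should fall inside the range .30 to .40 found from the table, which is all the table itself can establish.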