252x0421 3/17/04 (Page layout view!)

ECO252 QBA2
SECOND HOUR EXAM
March 24, 2004

Name
Hour of Class Registered (Circle) 10am 11am

Show your work! Make Diagrams! Exam is normed on 50 points. Answers without reasons are not usually acceptable.

I. (8 points) Do all the following. x ~ N(2, 9) - If you are not using the supplement table, make sure that I know it.

1. P(-27.88 ≤ x ≤ 0)
2. P(1 ≤ x ≤ 16)
3. F(16) (The cumulative probability up to 16)
4. x_.115

II. (24+ points) Do all the following. (2 points each unless noted otherwise.)

Note the following:
1. You will be penalized if you do not compute the sample variance of the xL column in question 1.
2. This test is normed on 50 points, but there are more points (48 plus extra credit!) possible including the take-home. You are unlikely to finish the exam and might want to skip some questions.
3. A table identifying methods for comparing 2 samples is at the end of the exam.
4. If you answer ‘None of the above’ in any question, you should provide an alternative answer and explain why. You may receive credit for this even if you are wrong.

Questions 1-7 refer to Exhibit 1.

Exhibit 1: (Edited from problems presented by Samuel Wathen (for Lind et al. 2002) with one small error corrected) The first two columns below are evaluations of a sample of five products, first at FIFO and, second, at LIFO. Based on the results shown, is LIFO more effective than FIFO in keeping the value of inventory lower? (Assume that the underlying distribution is Normal.)

Product    xF     xL    d = xF - xL     xF^2      xL^2     d^2
1         225    221         4         50625     48841      16
2         119    100        19         14161     10000     361
3         100    113       -13         10000     12769     169
4         212    200        12         44944     40000     144
5         248    245         3         61504     60025       9
Sum       904    879        25        181234    171635     699

Minitab calculated the following sample statistics:

Variable   n    Mean    Median   StDev   SE Mean
xF         5    180.8   212.0    66.7    29.8
xL         5    175.8   200.0    ____
d          5    5.00    4.00     11.98   5.36

1. Compute the standard deviation of xL. You may use any of the material given in Exhibit 1.

2.
What is the null hypothesis?
   a) F L
   b) F L
   c) F L
   d) F L
   e) None of the above.

3. What is (are) the degrees of freedom?
   a) 4
   b) 5
   c) 8
   d) 15
   e) 10

4. If you used the 5% level of significance, what is the appropriate t or z value from the tables?
   a) 2.571
   b) 2.776
   c) 2.262
   d) 2.228
   e) 1.645
   f) 1.960
   g) None of the above.

5. What is the value of your calculated t or z?
   a) 0.933
   b) 2.776
   c) 0.477
   d) 2.028
   e) None of the above.

6. What is your decision at the 5% significance level?
   a) Do not reject the null hypothesis and conclude that LIFO is more effective in keeping the value of the inventory lower.
   b) Reject the null hypothesis and conclude that LIFO is more effective in keeping the value of the inventory lower.
   c) Reject the alternative hypothesis and conclude that LIFO is more effective in keeping the value of the inventory lower.
   d) Do not reject the null hypothesis and conclude that LIFO is not more effective in keeping the value of the inventory lower.
   e) None of the above.

7. Find an approximate p-value for the null hypothesis that you tested. Please explain your result!

8. A manufacturer revises a manufacturing process and finds a fall in the defect rate of 4% ± 5%.
   a) The fall in defects is statistically significant because 5% is larger than 4%.
   b) The fall in defects is statistically significant because the confidence interval supports H0.
   c) The fall in defects is not statistically significant because 4% is smaller than 5%.
   d) The fall in defects is not statistically significant because the confidence interval would lead us to reject H0.

Questions 9-11 refer to Exhibit 2.

Exhibit 2: (Edited from problems presented by Samuel Wathen) A group of adults and a group of children both tried Wow! Cereal. Was there a difference in how adults and kids responded to it?

                      Number in   Number who   Fraction of sample
                      Sample      liked it     who liked it
Adults (Group 1)      250         187          .748
Children (Group 2)    100          66          .660
Total                 350         253          .723

.748(.252)/250 = .000754
.660(.340)/100 = .002244
.723(.277)/350 = .0005722

9. What is the null hypothesis?
   a) 1 2
   b) 1 2
   c) 1 2
   d) p1 p2
   e) p1 p2
   f) p1 p2
   g) None of the above.

10. Calculate a 99% confidence interval for the difference between the fraction of adults and fraction of kids that liked Wow! Explain why you reject or do not reject the null hypothesis. (4)

11. (Extra Credit) Calculate a 77% confidence interval for the difference between the fraction of adults and fraction of kids that liked Wow! (2)

Questions 12-14 refer to Exhibit 3.

Exhibit 3: (Edited from problems presented by Samuel Wathen) A survey was taken among a randomly selected 100 property owners to see if opinion about a street widening was related to the distance of front footage they owned. The results appear below.

                         Opinion
Front-Footage      For   Undecided   Against
Under 45 feet       12       4           4
45-120 feet         35       5          30
Over 120 feet        3       2           5

12. How many degrees of freedom are there?
   a) 2
   b) 3
   c) 4
   d) 5
   e) 9
   f) None of the above.

13. What is the value of E for people in favor of the project who own less than 45 feet of frontage?
   a) 10
   b) 12
   c) 35
   d) 50
   e) None of the above.

14. Assume that the computed value of chi square is 8.5.
   a) What is the null hypothesis that you are testing? (2)
   b) What is your conclusion? Why? (3)

15. Turn in your computer output from computer problem 1 only tucked inside this exam paper.
(3 points - 2 point penalty for not handing this in.)

16. The following output is from a computer problem very much like the one you did to compare two sets of data. Two production processes are in use. I wish to compare numbers of defects in Process A and Process B to test the statement “The number of defects in process A is significantly lower than in process B.” Three tests are done. Assume that the underlying distribution is Normal.
   a) Which of the three tests should we use?
   b) What is the null hypothesis as we use it?
   c) Should we reject the null hypothesis? Why?

Test 1:

MTB > twosamplet 'A' 'B'

Two-Sample T-Test and CI: A, B

Two-sample T for A vs B

      N     Mean   StDev   SE Mean
A    90    220.5    34.7       3.7
B   110    300.5    82.7       7.9

Difference = mu A - mu B
Estimate for difference:  -79.98
95% CI for difference:  (-97.15, -62.81)
T-Test of difference = 0 (vs not =): T-Value = -9.20  P-Value = 0.000  DF = 152

Test 2:

MTB > twosamplet 'A' 'B';
SUBC> alter 1.

Two-Sample T-Test and CI: A, B

Two-sample T for A vs B

      N     Mean   StDev   SE Mean
A    90    220.5    34.7       3.7
B   110    300.5    82.7       7.9

Difference = mu A - mu B
Estimate for difference:  -79.98
95% lower bound for difference:  -94.36
T-Test of difference = 0 (vs >): T-Value = -9.20  P-Value = 1.000  DF = 152

Test 3:

MTB > Twosamplet 'A' 'B';
SUBC> alter -1.

Two-Sample T-Test and CI: A, B

Two-sample T for A vs B

      N     Mean   StDev   SE Mean
A    90    220.5    34.7       3.7
B   110    300.5    82.7       7.9

Difference = mu A - mu B
Estimate for difference:  -79.98
95% upper bound for difference:  -65.59
T-Test of difference = 0 (vs <): T-Value = -9.20  P-Value = 0.000  DF = 152

17. (Extra credit) My boss objects that he thinks that the variances are equal, so that I used the wrong test. I go back to the computer and do the following. (The null hypothesis is equal variances.) Was I right? Why?

MTB > %VarTest c3 c4;
SUBC> Unstacked.

Test for Equal Variances

F-Test (normal distribution)
Test Statistic: 0.176
P-Value       : 0.000

18.
(Extra Credit) Now my beloved boss says that maybe the underlying distribution is not Normal. I go back to the computer and run the following. Process A results are in C3. Process B results are in C4. Remember that there are 90 data items for process A and 110 for process B. What are our hypotheses and results?

MTB > Stack c3 c4 c5;
SUBC> Subscripts c6;
SUBC> UseNames.
MTB > Rank c5 c7.
MTB > Unstack (c7);
SUBC> Subscripts c6;
SUBC> After;
SUBC> VarNames.

This stacks the 2 sets of results together so they can be ranked. C7 now contains the ranks. Ranks for A are now in C7_A. Ranks for B are now in C7_B.

MTB > sum c8

Sum of C7_A
Sum of C7_A = 6008.0

MTB > sum c9

Sum of C7_B
Sum of C7_B = 14092

Questions 19-22 refer to Exhibit 4.

Exhibit 4: (Edited from problems presented by Samuel Wathen) A professor asserts that she uses a Normal curve with a mean of 75 and a standard deviation of 10 to grade students. Last year’s grades are below. Test to see if the professor’s assertions are correct at the 99% confidence level.

Row   Grade   Interval      O       E           O^2/E
1     A       90+           15      7.6820      29.2892
2     B       80-90         20      27.7955     14.3908
3     C       70-80         40      44.0450     36.3265
4     D       60-70         30      27.7955     32.3793
5     F       Below 60      10      7.6820      13.0174
Total                       115     115.0000    125.4032

19. Show the calculations necessary to get the number that were expected to get B’s.

20. What table value of chi-square would you use to test the professor’s assertion?

21. What is the calculated value of chi-square?

22. Explain your conclusion.

Methods for comparing 2 samples:

Location - Normal distribution. Compare means.
   Paired Samples: Method D4
   Independent Samples: Methods D1-D3
Location - Distribution not Normal. Compare medians.
   Paired Samples: Method D5b
   Independent Samples: Method D5a
Proportions: Method D6
Variability - Normal distribution. Compare variances: Method D7

ECO252 QBA2
SECOND EXAM
March 24, 2004
TAKE HOME SECTION

Name: _________________________
Student Number: _________________________

III. Neatness Counts! Show your work!
Always state your hypotheses and conclusions clearly. (19+ points)

1) Chi-squared and Related Tests (Bassett et al.) To personalize the data below, change the number of stations reporting 4 thunderstorms to the second to last digit of your student number. This will change the total number of stations reporting. For example, Seymour Butz’s student number is 976500, so he will change the number of stations reporting 4 thunderstorms to zero and the total number of stations reporting will be 22 + 37 + 20 + 13 + 0 + 2 = 94.

a) 100 weather stations reported the following in August 2003:

Number of thunderstorms, x                          0    1    2    3    4    5
Number of stations reporting x thunderstorms, O    22   37   20   13    6    2

In the region in question, the number of thunderstorms per month is believed to have a Poisson distribution with a mean of 1. Test to see if this is appropriate using a chi-squared method. For example, if 5 stations reported 2 thunderstorms and 5 stations reported 3 thunderstorms and there were only 10 stations, the total number of storms reported would be 5(2) + 5(3) = 25, and the average number of storms reported would be 25/10 = 2.5. (4)

b) Repeat the test using the Kolmogorov-Smirnov method. (3)

c) Find the average number of storms per station and use it to generate a Poisson table on Minitab. To do so follow the example below, replacing 0.732 with your mean (a number like 1.723). Head column 1 (C1) 'k', column 2 'P(k)' and column 3 'P(x le k)' or something similar ('le' stands for '≤'). In column 1 place the numbers 0 through 10.

MTB > PDF c1 c2;
SUBC> Poisson 0.732.
MTB > CDF c1 c3;
SUBC> Poisson 0.732.
MTB > print c1 - c3

Data Display

Row    k    P(k)        P(x le k)
  1    0    0.480946    0.48095
  2    1    0.352053    0.83300
  3    2    0.128851    0.96185
  4    3    0.031440    0.99329
  5    4    0.005753    0.99904
  6    5    0.000842    0.99989
  7    6    0.000103    0.99999
  8    7    0.000011    1.00000
  9    8    0.000001    1.00000
 10    9    0.000000    1.00000
 11   10    0.000000    1.00000

This table tells us that, for a Poisson distribution with a mean of 0.732, P(x = 3) = .031440 and P(x ≤ 3) = .99329. To keep the numbers correct, you could merge the data for k = 5 to 10 into a category of ‘5 or more storms.’ Decide whether a chi-squared or K-S method is appropriate (Only one method is!) and test for a Poisson distribution with your mean, remembering that you estimated the mean from your data. (4)

d) (Extra Credit) Two dice were thrown 180 times with the results below. Test the hypothesis that the distribution follows the binomial distribution with n = 2 and p = .15. (2)

Number of Sixes, x     0     1    2
Frequency, O         105    70    5

e) (Extra extra credit) Test the data in d) for a binomial distribution in general by using p̂ = (Total number of sixes) / (Total number of throws). (2)

2) (Meyer and Krueger) WEFA compiled the following random samples of single-family home prices in the eastern and western parts of the US (in $thousands). (Note - in this problem it is OK to use Excel or Minitab as a help - but you must fool me into believing that you did it by hand.)

Row   City - E              x1         City - W             x2         City-No
 1    Albany NY             108.607    Bakersfield CA       137.171     1
 2    Allentown PA           85.250    Fresno CA            107.627     2
 3    Baltimore MD          112.747    Orange C. CA         204.862     3
 4    Bergen NJ             195.232    Portland OR          123.605     4
 5    Boston MA             180.865    Riverside CA         123.836     5
 6    Buffalo NY             83.122    Sacramento CA        120.232     6
 7    Charlestown SC         92.840    San Diego CA         172.601     7
 8    Charlotte NC          104.433    San Francisco CA     220.067     8
 9    Greensboro NC          97.638    San Jose CA          224.828     9
10    Greenville SC          88.355    Seattle WA           147.854    10
11    Harrisburg PA          79.846    Stockton CA           98.440    11
12    Hartford CT           129.130    Tacoma WA            119.884    12
13    Middlesex NJ          169.540
14    Monmouth NJ           137.859
15    New Haven CT          134.856
16    New York NY           170.830
17    Newark NJ             187.128
18    Philadelphia PA       114.553
19    Raleigh/Durham NC     119.355
20    Rochester NY           85.043
21    Springfield MA        102.678
22    Syracuse NY            82.372
23    Washington DC         155.176

These are available on the website in Minitab. Minitab reports the following sample statistics.

Variable    n     Mean      Median    StDev
x1          23    122.50    112.75    37.20
x2          12    150.10    130.50    44.50

You may use the statistics given for x1, but personalize the data for Western cities as follows: Use the fourth digit of your student number to pick the first city to be eliminated and then eliminate the third city after that. (You may, if you wish, drop the last two digits of the prices in the Western Cities.) For example, Seymour Butz’s student number is 976500, so he will use the number 5 to eliminate cities 5 (Riverside) and 8 (San Francisco). If the fourth digit of your student number is zero, eliminate cities 10 and 1. You will thus have only 10 cities in your second sample.

a. Compute a (mean and) standard deviation for your personalized second sample. Show your work! (2)

b. Test to see if there is a significant difference between the mean home prices in the eastern and western US. You may assume that the samples come from Normal populations with equal variances, though there are 2 points extra credit if you do not assume equal variances. You may use a test ratio, critical value or a confidence interval (4 points) or all three of these (6 points - assuming that you get the same conclusion for all of them).

c. Test the variances to find out if you were or would have been justified to assume equality of variances. Were you? (2)

d. (Extra Credit) Use a Lilliefors test to see if the Western data is Normally distributed. (2)

e. (Extra Credit) Assume that the data is not normally distributed and test to see if there is a significant difference between the medians. (3)
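
The personalization rule in problem 2 and the equal-variance test ratio in part b can be sketched in code. This is only an illustrative sketch in Python, not part of the exam and not a substitute for showing hand work: the function names (cities_to_drop, personalized_sample, pooled_t) are hypothetical, the digit 5 is Seymour Butz's example from above, and the pooled-variance formula is the usual textbook one, assumed here to correspond to the equal-variance method the exam has in mind.

```python
import math
import statistics

# Western-city prices (in $thousands), keyed by City-No from the table above.
WEST = {
    1: 137.171, 2: 107.627, 3: 204.862, 4: 123.605,
    5: 123.836, 6: 120.232, 7: 172.601, 8: 220.067,
    9: 224.828, 10: 147.854, 11: 98.440, 12: 119.884,
}


def cities_to_drop(fourth_digit):
    """Hypothetical helper: the first city is the digit itself (10 when the
    digit is 0); the second is the third city after it, wrapping past 12."""
    first = 10 if fourth_digit == 0 else fourth_digit
    second = (first + 2) % 12 + 1
    return first, second


def personalized_sample(fourth_digit):
    """Return the ten remaining Western prices, in City-No order."""
    drop = cities_to_drop(fourth_digit)
    return [price for no, price in sorted(WEST.items()) if no not in drop]


def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Textbook pooled-variance t ratio for H0: mu1 = mu2 (equal variances
    assumed). Returns the ratio and its degrees of freedom."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Seymour Butz's example: fourth digit 5 removes cities 5 and 8.
x2 = personalized_sample(5)
mean2 = statistics.mean(x2)
s2 = statistics.stdev(x2)  # sample standard deviation (n - 1 divisor)

# Eastern statistics (n = 23, mean 122.50, StDev 37.20) are the Minitab values given above.
t_ratio, df = pooled_t(122.50, 37.20, 23, mean2, s2, len(x2))
print(f"n2 = {len(x2)}, mean2 = {mean2:.3f}, s2 = {s2:.3f}")
print(f"t = {t_ratio:.3f} with {df} degrees of freedom")
```

The ratio would then be compared with the two-sided table value of t for the printed degrees of freedom; a different fourth digit changes the sample, so the numbers printed here apply only to the example student number.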