11/14/03 252x0323 (Page layout view!)

ECO252 QBA2
SECOND HOUR EXAM
October 30, 2003

Name
Hour of Class Registered (Circle): 11am 12:30pm

I. (53 points) Do all the following. (2 points each unless noted otherwise.)

Note the following:
1. You will be penalized if you do not compute the sample variance of the d column in question 20, so you might want to do it now.
2. This test is normed on 50 points, but there are 74 points possible including the take-home. You may not finish the exam and might want to skip some questions.
3. A table identifying methods for comparing 2 samples is at the end of the exam.

1. A manufacturer revises a manufacturing process and finds a fall in the defect rate of 5% ± 4%.
a) The fall in defects is statistically significant because 5% is larger than 4%.
b) The fall in defects is statistically significant because the confidence interval supports H0.
c) The fall in defects is not statistically significant because 4% is smaller than 5%.
d) The fall in defects is not statistically significant because the confidence interval would lead us to reject H0.

2. If we wish to determine whether there is evidence that the proportion of successes is higher in group 1 than in group 2, the appropriate test to use is
a) the z test.
b) the χ² test.
c) both of the above
d) none of the above

TABLE 12-14
Recent studies have found that American children are more obese than in the past. The amount of time children spend watching television has received much of the blame. A survey of 100 ten-year-olds revealed the following with regard to weights and average number of hours a day spent watching television. We are interested in testing whether the average number of hours spent watching TV and weights are independent at the 1% level of significance.

                                        TV Hours
Weights                              0-3    3-6     6+    Total
More than 10 lbs. overweight           1      9     20       30
Within 10 lbs. of normal weight       20     15     15       50
More than 10 lbs. underweight         10      5      5       20
Total                                 31     29     40      100

3. Referring to Table 12-14, if there is no connection between weights and average number of hours spent watching TV, how many children should we expect to be spending 3-6 hours, on average, watching TV and to be more than 10 lbs. underweight?
a) 5
b) 5.8
c) 6.2
d) 8

10/24/03 252x0323

4. Turn in your computer output from computer problem 1 only, tucked inside this exam paper. (3 points - 2 point penalty for not handing this in.)

MTB > TwoT 90.0 'educ' 'sex';
SUBC> Alternative -1.

Two-Sample T-Test and CI: EDUC, SEX

Two-sample T for EDUC

SEX        N    Mean   StDev   SE Mean
Female   788   13.19    3.03      0.11
Male     651   13.28    2.85      0.11

Difference = mu (Female) - mu (Male )
Estimate for difference: -0.091
90% upper bound for difference: 0.108
T-Test of difference = 0 (vs <): T-Value = -0.58  P-Value = 0.280  DF = 1412

The computer output above refers to a test very much like the Minitab test you ran of two independent samples. The major difference is that 1439 numbers appear in column 1 (labeled EDUC), which give the number of years of education completed, and the computer sorted them by gender using the words ‘female’ and ‘male’ in column 5 (labeled SEX). The variable x_F can thus refer to an imaginary column of female education figures and x_M to an imaginary column of male education figures. Call this the GSSEduc output.

5. Referring to the GSSEduc output, and using the rules taught in class, the null hypothesis that was tested is
a) H0: μ_F – μ_M   0
b) H0: μ_F – μ_M   0
c) H0: μ_F – μ_M   0
d) H0: μ_F – μ_M   0

6. Referring to the GSSEduc output, we can conclude (doing no more calculations) that, for the particular population that was sampled,
a) At the .10 level, there is sufficient evidence that women had fewer years of education than men.
b) At the .10 level, there is a difference between the years of education gotten by men and women.
c) At the .10 level, there is insufficient evidence that the average men’s education level is higher than the women’s.
d) At the .10 level, there is sufficient evidence to conclude that there is no difference between men’s and women’s education level.

7. Referring to the GSSEduc output, the most commonly used methods to find degrees of freedom are (i) to calculate df = n1 + n2 – 2 = 788 + 651 – 2 = 1437, or (ii) to say that, since we have large samples, we can use z, which is equivalent to saying that the degrees of freedom are infinite. Yet the computer claims df = 1412. Explain, briefly, what the computer probably did (and assumed) to get that number.

8. (Wonnacott and Wonnacott) A small piece of hose in the cooling system of a new engine has a lifetime that varies normally (following the Normal distribution) around a mean of 18 months with a standard deviation of 4 months. The first maintenance check occurs at 12 months. What is the probability that the hose will wear out before the maintenance check? (This is the same as the per cent of hoses that will wear out before the first maintenance check!) Make a diagram!

9. In problem 8 above, the manufacturer decides that too much money is being spent on maintenance checks. If the manufacturer is willing to accept having 20% of hoses wear out before the first maintenance check, how many months (to the nearest 100th of a month) can the manufacturer wait until the check? (This is the same as finding the 20th percentile of the distribution.)

10. The t test for the difference between the means of 2 independent populations assumes that the respective
a) sample sizes are equal.
b) sample variances are equal.
c) populations are approximately normal.
d) all of the above

TABLE 10-3
The use of preservatives by food processors has become a controversial issue. Suppose 2 preservatives are extensively tested and determined safe for use in meats. A processor wants to compare the preservatives for their effects on retarding spoilage.
Suppose 15 cuts of fresh meat are treated with preservative A and 15 are treated with preservative B, and the number of hours until spoilage begins is recorded for each of the 30 cuts of meat. The results are summarized in the table below.

Preservative A              Preservative B
x̄_A = 106.4 hours           x̄_B = 96.54 hours
s_A = 10.3 hours            s_B = 13.4 hours

11. Referring to Table 10-3, state the test statistic for determining if the population variance for preservative B is larger than the population variance for preservative A.
a) F = 3.100
b) F = 1.300
c) F = 1.693
d) F = 0.591

12. Referring to Table 10-3, what assumptions are necessary for a comparison of the population variances to be valid?
a) Both sampled populations are normally distributed.
b) Both samples are random and independent.
c) Neither (a) nor (b) is necessary.
d) Both (a) and (b) are necessary.

TABLE 10-4
A real estate company is interested in testing whether, on average, families in Gotham have been living in their current homes for less time than the families in Metropolis have. A random sample of 100 families from Gotham and a random sample of 150 families in Metropolis yield the following data on length of residence in current homes:

Gotham:      x̄_G = 35 months, s_G² = 900
Metropolis:  x̄_M = 50 months, s_M² = 1050

13. Referring to Table 10-4, which of the following represents the relevant hypotheses tested by the real estate company?
a) H0: μ_G – μ_M   0 versus H1: μ_G – μ_M   0
b) H0: μ_G – μ_M   0 versus H1: μ_G – μ_M   0
c) H0: μ_G – μ_M   0 versus H1: μ_G – μ_M   0
d) H0: x̄_G – x̄_M   0 versus H1: x̄_G – x̄_M   0

14. Referring to Table 10-4, what is the estimated standard error of the difference between the two sample means?
a) 4.00
b) 4.06
c) 5.61
d) 8.01
e) 16.00

15. Referring to Table 10-4, what is (are) the critical value(s) for the test ratio for the relevant hypothesis test if the level of significance is 0.05?
a) z = – 1.645
b) z = 1.960
c) z = – 1.960
d) z = – 2.080

16. When testing H0: μ1 – μ2   0 versus H1: μ1 – μ2   0, the observed value of the z-score (test ratio) was found to be – 2.13. The p-value for this test would be
a) 0.0166.
b) 0.0332.
c) 0.9668.
d) 0.9834.

TABLE 10-9
A buyer for a manufacturing plant suspects that his primary supplier of raw materials is overcharging. In order to determine if his suspicion is correct, he contacts a second supplier and asks for prices on various materials. He wants to compare these prices with those of his primary supplier. The data collected are presented in the table below, with some summary statistics (not all of these may be necessary to answer the questions which follow). The buyer believes that the differences are normally distributed and will use this sample to perform an appropriate test at a level of significance of 0.01.

                   Primary     Secondary
Material           Supplier    Supplier     Difference
1                  $55         $45          $10
2                  $48         $47          $1
3                  $31         $32          – $1
4                  $83         $77          $6
5                  $37         $37          $0
6                  $55         $54          $1
Sum:               $309        $292         $17
Sum of Squares:    $17,573     $15,472      $139

17. Referring to Table 10-9, the hypotheses that the buyer should test are a null hypothesis that ________ versus an alternative hypothesis that ________.

18. Referring to Table 10-9, the test to perform is a
a) pooled-variance t test for differences in 2 means (D2).
b) separate-variance t test for differences in 2 means (D3).
c) Wilcoxon signed rank test for differences in 2 medians (D5b).
d) t test for mean difference in paired data (D4).
e) Wilcoxon-Mann-Whitney test for differences in 2 medians (D5a).

19. Referring to Table 10-9, the number of degrees of freedom is
a) 5.
b) 10.
c) Irrelevant because you are using a rank test.
d) Found by a complicated formula

20. Two brands of gasket are being considered for use on a high pressure oil pump. The number of hours that each gasket worked is as follows.
         Brand 1     Brand 2     difference
Pump     x1          x2          d = x1 – x2
1        2982.28     2863.39     118.89
2        3025.86     2906.97     118.89
3        2952.02     2873.52      78.50
4        2954.64     2959.06      -4.42
5        2981.01     2899.98      81.03

Because the data are paired, a test was run using Minitab (method D4), with the following results.

MTB > Paired c7 c8;
SUBC> Alternative 1.

Paired T-Test and CI: brand 1, brand 2

Paired T for brand 1 - brand 2

               N     Mean   StDev   SE Mean
brand 1        5   2979.2    29.7      13.3
brand 2        5   2900.6    37.3      16.7
Difference     5     78.6    ____      ___

95% lower bound for mean difference: 30.6
T-Test of mean difference = 0 (vs > 0): T-Value = 3.49  P-Value = ?

Compute the standard deviation of the d column, showing your work, and fill in the blanks in the difference row. You should get a t-ratio approximately equal to the T-Value shown above. State the hypotheses, find an approximate p-value, and tell whether you reject the null hypothesis. (7 points - 2 point penalty for not computing the variance.)

21. Using the means and standard deviations in the computer printout above, repeat the test done by the computer, assuming the brand 1 and brand 2 columns represent independent samples and using a pooled variance (method D2). Show your work! (5 points)

22. (Wonnacott and Wonnacott) A random sample of 7 workers is selected to work under better conditions for a day while 3 others still work under the old conditions. The Wilcoxon procedure for independent samples is used. To test a 1-sided hypothesis, W is computed. Output is as follows:

Old: 44 44 49
New: 48 50 51 57 57 61 82

W has the value
a) 6
b) 7
c) 10
d) 137

Methods for comparing 2 samples:
Location - Normal distribution. Compare means. Paired samples: Method D4. Independent samples: Methods D1-D3.
Location - Distribution not Normal. Compare medians. Paired samples: Method D5b. Independent samples: Method D5a.
Proportions. Method D6.
Variability - Normal distribution. Compare variances. Method D7.

252x0323 10/23/03

ECO252 QBA2
SECOND EXAM
October 30, 2003
TAKE HOME SECTION

Name: _________________________
Social Security Number: _________________________

II. Neatness Counts! Show your work!

1) To compare two formulations of gasoline, a company picked 7 automobiles and ran each automobile for one week with formulation 1 and for one week with formulation 2. Miles per gallon appear below.

Auto     gas 1     gas 2
1        30.8      30.2
2        34.5      34.7
3        13.2      12.6
4        26.3      25.3
5        26.2      25.7
6        26.2      25.0
7        26.3      25.0

Before you start, replace the 0 in 25.0 in the gas 2 mpg for car 7 with the last digit of your Social Security Number. This number will now be between 25.0 and 25.9. Example: Since my SS number is 265398248, I will change the last 25.0 to 25.8. I got 26.214 for the mean miles per gallon for gas 1. Make sure that you carry a comparable number of digits in your computations. If gas 1 is the new formulation and gas 2 is the old formulation, the company will switch from the old to the new formulation only if miles per gallon for the new formulation are higher. α = .01 except when indicated otherwise.

a. In order to make our decision, we must do a hypothesis test. What are the null and alternative hypotheses that you are testing? (1)

Use the 3 ways below to test the hypotheses.

b. Do the appropriate hypothesis test for your hypotheses using a test ratio and find an approximate p-value for the hypothesis. On the basis of your p-value, would we reject the null hypothesis when the significance level is 10%? Why? (3)
c. Repeat the test, using a critical value for the difference between the sample means. (2)
d. Do a confidence interval for the difference between the two means appropriate to your hypotheses. (2)
e. Write a brief report to the product development vice president explaining whether the company should switch to the new formulation and why.
(1)

2) The following data refer to defects in finishing of samples of automobiles made on the various days of the week.

             No. with          No. with          No. with       Size of
Day          Major Defects     Minor Defects     no Defects     Sample
Monday        8                22                170            200
Tuesday       2                10                188            200
Wednesday     6                16                178            200
Thursday      2                 8                190            200
Friday       10                34                156            200

Before you start, replace the 0 in 10 in the Major Defects column with the last digit of your Social Security Number and reduce 156 by the same amount. Example: Since my SS number is 265398248, I will change the 10 to 18 and then subtract 8 from 156 to get 148. The sum of the row will stay 200.

a) Do a statistical test to show if the proportion of cars in the three categories is the same. (4)
b) Assuming that you reject your null hypothesis, do a Marascuilo procedure to see which days have proportions of cars with no defects that are significantly different from the others. Note that to do this, you will have to divide the automobiles between those with no defects and those with some defects (which will cut down on degrees of freedom) and then do C(5,2) = 10 contrasts between proportions. This seems like too much work. Since Friday has the highest defect rate, it should be enough to compare the defect rate on Friday with the defect rate on the other 4 days, or the no-defect rate on Friday with the no-defect rate on the other 4 days. (4)

3) Extra credit. Assume that the data in problem 1 represent two independent samples and that you are not willing to assume that variances are equal. Test your hypothesis all three ways. (6)
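
The Marascuilo comparison described in problem 2(b) can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses the table as printed (before the Social Security digit adjustment), it assumes a significance level of .01, and it takes the chi-squared critical value for 5 – 1 = 4 degrees of freedom from a standard table rather than computing it. The names `no_defects`, `CHI2_CRIT`, and `marascuilo` are hypothetical and not part of the exam.

```python
from itertools import combinations
from math import sqrt

# No-defect counts per day from the table as printed (before the
# Social Security digit adjustment); every sample has 200 cars.
no_defects = {"Monday": 170, "Tuesday": 188, "Wednesday": 178,
              "Thursday": 190, "Friday": 156}
n = 200

# Assumption: alpha = .01, so the chi-squared critical value with
# 5 - 1 = 4 degrees of freedom is read from a standard table.
CHI2_CRIT = 13.2767


def marascuilo(counts, n, chi2_crit):
    """Compare every pair of proportions against its critical range.

    Returns one tuple per pair:
    (day_a, day_b, |p_a - p_b|, critical_range, significant).
    """
    p = {day: k / n for day, k in counts.items()}
    results = []
    for a, b in combinations(counts, 2):
        diff = abs(p[a] - p[b])
        # Critical range: sqrt(chi-squared) times the standard error
        # of the difference between the two sample proportions.
        crit = sqrt(chi2_crit) * sqrt(p[a] * (1 - p[a]) / n
                                      + p[b] * (1 - p[b]) / n)
        results.append((a, b, diff, crit, diff > crit))
    return results


results = marascuilo(no_defects, n, CHI2_CRIT)

# Restrict attention to the contrasts that involve Friday.
friday = [r for r in results if "Friday" in (r[0], r[1])]
for a, b, diff, crit, significant in friday:
    print(f"{a} vs {b}: |diff| = {diff:.3f}, "
          f"critical range = {crit:.3f}, significant = {significant}")
```

A pair of days is flagged when the absolute difference between their no-defect proportions exceeds the critical range. With your own adjusted Friday count, change the `"Friday"` entry before running.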