Economics 231W, Econometrics
University of Rochester
Fall 2008
Homework #4

Text Problems.

Pages 100-102: 4.13

4.13.
(a) Z = (6 - 6.5) / 0.8 = -0.625 ≈ -0.63. Therefore, P(Z ≤ -0.63) = 0.2643. Thus, approximately 264 tubes will contain less than 6 ounces of toothpaste.
(b) The cost of the refill will be $52.80 (= $0.20 x 264).
(c) Z = (7 - 6.5) / 0.8 = 0.625 ≈ 0.63. The probability of Z ≥ 0.63 is also 0.2643. Therefore, the profits lost will be $13.20 (= $0.05 x 264).

Pages 127-130: 5.2, 5.4, 5.8, 5.13

5.2.
(a) The two branches of classical statistics, estimation of parameters and testing hypotheses about parameters, constitute statistical inference.
(b) The probability distribution of an estimator.
(c) It is synonymous with a confidence interval.
(d) A statistic used to decide whether a null hypothesis is rejected or not.
(e) That value of the test statistic which demarcates the acceptance region from the rejection region.
(f) It is the probability of committing a type I error.
(g) The exact level of significance of a test statistic.

5.4.
(a) True. In classical statistics the parameter is assumed to be some fixed number, although unknown.
(b) False. It is E(μ̂_X) = μ_X, where μ̂_X is an estimator.
(c) True.
(d) False. To be efficient, an estimator must be unbiased and it must have minimum variance.
(e) False. No probabilistic assumption is required for an estimator to be BLUE.
(f) True.
(g) False. A type I error is when we reject a true hypothesis.
(h) False. A type II error occurs when we do not reject a false hypothesis.
(i) True. This can be proved formally.
(j) False, generally. Only as the sample size increases indefinitely will the sample mean be normally distributed. If, however, the sample is drawn from a normal population to begin with, the sample mean is distributed normally regardless of the sample size.
(k) Uncertain. The p value is the exact level of significance.
If the chosen level of significance, say, α = 5%, coincides with the p value, the two will mean the same thing.

5.8.
(a) 3.182 (3 d.f.)
(b) 2.353 (3 d.f.)
(c) 3.012 (13 d.f.)
(d) 2.650 (13 d.f.)
(e) 2.0003 (59 d.f.)
(f) 1.972 (199 d.f.)
Note: These critical values have been obtained from electronic statistical tables.

5.13. Use the t distribution, since the true σ² is unknown. For 9 d.f., the 5% critical t value is 2.262. Therefore, the 95% CI is:
8 ± 2.262(1.2649) = (5.1388, 10.8612)

Other Problems

1. It is given that the earned run average (ERA) for all pitchers that started at least 15 games in 2006 follows the normal distribution with a population mean value of 4.55 and a population variance of 0.84; that is, ERA ~ N(4.55, 0.84). The population standard deviation is therefore √0.84 ≈ 0.92.

a. What is the probability that a given pitcher will have an ERA less than 2.16 (Francisco Liriano)?
Z = (2.16 - 4.55) / 0.92 = -2.598 ≈ -2.60
P(Z < -2.60) = P(Z > 2.60) = 1 - 0.9953 = 0.0047 or 0.47%

b. What is the probability that a given pitcher will have an ERA less than 3.50 (Mike Mussina)?
Z = (3.50 - 4.55) / 0.92 = -1.14
P(Z < -1.14) = P(Z > 1.14) = 1 - 0.8729 = 0.1271 or 12.71%

c. What is the probability that a given pitcher will have an ERA greater than 5.00 (Josh Beckett)?
Z = (5.00 - 4.55) / 0.92 = 0.49
P(Z > 0.49) = 1 - 0.6879 = 0.3121 or 31.21%

d. What is the probability that a pitcher will have an ERA between 3.55 (John Lackey) and 5.55 (Ramon Ortiz)?
Z = (3.55 - 4.55) / 0.92 = -1.09 and Z = (5.55 - 4.55) / 0.92 = 1.09
P(Z < -1.09) = 1 - 0.8621 = 0.1379
P(Z < 1.09) = 0.8621
P(-1.09 < Z < 1.09) = 0.8621 - 0.1379 = 0.7242 or 72.42%

2. It turns out that the population mean value of ERA for all pitchers that pitched at least one inning in 2006 is 5.55. For the New York Yankees, the team ERA, for the 25 pitchers on the staff, had a mean value of 5.45 with a sample standard deviation of 2.86. What is the probability of obtaining such an ERA given that the true ERA is 5.55?
t = (5.45 - 5.55) / (2.86 / √25) = -0.10 / 0.572 = -0.175, or 0.175 in absolute value
From Table A-2, for 25 - 1 = 24 degrees of freedom it appears that the probability of obtaining this sample mean is well over 50% (the t-critical value at alpha = .50 is 0.685).

3. For the Baltimore Orioles, the team ERA, for the 24 pitchers on the staff, had a mean value of 7.56 with a sample standard deviation of 4.94. What is the probability of obtaining such an ERA given that the true ERA is 5.55?
t = (7.56 - 5.55) / (4.94 / √24) = 2.01 / (4.94 / 4.90) = 2.01 / 1.01 = 1.99
From Table A-2, for 24 - 1 = 23 degrees of freedom it appears that the probability of obtaining this sample mean is approximately 5% (the t-critical value at alpha = .05 is 2.069).

4. Suppose we have observations on the return (in percent) for 9 stocks from the NASDAQ:
10.0  11.5  15  17.5  15.5  11.0  16.0  18.5  20.0

a. Calculate the sample mean and sample standard deviation.

obs   1     2     3     4     5     6     7     8     9
X     10.0  11.5  15.0  17.5  15.5  11.0  16.0  18.5  20.0

mean = 15
std dev = 3.5

b. Create a 95-percent confidence interval and test the hypothesis that the sample mean calculated above is statistically different from the true mean for returns in the S&P 500 of 11%.
The two-sided t-critical value with α = .05 and 9 - 1 = 8 degrees of freedom is 2.306.
P(X̄ - 2.306 (S_X / √n) ≤ μ_X ≤ X̄ + 2.306 (S_X / √n)) = 0.95
P(15 - 2.306 (3.5 / 3) ≤ μ_X ≤ 15 + 2.306 (3.5 / 3)) = 0.95
P(15 - 2.69 ≤ μ_X ≤ 15 + 2.69) = 0.95
P(12.31 ≤ μ_X ≤ 17.69) = 0.95
H0: μ_X = 11
H1: μ_X ≠ 11
Decision Rule: The value μ_X = 11 does not fall in the 95% confidence interval: REJECT H0. At the 5% level of significance, the true mean value is statistically different from 11.

5. Assume that nationwide, the average age of nursing home residents is 76 years and is normally distributed. A nursing home administrator in Texas wishes to determine whether the state average differs from the national average. Taking a random sample of 35 residents from nursing homes in Texas, he finds a mean age of 79 and standard deviation of 3.5. Using an alpha level of .05, test the null hypothesis that the sample mean in Texas is equal to the true mean of 76.
H0: μ_X = 76
H1: μ_X ≠ 76
t = (79 - 76) / (3.5 / √35) = 3 / (3.5 / 5.92) = 3 / 0.591 = 5.08
The t-critical value with α = .05 and 35 - 1 = 34 degrees of freedom is not given in the table from the text. The t-critical value with 30 degrees of freedom is 2.042, which is slightly larger than the 34 d.f. value and therefore conservative.
Decision Rule: Calculated t = 5.08 > t-critical value = 2.042: REJECT H0. The Texas mean is statistically significantly different from the national average of 76.

6. There is some evidence to suggest that certain herbs can improve human memory. A researcher plans to use a standardized memory test to evaluate the effect of the herbs. Scores on the standardized test form a normal-shaped distribution with a mean of 70. The researcher obtains a sample of 25 people and has each person take an herbal supplement every day for 30 days. At the end of the 30-day period, each person takes the standardized memory test and the mean score for the sample is calculated to be 75 with a standard deviation of 15. Using the .01 level of significance, what decision should the researcher make about the effect of the herbal supplement? That is, test whether the mean from the sample is statistically significantly different from the true mean. [Should this be a one-sided or two-sided test?]
This should be a one-tailed test. We assume that the herbal supplement improves memory scores rather than lowering them, so the alternative hypothesis is that the mean exceeds 70.
H0: μ_X ≤ 70
H1: μ_X > 70
t = (75 - 70) / (15 / √25) = 5 / (15 / 5) = 5 / 3 = 1.67
The t-critical value with α = .01 and 25 - 1 = 24 degrees of freedom is 2.492.
Decision Rule: Calculated t = 1.67 < t-critical value = 2.492: FAIL TO REJECT H0. The supplement did not have a statistically significant impact on memory test scores.
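
The t statistics in Problems 5 and 6 can be checked with a short script. The following is a minimal sketch in Python using only the standard library. The function name one_sample_t is a hypothetical helper introduced here for illustration, and the critical values (2.042 and 2.492) are copied from the solutions above rather than computed.

```python
import math


def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """Return the one-sample t statistic (x_bar - mu_0) / (s / sqrt(n)).

    Hypothetical helper for illustration; not part of the original solutions.
    """
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


# Problem 5: two-sided test, so compare |t| with the critical value.
# 2.042 is the 30 d.f. value quoted in the solution (34 d.f. is not tabulated).
t5 = one_sample_t(79, 76, 3.5, 35)
reject5 = abs(t5) > 2.042

# Problem 6: one-sided comparison of t with the critical value.
# 2.492 is the 24 d.f. value at the .01 level quoted in the solution.
t6 = one_sample_t(75, 70, 15, 25)
reject6 = t6 > 2.492

print(f"Problem 5: t = {t5:.2f}, reject H0: {reject5}")
print(f"Problem 6: t = {t6:.2f}, reject H0: {reject6}")
```

Because the script does not round intermediate values, it reports t of about 5.07 for Problem 5 rather than the hand-computed 5.08. The decisions are unchanged: reject H0 in Problem 5 and fail to reject H0 in Problem 6.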