252solnA2 1/31/08 (Open this document in 'Page Layout' view!)

A. Parameter Estimation
1. Review of the Normal Distribution A1
2. Point and Interval Estimation
3. A Confidence Interval for the Mean when the Population Variance is Known.
4. A Confidence Interval for the Mean when the Population Variance is not Known. A2, text 8.20, 8.50 [8.21, 8.50] (8.21, 8.50) - Answers from both editions will be provided for 8.21. 8.95 on CD (8.93). Graded Assignment 1 (Will be posted)
5. Deciding on Sample Size when working with a Mean. A3, 8.38 [8.36] (8.36)
6. A Confidence Interval for a Proportion. Text 8.24, 8.25, 8.26, 8.58, 8.94 on CD [8.22, 8.23, 8.24, 8.58, 8.93a,c on CD] (8.22, 8.23, 8.25, 8.58, 8.91a,c)
7. A Confidence Interval for a Variance. Text 12.1-12.2 [9.72] (9.67), A4
8. (A Confidence Interval for a Median.) Optional - A5 -- solution is posted.
--------------------------------------------------------------------------------
Problems A2 through 8.36 are in this document.

Problems involving A Confidence Interval for the Mean

PROBLEM A2: If $n = 64$ and $\bar{x} = 11.50$, find 95% confidence intervals for the mean under the following circumstances:
a. $\sigma = 6.30$, $N = 3000$
b. $\sigma = 6.30$, $N = 300$
c. $s = 6.30$, $N = 3000$
d. $s = 6.30$, $N = 300$

Solution: Use the formulas from Table 3 of the syllabus supplement or from the outline.

a) $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 11.50 \pm 1.960(0.7875) = 11.50 \pm 1.54$, or 9.96 to 13.04, where $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{6.30}{\sqrt{64}} = 0.7875$ and $z_{\alpha/2} = z_{.025} = 1.960$. More formally, we can say $P(9.96 \le \mu \le 13.04) = .95$ or make a diagram.

b) $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 11.50 \pm 1.960(0.6996) = 11.50 \pm 1.37$, or 10.13 to 12.87, where $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{6.30}{\sqrt{64}}\sqrt{\frac{300-64}{300-1}} = 0.7875\sqrt{\frac{236}{299}} = 0.7875(0.8884) = 0.6996$ and $z_{\alpha/2} = z_{.025} = 1.960$.

c) $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}} = 11.50 \pm 1.998(0.7875) = 11.50 \pm 1.57$, or 9.93 to 13.07, where $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{6.30}{\sqrt{64}} = 0.7875$ and $t_{\alpha/2}^{(n-1)} = t_{.025}^{(63)} = 1.998$.
d) $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}} = 11.50 \pm 1.998(0.6996) = 11.50 \pm 1.40$, or 10.10 to 12.90, where $s_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{6.30}{\sqrt{64}}\sqrt{\frac{300-64}{300-1}} = 0.6996$ and $t_{\alpha/2}^{(n-1)} = t_{.025}^{(63)} = 1.998$.

Exercise 8.21 (8th Edition): Problem asks for 95% confidence intervals for costs of hotels and cars using a sample of 20 cities. It then asks for assumptions about the population needed to assure that interval is valid.

Row  City            Hotel $x_1$   Car $x_2$   $x_1^2$   $x_2^2$
  1  Seattle             174          46        30276     2116
  2  San Francisco       194          48        37636     2304
  3  Los Angeles         188          39        35344     1521
  4  Phoenix             120          32        14400     1024
  5  Denver              121          40        14641     1600
  6  Minneapolis         142          47        20164     2209
  7  Chicago             198          53        39204     2809
  8  St. Louis           125          48        15625     2304
  9  Dallas              179          51        32041     2601
 10  Houston             138          48        19044     2304
 11  Detroit             130          49        16900     2401
 12  Cleveland           129          43        16641     1849
 13  New Orleans         117          45        13689     2025
 14  Pittsburgh          149          44        22201     1936
 15  Atlanta             144          50        20736     2500
 16  Boston              275          39        75625     1521
 17  New York            311          72        96721     5184
 18  Washington          216          52        46656     2704
 19  Orlando             124          36        15376     1296
 20  Miami               123          31        15129      961
     Sum                3297         913       598049    43169

a) For hotels $\sum x_1 = 3297$, $\sum x_1^2 = 598049$ and $n = 20$. So $\bar{x} = \frac{\sum x}{n} = \frac{3297}{20} = 164.85$, $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{598049 - 20(164.85)^2}{19} = \frac{54538.55}{19} = 2870.45$, and $s = \sqrt{2870.45} = 53.5766$.
We use the formula for a confidence interval when the population standard deviation is unknown. $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{53.5766}{\sqrt{20}} = 11.9801$ and $t_{\alpha/2}^{(n-1)} = t_{.025}^{(19)} = 2.093$, so $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}} = 164.85 \pm 2.093(11.9801) = 164.85 \pm 25.07$, or 139.78 to 189.92.

b) For car rentals $\sum x_2 = 913$, $\sum x_2^2 = 43169$ and $n = 20$. So $\bar{x} = \frac{\sum x}{n} = \frac{913}{20} = 45.65$, $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = 78.4500$, and $s = 8.8572$. So $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{8.8572}{\sqrt{20}} = 1.9805$ and $t_{\alpha/2}^{(n-1)} = t_{.025}^{(19)} = 2.093$, so $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}} = 45.65 \pm 2.093(1.9805) = 45.65 \pm 4.15$, or 41.50 to 49.80.

c) According to the Instructor's Solution Manual we need to assume that the population has a normal distribution since the sample size of 20 is not large enough to invoke the central limit theorem for distributions that are not symmetric.
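The arithmetic in parts a) and b) can be checked with a short script. This is only a sketch: the function name `t_interval_from_sums` is made up for illustration, and the critical value 2.093 is copied from the t table as in the solution rather than computed.

```python
import math

def t_interval_from_sums(sum_x, sum_x2, n, t_crit):
    """Confidence interval for the mean from sum(x), sum(x^2) and n,
    using the computational formula s^2 = (sum(x^2) - n*xbar^2) / (n - 1)."""
    xbar = sum_x / n
    s2 = (sum_x2 - n * xbar ** 2) / (n - 1)
    se = math.sqrt(s2 / n)          # s / sqrt(n), no finite population correction
    half_width = t_crit * se
    return xbar - half_width, xbar + half_width

# 8th edition data for Exercise 8.21
hotel = [174, 194, 188, 120, 121, 142, 198, 125, 179, 138,
         130, 129, 117, 149, 144, 275, 311, 216, 124, 123]
car = [46, 48, 39, 32, 40, 47, 53, 48, 51, 48,
       49, 43, 45, 44, 50, 39, 72, 52, 36, 31]

T19_025 = 2.093  # t table value for 19 degrees of freedom, alpha/2 = .025

for name, data in (("Hotel", hotel), ("Car", car)):
    lo, hi = t_interval_from_sums(sum(data), sum(v * v for v in data),
                                  len(data), T19_025)
    print(f"{name}: {lo:.2f} to {hi:.2f}")
```

The same function can be reused for any of the t intervals in this document by passing the appropriate sums and table value.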
Exercise 8.21 (9th Edition): Problem asks for 95% confidence intervals for costs of hotels and cars using a sample of 20 cities. It then asks for assumptions about the population needed to assure that interval is valid.

Row  City            Hotel $x_1$   Car $x_2$   $x_1^2$   $x_2^2$
  1  Seattle             176          45        30976     2025
  2  San Francisco       178          42        31684     1764
  3  Los Angeles         223          36        49729     1296
  4  Phoenix             124          38        15376     1444
  5  Denver              139          38        19321     1444
  6  Minneapolis         167          53        27889     2809
  7  Chicago             257          51        66049     2601
  8  St. Louis           159          53        25281     2809
  9  Dallas              167          46        27889     2116
 10  Houston             180          48        32400     2304
 11  Detroit             141          53        19881     2809
 12  Cleveland           145          40        21025     1600
 13  New Orleans         142          49        20164     2401
 14  Pittsburgh          148          49        21904     2401
 15  Atlanta             173          46        29929     2116
 16  Boston              243          46        59049     2116
 17  New York            273          69        74529     4761
 18  Washington          262          47        68644     2209
 19  Orlando             133          40        17689     1600
 20  Miami               116          39        13456     1521
     Sum                3546         928       672864    44146

a) For hotels $\sum x_1 = 3546$, $\sum x_1^2 = 672864$ and $n = 20$. So $\bar{x} = \frac{\sum x}{n} = \frac{3546}{20} = 177.30$, $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{672864 - 20(177.3)^2}{19} = \frac{44158.2}{19} = 2324.116$, and $s = \sqrt{2324.116} = 48.2091$.
We use the formula for a confidence interval when the population standard deviation is unknown. $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{48.2091}{\sqrt{20}} = 10.7799$ and $t_{\alpha/2}^{(n-1)} = t_{.025}^{(19)} = 2.093$, so $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}} = 177.30 \pm 2.093(10.7799) = 177.30 \pm 22.56$, or 154.74 to 199.86.

b) For car rentals $\sum x_2 = 928$, $\sum x_2^2 = 44146$ and $n = 20$. So $\bar{x} = \frac{\sum x}{n} = \frac{928}{20} = 46.40$, $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = 57.2000$, and $s = 7.5631$. So $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{7.5631}{\sqrt{20}} = 1.6912$ and $t_{\alpha/2}^{(n-1)} = t_{.025}^{(19)} = 2.093$, so $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}} = 46.40 \pm 2.093(1.6912) = 46.40 \pm 3.54$, or 42.86 to 49.94.

c) According to the Instructor's Solution Manual we need to assume that the population has a normal distribution since the sample size of 20 is not large enough to invoke the central limit theorem for distributions that are not symmetric. Given the approach taken by the text to checking this, I decided to run some graphs on Minitab. The results follow.
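As a rough numerical companion to the graphs, one can compare the mean with the median and the two quartile distances for the 9th edition hotel data. This is a sketch only, not part of the Minitab output: it assumes Python's `statistics.quantiles` with its default "exclusive" interpolation, and the variable name `right_skew_hint` is invented for illustration.

```python
import statistics

# 9th edition hotel costs for Exercise 8.21
hotel = [176, 178, 223, 124, 139, 167, 257, 159, 167, 180,
         141, 145, 142, 148, 173, 243, 273, 262, 133, 116]

mean = statistics.mean(hotel)
# quantiles(..., n=4) returns the three cut points Q1, median, Q3
q1, median, q3 = statistics.quantiles(hotel, n=4)

# A mean above the median, together with a longer upper quartile distance,
# suggests skewness to the right. It is a hint, not a formal test.
right_skew_hint = mean > median and (q3 - median) > (median - q1)

print(f"mean = {mean:.2f}, median = {median:.2f}")
print(f"Q1 = {q1:.2f}, Q3 = {q3:.2f}")
print("suggests right skew:", right_skew_hint)
```

A check like this does not replace the boxplot or the probability plot; it only summarizes in numbers what those graphs show.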
MTB > boxplot hotel
[Figure: Boxplot of Hotel]
MTB > pplot hotel
[Figure: Probability Plot of Hotel]
MTB > boxplot car
[Figure: Boxplot of Car]
MTB > pplot car
[Figure: Probability Plot of Car]

The Minitab 'help' menu gives the following information for understanding probability plots:

The y-axis, and sometimes the x-axis, of a probability plot are transformed so that the fitted distribution (center blue line) forms a straight line. If the distribution fits your data:
- The plotted points will roughly form a straight line.
- The plotted points will fall close to the fitted line.
- The Anderson-Darling statistic will be small, and the associated p-value will be larger than your chosen $\alpha$-level. (Commonly chosen levels for $\alpha$ include 0.05 and 0.10.)
Minitab also displays approximate 95% confidence intervals (curved blue lines) for the fitted distribution. These confidence intervals are point-wise, meaning that they are calculated separately for each point on the fitted distribution without controlling for family-wise error. Usually, points outside the confidence intervals occur in the tails. In the lower half of the plot, points to the right of the confidence band indicate that there are fewer data in the left tail than one would expect based on the fitted distribution. In the upper half, points to the right of the confidence band indicate that there are more data in the right tail than one would expect.

I'm not going to get into how the Anderson-Darling statistic is calculated, but the p-value on the graph should be above .05 if the distribution does not differ from Normal. In the case of 'Hotel,' the two graphs together seem to indicate definite skewness to the right. The results from 'Car' are much less clear, since the trouble seems to be caused by one pesky point and the p-value is relatively large.
We can say that the confidence interval for 'Hotel' is not reliable and that we are not sure about 'Car.'

Exercise 8.20 (10th Edition) (I should have assigned 8.22):
a) Asks for a 95% confidence interval for the mean number of days it takes to resolve a complaint.

Days $x_1$:
54, 5, 35, 137, 31, 27, 152, 2, 123, 81,
74, 27, 11, 19, 126, 110, 110, 29, 61, 35,
94, 31, 26, 5, 12, 4, 165, 32, 29, 28,
29, 26, 25, 1, 14, 13, 13, 10, 5, 27,
4, 52, 30, 22, 36, 26, 20, 23, 33, 68
Sum: 2152

Days Squared $x_1^2$:
2916, 25, 1225, 18769, 961, 729, 23104, 4, 15129, 6561,
5476, 729, 121, 361, 15876, 12100, 12100, 841, 3721, 1225,
8836, 961, 676, 25, 144, 16, 27225, 1024, 841, 784,
841, 676, 625, 1, 196, 169, 169, 100, 25, 729,
16, 2704, 900, 484, 1296, 676, 400, 529, 1089, 4624
Sum: 178754

For days $\sum x_1 = 2152$, $\sum x_1^2 = 178754$ and $n = 50$. So $\bar{x} = \frac{\sum x}{n} = \frac{2152}{50} = 43.04$, $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{178754 - 50(43.04)^2}{49} = \frac{86131.92}{49} = 1757.794286$, and $s = \sqrt{1757.794286} = 41.92606$.
We use the formula for a confidence interval when the population standard deviation is unknown. $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{41.92606}{\sqrt{50}} = \sqrt{\frac{1757.794286}{50}} = \sqrt{35.15589} = 5.9292$ and $t_{\alpha/2}^{(n-1)} = t_{.025}^{(49)} = 2.010$, so $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}} = 43.04 \pm 2.010(5.9292) = 43.04 \pm 11.92$, or 31.12 to 54.96.

b) The problem asks what assumption you should make in a). The answer is that the population should be Normally distributed.

c) The problem asks if the assumptions are seriously violated. A Normal probability plot is displayed which shows that the distribution is not Normal. Better, note that the median is 28.50, Q1 = 13.75 and Q3 = 55.75. Since the mean is above the median and the distance between the median and Q3 (27.25) is larger than the distance between the median and Q1 (14.75), we can conclude that the distribution is skewed to the right.

d) The problem asks if this is a serious problem. It is not, because the sample size is well above 30, so that $\bar{x}$ is approximately Normally distributed.

Graphical Checks for Normality. Since the Instructor's Solution Manual used graphs to check for Normality, I have followed suit.
The column of data was placed in C1 and the commands were enabled. The two commands used were Boxplot 'days' and pplot 'days'. Results appear below.

MTB > Boxplot 'days'
[Figure: Boxplot of days]
MTB > pplot 'days'
[Figure: Probability Plot of days]

In the boxplot, skewness is shown by the spread out values above the median. In the (Normal) probability plot, the deviation of the points from a straight line reveals that the distribution is not Normal.

Exercise 8.50: Problem asks for a 99% confidence interval for the population total using given data. We are told that $n = 25$, $N = 500$, $\bar{x} = 25.7$, $s = 7.8$ and $\alpha = .01$.

Solution: Since the population size is exactly 20 times the sample size, the Instructor's Solution Manual uses a finite population correction, though it is not really necessary. We use the formula for the confidence interval when the population standard deviation is not known.
$s_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{7.8}{\sqrt{25}}\sqrt{\frac{500-25}{500-1}} = 1.5220$ and $t_{\alpha/2}^{(n-1)} = t_{.005}^{(24)} = 2.797$, so
$\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}} = 25.7 \pm 2.797(1.5220) = 25.7 \pm 4.26$, or 21.44 to 29.96.
If we want a confidence interval for the population total, we can multiply our results by $N = 500$. If you really need an equation, you can write $N\mu = N\bar{x} \pm N\,t_{\alpha/2}^{(n-1)}\,s_{\bar{x}}$. If we just multiply the numbers by 500, we get 10720 to 14980. The Instructor's Solution Manual, using a slightly more accurate value of $t$, gets
$N\bar{x} \pm N\,t\,\frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 500(25.7) \pm 500(2.7969)\frac{7.8}{\sqrt{25}}\sqrt{\frac{500-25}{500-1}}$, or $10,721.53 ≤ Population Total ≤ $14,978.47.

Exercise 8.96 on CD [8.95 in 9th edition] (8.93 in 8th Edition): A stationery store wants to estimate the mean retail value of greeting cards that it has in its 300-card inventory. (This version uses a corrected value of $t$.)
a.
Set up a 95% confidence interval estimate of the population mean value of all greeting cards that are in its inventory if a random sample of 20 greeting cards selected without replacement indicates an average value of $1.67 and a standard deviation of $0.32.
b. What is your answer to (a) if the store has 500 greeting cards in its inventory?

Solution:
a) Since $n = 20$ is more than 5% of $N = 300$, we need a finite population correction factor.
$s_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{0.32}{\sqrt{20}}\sqrt{\frac{300-20}{300-1}} = 0.07155\sqrt{0.9364548} = 0.07155(0.9677) = 0.069243$ and $t_{\alpha/2}^{(n-1)} = t_{.025}^{(19)} = 2.093$, so
$\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}} = 1.67 \pm 2.093(0.069243) = 1.67 \pm 0.145$, or $1.52 to $1.82 if the error term is rounded up to 0.15. The 10th edition answer book gets $1.53 to $1.81, which is what rounding the endpoints 1.525 and 1.815 directly gives.

b) Since $n = 20$ is not more than 5% of $N = 500$, we do not need a finite population correction factor.
$s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{0.32}{\sqrt{20}} = 0.07155$ and $t_{\alpha/2}^{(n-1)} = t_{.025}^{(19)} = 2.093$, so
$\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}} = 1.67 \pm 2.093(0.07155) = 1.67 \pm 0.15$, or $1.52 to $1.82.
If we are extremely cautious and use the finite population correction factor anyway,
$s_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{0.32}{\sqrt{20}}\sqrt{\frac{500-20}{500-1}} = 0.07155\sqrt{0.9619238} = 0.07155(0.9808) = 0.070175$ and $t_{\alpha/2}^{(n-1)} = t_{.025}^{(19)} = 2.093$, so
$\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}} = 1.67 \pm 2.093(0.070175) = 1.67 \pm 0.15$, or $1.52 to $1.82.
Check these results; there is no answer given by the author!

PROBLEM A3: In a study of a grain market in an African country we want to figure out how large a sample we must take to find a daily average price for a grain transaction. (Assume a standard deviation of 5 cents.)
a. We want a 99% confidence interval for the mean with an error of ±1 cent.
b. What if the error is to be ±1/2 cent?

Solution: We use the formula $n = \frac{z^2\sigma^2}{e^2}$, where $z = z_{\alpha/2} = z_{.005}$ since $\alpha = .01$.
a) We are told that the maximum error must be $e = 1$ (or $e = .01$) and that $\sigma = 5$ (or $.05$). From the t table, $z_{.005} = 2.576$ so that $n = \frac{z^2\sigma^2}{e^2} = \frac{(2.576)^2(5)^2}{1^2} = 165.89$. Since we always round this quantity up, use a sample size of at least 166. Note that if we use $n = 165$, we find that $\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 90 \pm 2.576\frac{5}{\sqrt{165}} = 90 \pm 1.003$.
The error term $z_{\alpha/2}\frac{\sigma}{\sqrt{165}}$ will (if we assume that $\bar{x} = 90$) be slightly above 1. However, if we use $n = 166$, $\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 90 \pm 2.576\frac{5}{\sqrt{166}} = 90 \pm 1.000$.
b) This time the maximum allowable error is $e = 0.5$, so $n = \frac{z^2\sigma^2}{e^2} = \frac{(2.576)^2(5)^2}{(0.5)^2} = 663.57$ and we must use a sample size of 664. Note that this sample size is four times the size in part a.

Exercise 8.38 (8.36 in 8th and 9th edition):
a) The problem wants a bound $e = 50$ on the error term, when $\sigma = 400$.
b) It now asks for the same result with a bound on the error term of $e = 25$.

Solution: We use the formula for $n$, $n = \frac{z^2\sigma^2}{e^2}$, and since $\alpha = .05$, $z = z_{\alpha/2} = z_{.025} = 1.960$.
a) $e = 50$ and $n = \frac{z^2\sigma^2}{e^2} = \frac{(1.960)^2(400)^2}{50^2} = 245.862$. Since this quantity is always rounded up, we use a sample size of 246 or more.
b) The only change is that now $e = 25$. On the basis of problem A3b), we expect our answer will be four times as large. $n = \frac{z^2\sigma^2}{e^2} = \frac{(1.960)^2(400)^2}{25^2} = 983.450$. Use $n = 984$ or larger.
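The sample size calculations in Problem A3 and Exercise 8.38 all follow the same pattern, which the short sketch below reproduces. The function name `sample_size_for_mean` is invented for illustration, and the critical values are copied from the table rather than computed.

```python
import math

def sample_size_for_mean(z, sigma, e):
    """Smallest whole n with z * sigma / sqrt(n) <= e,
    that is, n = z^2 * sigma^2 / e^2 rounded up."""
    return math.ceil((z * sigma / e) ** 2)

# Problem A3: 99% confidence, sigma = 5 cents
print(sample_size_for_mean(2.576, 5, 1))     # part a, e = 1 cent
print(sample_size_for_mean(2.576, 5, 0.5))   # part b, e = 1/2 cent

# Exercise 8.38: 95% confidence, sigma = 400
print(sample_size_for_mean(1.960, 400, 50))  # part a, e = 50
print(sample_size_for_mean(1.960, 400, 25))  # part b, e = 25
```

Because $e$ enters the formula squared, halving the allowable error multiplies the unrounded sample size by four, which is why the part b answers are about four times the part a answers.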