CHAPTER 5 Sampling Distributions

5.1 The possible values of $\hat{p}$ are 0, 1/3, 2/3, and 1. These correspond to getting 0 persons with lung cancer, 1 with lung cancer, 2 with lung cancer, and all 3 with lung cancer.

•5.2 (a) $\Pr\{\hat{p} = 0\} = \Pr\{\text{no mutants}\} = \Pr\{\text{all are non-mutants}\} = (1 - .39)^{3} = .227$.
(b) $\Pr\{\hat{p} = 1/3\} = \Pr\{\text{1 mutant}\} = {}_{3}C_{1}\,p^{1}(1 - p)^{2}$, where $p = .39$. This is $(3)(.39^{1})(.61^{2}) = .435$.

5.3 (a) (i) .08; (ii) .27; (iii) .35; (iv) .22; (v) .07; (vi) .01
(b) [Histogram of the sampling distribution: probability (0 to .40) against $\hat{p}$ (0, .2, .4, .6, .8, 1.0).]

•5.4 We are concerned with the sampling distribution of $\hat{p}$, which is governed by a binomial distribution. Letting "success" = "responder," we have $p = .2$ and $1 - p = .8$. The number of trials is $n = 15$.
(a) The event $\hat{p} = .2$ occurs if there are 3 successes in the 15 trials (because 3/15 = .2). Thus, to find the probability that $\hat{p} = .2$, we can use the binomial formula ${}_{n}C_{j}\,p^{j}(1 - p)^{n - j}$ with $j = 3$, so $n - j = 12$:
$\Pr\{\hat{p} = .2\} = {}_{15}C_{3}\,p^{3}(1 - p)^{12} = (455)(.2^{3})(.8^{12}) = .2501$.
(b) The event $\hat{p} = 0$ occurs if there are 0 successes in the 15 trials (because 0/15 = 0). Thus, to find the probability that $\hat{p} = 0$, we can use the binomial formula with $j = 0$, so $n - j = 15$:
$\Pr\{\hat{p} = 0\} = {}_{15}C_{0}\,p^{0}(1 - p)^{15} = (1)(1)(.8^{15}) = .0352$.

•5.5 (a) Letting "success" = "infected," we have $p = .25$ and $1 - p = .75$. The number of trials is $n = 4$. We then use the binomial formula ${}_{n}C_{j}\,p^{j}(1 - p)^{n - j}$ with $n = 4$ and $p = .25$.
The values of $\hat{p}$ correspond to numbers of successes and failures as follows:

$\hat{p}$                     0 = 0/4   .25 = 1/4   .50 = 2/4   .75 = 3/4   1 = 4/4
Number of successes (j)       0         1           2           3           4
Number of failures (n - j)    4         3           2           1           0

Thus, we find
(i) $\Pr\{\hat{p} = 0\} = {}_{4}C_{0}\,p^{0}(1 - p)^{4} = (1)(1)(.75^{4}) = .3164$
(ii) $\Pr\{\hat{p} = .25\} = {}_{4}C_{1}\,p^{1}(1 - p)^{3} = (4)(.25)(.75^{3}) = .4219$
(iii) $\Pr\{\hat{p} = .50\} = {}_{4}C_{2}\,p^{2}(1 - p)^{2} = (6)(.25^{2})(.75^{2}) = .2109$
(iv) $\Pr\{\hat{p} = .75\} = {}_{4}C_{3}\,p^{3}(1 - p)^{1} = (4)(.25^{3})(.75) = .0469$
(v) $\Pr\{\hat{p} = 1\} = {}_{4}C_{4}\,p^{4}(1 - p)^{0} = (1)(.25^{4})(1) = .0039$
(b) The distribution is displayed in the following histogram:
[Histogram: probability (0 to .50) against $\hat{p}$ (0, .25, .5, .75, 1.0).]

5.6 (a)
$\hat{p}$    Probability
.000    $.75^{8} = .1001$
.125    $(8)(.25)(.75^{7}) = .2670$
.250    $(28)(.25^{2})(.75^{6}) = .3115$
.375    $(56)(.25^{3})(.75^{5}) = .2076$
.500    $(70)(.25^{4})(.75^{4}) = .0865$
.625    $(56)(.25^{5})(.75^{3}) = .0231$
.750    $(28)(.25^{6})(.75^{2}) = .0038$
.875    $(8)(.25^{7})(.75) = .0004$
1.000   $.25^{8} = .0000$
(b) [Histograms of the sampling distributions of $\hat{p}$ for n = 4 and n = 8: probability against $\hat{p}$ (0, .25, .5, .75, 1.0).]
The distribution for n = 8 is narrower than the distribution for n = 4.

5.7 (a) $(252)(.6^{5})(.4^{5}) = .2007$
(b) $(210)(.6^{6})(.4^{4}) = .2508$
(c) $(120)(.6^{7})(.4^{3}) = .2150$
(d) .2007 + .2508 + .2150 = .6665
(e) .6665 (from part (d))

5.8 (a)
$\hat{p}$    Probability
.0     $.7^{5} = .1681$
.2     $(5)(.3)(.7^{4}) = .3602$
.4     $(10)(.3^{2})(.7^{3}) = .3087$
.6     $(10)(.3^{3})(.7^{2}) = .1323$
.8     $(5)(.3^{4})(.7) = .0284$
1.0    $.3^{5} = .0024$
(b) [Histogram: probability (0 to .40) against $\hat{p}$ (0, .2, .4, .6, .8, 1.0).]
Compared with Figure 5.4, this distribution is more spread out (more dispersed) and is more skewed.
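The tables in Exercises 5.5, 5.6, and 5.8 all come from the same binomial formula, so they can be checked numerically. The following is a minimal sketch, not part of the manual's method; the function name `phat_distribution` is a hypothetical helper introduced here for illustration, and it relies only on `math.comb` from the Python standard library.

```python
from math import comb


def phat_distribution(n, p):
    """Sampling distribution of p-hat for n independent trials.

    Returns a dict mapping each possible value j/n to its binomial
    probability nCj * p^j * (1 - p)^(n - j).
    (Hypothetical helper name, used only for this illustration.)
    """
    return {j / n: comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(n + 1)}


# Exercise 5.5 uses n = 4 and Exercise 5.6 uses n = 8, both with p = .25.
for n in (4, 8):
    print(f"n = {n}")
    for p_hat, prob in phat_distribution(n, 0.25).items():
        print(f"  p_hat = {p_hat:.3f}   probability = {prob:.4f}")
```

Calling `phat_distribution(5, 0.3)` gives the corresponding table for Exercise 5.8.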
•5.9 Because $p = .40$, the event E occurs if $\hat{p}$ is within ±.05 of .40; this happens if there are 7, 8, or 9 successes, as follows:

Number of successes (j)    7     8     9
$\hat{p}$                  .35   .40   .45

We can calculate the probabilities of these outcomes using the binomial formula with $n = 20$ and $p = .4$:
$\Pr\{\hat{p} = .35\} = {}_{20}C_{7}\,p^{7}(1 - p)^{13} = (77{,}520)(.4^{7})(.6^{13}) = .1659$
$\Pr\{\hat{p} = .40\} = {}_{20}C_{8}\,p^{8}(1 - p)^{12} = (125{,}970)(.4^{8})(.6^{12}) = .1797$
$\Pr\{\hat{p} = .45\} = {}_{20}C_{9}\,p^{9}(1 - p)^{11} = (167{,}960)(.4^{9})(.6^{11}) = .1597$
Finally, we calculate Pr{E} by adding these results:
Pr{E} = .1659 + .1797 + .1597 = .5053.

5.10 The sample percentage, $\hat{p}$, of students who smoke varies from one sample to the next. The sampling distribution of the sample percentage is the distribution of $\hat{p}$ -- the proportion of smokers in a sample -- across repeated samples. That is, the sampling distribution of the sample percentage is the distribution of sample percentages of smokers in samples of size 10.

5.11 - 5.13 See Section III of this Manual.

5.14 Under the proposed sampling scheme, the chance that an ellipse will be selected is proportional to its area. The scheme is biased toward larger ellipses, and will thus tend to produce a $\bar{y}$ that is too large.

•5.15 (a) In the population, $\mu = 176$ and $\sigma = 30$.
For $y = 186$, $z = \frac{y - \mu}{\sigma} = \frac{186 - 176}{30} = .33$. From Table 3, the area below .33 is .6293.
For $y = 166$, $z = \frac{y - \mu}{\sigma} = \frac{166 - 176}{30} = -.33$. From Table 3, the area below -.33 is .3707.
Thus, the percentage with $166 \le y \le 186$ is .6293 - .3707 = .2586, or 25.86%.
(b) We are concerned with the sampling distribution of $\bar{Y}$ for $n = 9$. From Theorem 5.1, the mean of the sampling distribution of $\bar{Y}$ is
$\mu_{\bar{Y}} = \mu = 176$,
the standard deviation is
$\sigma_{\bar{Y}} = \frac{\sigma}{\sqrt{n}} = \frac{30}{\sqrt{9}} = 10$,
and the shape of the distribution is normal because the population distribution is normal (part 3a of Theorem 5.1). We need to find the shaded area in the figure.
For $\bar{y} = 186$, $z = \frac{\bar{y} - \mu_{\bar{Y}}}{\sigma_{\bar{Y}}} = \frac{186 - 176}{10} = 1.00$. From Table 3, the area below 1.00 is .8413.
For $\bar{y} = 166$, $z = \frac{\bar{y} - \mu_{\bar{Y}}}{\sigma_{\bar{Y}}} = \frac{166 - 176}{10} = -1.00$. From Table 3, the area below -1.00 is .1587.
Thus, the percentage with $166 \le \bar{y} \le 186$ is .8413 - .1587 = .6826, or 68.26%.
(c) The probability of an event can be interpreted as the long-run relative frequency of occurrence of the event (Section 3.3). Thus, the question in part (c) is just a rephrasing of the question in part (b). It follows from part (b) that $\Pr\{166 \le \bar{Y} \le 186\} = .6826$.

•5.16 (a) $\mu = 3000$; $\sigma = 400$.
The event E occurs if $\bar{Y}$ is between 2900 and 3100. We are concerned with the sampling distribution of $\bar{Y}$ for $n = 15$. From Theorem 5.1, the mean of the sampling distribution of $\bar{Y}$ is
$\mu_{\bar{Y}} = \mu = 3000$,
the standard deviation is
$\sigma_{\bar{Y}} = \frac{\sigma}{\sqrt{n}} = \frac{400}{\sqrt{15}} = 103.3$,
and the shape of the distribution is normal because the population distribution is normal (part 3a of Theorem 5.1).
For $\bar{y} = 3100$, $z = \frac{\bar{y} - \mu_{\bar{Y}}}{\sigma_{\bar{Y}}} = \frac{3100 - 3000}{103.3} = .97$. From Table 3, the area below .97 is .8340.
For $\bar{y} = 2900$, $z = \frac{\bar{y} - \mu_{\bar{Y}}}{\sigma_{\bar{Y}}} = \frac{2900 - 3000}{103.3} = -.97$. From Table 3, the area below -.97 is .1660.
Thus, $\Pr\{2900 \le \bar{Y} \le 3100\} = \Pr\{E\} = .8340 - .1660 = .6680$.
(b) $n = 60$; $\sigma_{\bar{Y}} = 400/\sqrt{60} = 51.64$.
$z = \frac{\pm 100}{51.64} = \pm 1.94$; Table 3 gives .9738 and .0262, so Pr{E} = .9738 - .0262 = .9476.
(c) As n increases, Pr{E} increases.

5.17 $\sigma_{\bar{Y}} = 400/\sqrt{15} = 103.3$
(a) $z = \frac{2900 - 2800}{103.3} = .97$. From Table 3, the area below .97 is .8340.
$z = \frac{2700 - 2800}{103.3} = -.97$. From Table 3, the area below -.97 is .1660.
Thus, Pr{E} = .8340 - .1660 = .6680
(b) $z = \frac{2700 - 2600}{103.3} = .97$. From Table 3, the area below .97 is .8340.
$z = \frac{2500 - 2600}{103.3} = -.97$. From Table 3, the area below -.97 is .1660.
Thus, Pr{E} = .8340 - .1660 = .6680
(c) For fixed n and σ, Pr{E} does not depend on µ.

5.18 $\mu = 145$; $\sigma = 22$.
(a) $z = \frac{155 - 145}{22} = .45$; Table 3 gives .6736.
$z = \frac{135 - 145}{22} = -.45$; Table 3 gives .3264.
Thus, .6736 - .3264 = .3472 or 34.72% of the plants.
(b) $n = 16$; $\sigma_{\bar{Y}} = 22/\sqrt{16} = 5.5$.
$z = \frac{155 - 145}{5.5} = 1.82$; Table 3 gives .9656.
$z = \frac{135 - 145}{5.5} = -1.82$; Table 3 gives .0344.
Thus, .9656 - .0344 = .9312 or 93.12% of the groups.
(c) $\Pr\{135 \le \bar{Y} \le 155\} = .9312$ (from part (b)).
(d) $n = 36$; $\sigma_{\bar{Y}} = 22/\sqrt{36} = 3.67$.
$z = \frac{155 - 145}{3.67} = 2.72$; Table 3 gives .9967.
$z = \frac{135 - 145}{3.67} = -2.72$; Table 3 gives .0033.
Thus, .9967 - .0033 = .9934 or 99.34% of the groups.

5.19 (a) $\sigma_{\bar{Y}} = 1.4/\sqrt{25} = .28$;
$z = \frac{5 - 4.2}{.28} = 2.86$; Table 3 gives .9979.
$z = \frac{4 - 4.2}{.28} = -.71$; Table 3 gives .2389.
$\Pr\{4 \le \bar{Y} \le 5\} = .9979 - .2389 = .7590$.
(b) The answer is approximately correct because the Central Limit Theorem says that the sampling distribution of $\bar{Y}$ is approximately normal if n is large. The same approach is not valid for n = 2, because the Central Limit Theorem does not apply when the sample size is small.

•5.20 (a) In the population, 65.68% of the fish are between 51 and 60 mm long. To find the probability that four randomly chosen fish are all between 51 and 60 mm long, we let "success" be "between 51 and 60 mm long" and use the binomial distribution with $n = 4$ and $p = .6568$, as follows:
$\Pr\{\text{all 4 are between 51 and 60}\} = {}_{4}C_{4}\,p^{4}(1 - p)^{0} = (1)(.6568^{4})(1) = .1861$.
(b) The mean length of four randomly chosen fish is $\bar{Y}$. Thus, we are concerned with the sampling distribution of $\bar{Y}$ for a sample of size $n = 4$ from a population with $\mu = 54$ and $\sigma = 4.5$. From Theorem 5.1, the mean of the sampling distribution of $\bar{Y}$ is
$\mu_{\bar{Y}} = \mu = 54$,
the standard deviation is
$\sigma_{\bar{Y}} = \frac{\sigma}{\sqrt{n}} = \frac{4.5}{\sqrt{4}} = 2.25$,
and the shape of the distribution is normal because the population distribution is normal (part 3a of Theorem 5.1).
For $\bar{y} = 60$, $z = \frac{\bar{y} - \mu_{\bar{Y}}}{\sigma_{\bar{Y}}} = \frac{60 - 54}{2.25} = 2.67$. From Table 3, the area below 2.67 is .9962.
For $\bar{y} = 51$, $z = \frac{\bar{y} - \mu_{\bar{Y}}}{\sigma_{\bar{Y}}} = \frac{51 - 54}{2.25} = -1.33$. From Table 3, the area below -1.33 is .0918.
Thus, $\Pr\{51 \le \bar{Y} \le 60\} = .9962 - .0918 = .9044$.

5.21 Let $E_1$ be the event that all four fish are between 51 and 60 mm long and let $E_2$ be the event that $\bar{Y}$ is between 51 and 60 mm long.
If $E_1$ occurs, then $E_2$ must also occur -- the mean of four numbers, each of which is between 51 and 60, must be between 51 and 60 -- but $E_2$ can occur without $E_1$ occurring. Thus, in the long run, $E_2$ will happen more often than $E_1$, which shows that $\Pr\{E_2\} > \Pr\{E_1\}$.

5.22 $\mu_{\bar{Y}} = 50$ and $\sigma_{\bar{Y}} = \sigma/\sqrt{n} = 9/\sqrt{n}$.
An area of .68 corresponds to ±1 on the z scale; therefore
$z = 1.0 = \frac{51.5 - 50}{9/\sqrt{n}}$,
so that $9/\sqrt{n} = 1.5$, which yields $n = 36$.

•5.23 The distribution of repeated assays of the patient's specimen is a normal distribution with mean $\mu = 35$ (the true concentration) and standard deviation $\sigma = 4$.
(a) The result of a single assay is like a random observation Y from the population of assays. A value $Y \ge 40$ will be flagged as "unusually high."
For $y = 40$, $z = \frac{y - \mu}{\sigma} = \frac{40 - 35}{4} = 1.25$. From Table 3, the area below 1.25 is .8944, so the area beyond 1.25 is 1 - .8944 = .1056. Thus, Pr{specimen will be flagged as "unusually high"} = .1056.
(b) The reported value is the mean of three independent assays, which is like the mean $\bar{Y}$ of a sample of size $n = 3$ from the population of assays. A value $\bar{Y} \ge 40$ will be flagged as "unusually high." We are concerned with the sampling distribution of $\bar{Y}$ for a sample of size $n = 3$ from a population with mean $\mu = 35$ and standard deviation $\sigma = 4$. From Theorem 5.1, the mean of the sampling distribution of $\bar{Y}$ is
$\mu_{\bar{Y}} = \mu = 35$,
the standard deviation is
$\sigma_{\bar{Y}} = \frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{3}} = 2.309$,
and the shape of the distribution is normal because the population distribution is normal (part 3a of Theorem 5.1).
For $\bar{y} = 40$, $z = \frac{\bar{y} - \mu_{\bar{Y}}}{\sigma_{\bar{Y}}} = \frac{40 - 35}{2.309} = 2.17$. From Table 3, the area below 2.17 is .9850, so the area beyond 2.17 is 1 - .9850 = .0150. Thus, Pr{mean of three assays will be flagged as "unusually high"} = .0150.

5.24 (a) $\mu_{\bar{Y}} = \mu = 41.5$.
(b) $\sigma_{\bar{Y}} = 4.7/\sqrt{4} = 2.35$

5.25 (a) Because the sample size of 2 is small, we would expect the histogram of the sample means to be skewed to the right, as is the histogram of the data.
However, the histogram of the sample means will be somewhat symmetric (more so than the histogram of the data).
(b) Because the sample size of 25 is fairly large, we would expect the histogram to have a bell shape.

5.26 The sample mean is just an individual observation when n = 1. Thus, the histogram of the sample means will be the same as the histogram of the data (and therefore be skewed to the right).

5.27 No. The histogram shows the distribution of observations in the sample. Such a distribution would look more like the population distribution for n = 400 than for n = 100, and the population distribution is apparently rather skewed. The Central Limit Theorem applies to the sampling distribution of $\bar{Y}$, which is not what is shown in the histogram.

5.28 $\mu_{\bar{Y}} = 38$ and $\sigma_{\bar{Y}} = 9/\sqrt{25} = 1.8$.
(a) $z = \frac{36 - 38}{1.8} = -1.11$. Table 3 gives .1335, so $\Pr\{\bar{Y} > 36\} = 1 - .1335 = .8665$.
(b) $z = \frac{41 - 38}{1.8} = 1.67$. Table 3 gives .9525, so $\Pr\{\bar{Y} > 41\} = 1 - .9525 = .0475$.

•5.29 For each thrust, the probability is .9 that the thrust is good and the probability is .1 that the thrust is fumbled. Letting "success" = "good thrust," and assuming that the thrusts are independent, we apply the binomial formula with $n = 4$ and $p = .9$.
(a) The area under the first peak is approximately equal to the probability that all four thrusts are good. To find this probability, we set $j = 4$; thus, the area is approximately
${}_{4}C_{4}\,p^{4}(1 - p)^{0} = (1)(.9^{4})(1) = .66$.
(b) The area under the second peak is approximately equal to the probability that three thrusts are good and one is fumbled. To find this probability, we set $j = 3$; thus, the area is approximately
${}_{4}C_{3}\,p^{3}(1 - p)^{1} = (4)(.9^{3})(.1) = .29$.

5.30 (a) The first peak is at 115, the second peak is at $\tfrac{1}{2}(115 + 450) = 282.5$, and the third peak is at 450.
(b) First peak: $.9^{2} = .81$
Second peak: $(2)(.9)(.1) = .18$
Third peak: $.1^{2} = .01$

5.31 When n = 1 the sample mean is just an individual observation.
Thus, the sampling distribution of the sample mean is the same as the distribution of the individual time scores, as shown in Figure 5.15. There are two peaks, one at 115 ms and one at 450 ms.

•5.32 Letting "success" = "heads," the probability of ten heads and ten tails is determined by the binomial distribution with $n = 20$ and $p = .5$.
(a) We apply the binomial formula with $j = 10$:
$\Pr\{\text{10 heads, 10 tails}\} = {}_{20}C_{10}\,p^{10}(1 - p)^{10} = (184{,}756)(.5^{10})(.5^{10}) = .1762$.
(b) According to part (a) of Theorem 5.2, the binomial distribution can be approximated by a normal distribution with
mean = $np = (20)(.5) = 10$ and
standard deviation = $\sqrt{np(1 - p)} = \sqrt{(20)(.5)(.5)} = 2.236$.
Applying continuity correction, we wish to find the area under the normal curve between 10 - .5 = 9.5 and 10 + .5 = 10.5. The desired area is shaded in the figure.
The boundary 10.5 corresponds to $z = \frac{10.5 - 10}{2.236} = .22$. From Table 3, the area below .22 is .5871.
The boundary 9.5 corresponds to $z = \frac{9.5 - 10}{2.236} = -.22$. From Table 3, the area below -.22 is .4129.
Thus, the normal approximation to the binomial probability is
$\Pr\{\text{10 heads, 10 tails}\} \approx .5871 - .4129 = .1742$.

5.33 Letting "success" = "type O blood," the probability that 6 of the persons will have type O blood is determined by the binomial distribution with $n = 12$ and $p = .44$.
(a) We apply the binomial formula with $j = 6$:
$\Pr\{\text{6 type O blood}\} = {}_{12}C_{6}\,p^{6}(1 - p)^{6} = (924)(.44^{6})(.56^{6}) = .2068$.
(b) According to part (a) of Theorem 5.2, the binomial distribution can be approximated by a normal distribution with
mean = $np = (12)(.44) = 5.28$ and
standard deviation = $\sqrt{np(1 - p)} = \sqrt{(12)(.44)(.56)} = 1.72$.
Applying continuity correction, we wish to find the area under the normal curve between 6 - .5 = 5.5 and 6 + .5 = 6.5. Thus,
$\Pr\{\text{6 type O blood}\} \approx \Pr\left\{\frac{5.5 - 5.28}{1.72} < Z < \frac{6.5 - 5.28}{1.72}\right\} = \Pr\{.13 < Z < .71\} = .7611 - .5517 = .2094$.

•5.34 (a) Because $p = .12$, the event that $\hat{p}$ will be within ±.03 of p is the event $.09 \le \hat{p} \le .15$, which, if $n = 100$, is equivalent to the event
9 ≤ number of successes ≤ 15.
Letting "success" = "oral contraceptive user," the probability of this event is determined by the binomial distribution with
mean = $np = (100)(.12) = 12$ and
standard deviation = $\sqrt{np(1 - p)} = \sqrt{(100)(.12)(.88)} = 3.250$.
Applying continuity correction, we wish to find the area under the normal curve between 9 - .5 = 8.5 and 15 + .5 = 15.5. The desired area is shaded in the figure.
The boundary 15.5 corresponds to $z = \frac{15.5 - 12}{3.250} = 1.08$. From Table 3, the area below 1.08 is .8599.
The boundary 8.5 corresponds to $z = \frac{8.5 - 12}{3.250} = -1.08$. From Table 3, the area below -1.08 is .1401.
Thus, the normal approximation to the binomial probability is
$\Pr\{\hat{p} \text{ will be within } \pm .03 \text{ of } p\} \approx .8599 - .1401 = .7198$.
(Note: An alternative method of solution is to use part (b) of Theorem 5.2 rather than part (a). Such a method is illustrated in the solutions to Exercises 5.37 and 5.48.)
(b) With $n = 200$, $\hat{p}$ is within ±.03 of p if and only if the number of successes is between (200)(.09) = 18 and (200)(.15) = 30. The mean is (200)(.12) = 24 and the standard deviation is $\sqrt{(200)(.12)(.88)} = 4.60$. Applying continuity correction, we wish to find the area under the normal curve between 18 - .5 = 17.5 and 30 + .5 = 30.5.
$z = \frac{30.5 - 24}{4.60} = 1.41$; Table 3 gives .9207.
$z = \frac{17.5 - 24}{4.60} = -1.41$; Table 3 gives .0793.
.9207 - .0793 = .8414.

5.35 (b) $p = .5$
For n = 45, 60% boys means 27 boys. For the normal approximation to the binomial, the mean is $np = (45)(.5) = 22.5$ and the SD is $\sqrt{np(1 - p)} = \sqrt{(45)(.5)(.5)} = 3.354$.
$z = \frac{27 - 22.5}{3.354} = 1.34$; Table 3 gives .9099.
1 - .9099 = .0901.
For n = 15, 60% boys means 9 boys. For the normal approximation to the binomial, the mean is $np = (15)(.5) = 7.5$ and the SD is $\sqrt{np(1 - p)} = \sqrt{(15)(.5)(.5)} = 1.936$.
$z = \frac{9 - 7.5}{1.936} = .77$; Table 3 gives .7794.
1 - .7794 = .2206.
In the larger hospital, 9% of days have 60% or more boys. In the smaller hospital, 22% of days have 60% or more boys. The smaller hospital recorded more such days.

5.36 $p \pm .05$ is .25 to .35. The normal approximation to the sampling distribution of $\hat{p}$ has mean $p = .3$ and
standard deviation $\sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{(.3)(.7)}{400}} = .02291$.
$z = \frac{.35 - .3}{.02291} = 2.18$; Table 3 gives .9854.
$z = \frac{.25 - .3}{.02291} = -2.18$; Table 3 gives .0146.
.9854 - .0146 = .9708.

•5.37 (a) Because $p = .3$, the event E, that $\hat{p}$ will be within ±.05 of p, is equivalent to $.25 \le \hat{p} \le .35$. The sample size is $n = 40$. According to part (b) of Theorem 5.2, the sampling distribution of $\hat{p}$ can be approximated by a normal distribution with
mean $p = .3$ and
standard deviation = $\sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{(.3)(.7)}{40}} = .07246$.
To apply continuity correction, we first calculate the half-width of a histogram bar (on the $\hat{p}$ scale) as $\left(\tfrac{1}{2}\right)\left(\tfrac{1}{40}\right) = .0125$.
Thus, we wish to find the area under the normal curve between .25 - .0125 = .2375 and .35 + .0125 = .3625. The desired area is shaded in the figure.
The boundary .3625 corresponds to $z = \frac{.3625 - .30}{.07246} = .86$. From Table 3, the area below .86 is .8051.
The boundary .2375 corresponds to $z = \frac{.2375 - .30}{.07246} = -.86$. From Table 3, the area below -.86 is .1949.
Thus, the normal approximation to the probability is
Pr{E} ≈ .8051 - .1949 = .6102.
(Note: An alternative method of solution is to use part (a) of Theorem 5.2 rather than part (b). Such a method is illustrated in the solution to Exercise 5.34.)
(b) $z = \frac{.35 - .30}{.07246} = .69$; Table 3 gives .7549.
$z = \frac{.25 - .30}{.07246} = -.69$; Table 3 gives .2451.
.7549 - .2451 = .5098.

5.38 Let E be the event that $\hat{p}$ is closer to $\tfrac{1}{2}$ than to $\tfrac{9}{16}$.
(a) n = 1. E occurs if the number of purple plants is 0. Pr{E} = 7/16 = .4375.
(b) n = 64. E occurs if the number of purple plants is less than or equal to 33.
The normal approximation to the binomial has mean $np = (64)(9/16) = 36$ and standard deviation $\sqrt{np(1 - p)} = \sqrt{(64)(9/16)(7/16)} = 3.969$.
$z = \frac{33 - 36}{3.969} = -.76$; Table 3 gives .2236 = Pr{E}.
(c) n = 320. E occurs if the number of purple plants is less than or equal to 169. The normal approximation to the binomial has mean $np = (320)(9/16) = 180$ and standard deviation $\sqrt{np(1 - p)} = \sqrt{(320)(9/16)(7/16)} = 8.874$.
$z = \frac{169 - 180}{8.874} = -1.24$; Table 3 gives .1075 = Pr{E}.

5.39 (a) $\Pr\{\text{3 heads}\} = (120)(.5^{3})(.5^{7}) = .1172$.
$\Pr\{\text{4 heads}\} = (210)(.5^{4})(.5^{6}) = .2051$.
.1172 + .2051 = .3223.
(b) For the normal approximation to the binomial, the mean is $np = (10)(.5) = 5$ and the SD is $\sqrt{np(1 - p)} = \sqrt{(10)(.5)(.5)} = 1.581$.
$z = \frac{4.5 - 5}{1.581} = -.32$; Table 3 gives .3745.
$z = \frac{2.5 - 5}{1.581} = -1.58$; Table 3 gives .0571.
.3745 - .0571 = .3174.

5.40 For the normal approximation to the binomial, the mean is $np = (100)(.8) = 80$ and the SD is $\sqrt{np(1 - p)} = \sqrt{(100)(.8)(.2)} = 4$.
(a) $z = \frac{85 - 80}{4} = 1.25$; Table 3 gives .8944.
1 - .8944 = .1056.
(b) $z = \frac{84.5 - 80}{4} = 1.13$; Table 3 gives .8708.
1 - .8708 = .1292.

5.41 For the normal approximation to the binomial, the mean is $np = (50)(.8) = 40$ and the SD is $\sqrt{np(1 - p)} = \sqrt{(50)(.8)(.2)} = 2.83$.
(a) $z = \frac{35 - 40}{2.83} = -1.77$; Table 3 gives .0384.
(b) $z = \frac{35.5 - 40}{2.83} = -1.59$; Table 3 gives .0559.

•5.42 $\mu = 88$; $\sigma = 7$.
We are concerned with the sampling distribution of $\bar{Y}$ for $n = 5$. From Theorem 5.1, the mean of the sampling distribution of $\bar{Y}$ is
$\mu_{\bar{Y}} = \mu = 88$,
the standard deviation is
$\sigma_{\bar{Y}} = \frac{\sigma}{\sqrt{n}} = \frac{7}{\sqrt{5}} = 3.13$,
and the shape of the distribution is normal because the population distribution is normal (part 3a of Theorem 5.1).
For $\bar{y} = 90$, $z = \frac{\bar{y} - \mu_{\bar{Y}}}{\sigma_{\bar{Y}}} = \frac{90 - 88}{3.13} = .64$. From Table 3, the area below .64 is .7389.
Thus, $\Pr\{\bar{Y} > 90\} = 1 - .7389 = .2611$.

5.43 (a) $(45)(.83^{8})(.17^{2}) = .2929$
(b) $(10)(.83^{9})(.17) = .3178$

5.44 (a) $z = \frac{72 - 69.7}{2.8} = .82$; Table 3 gives .7939.
1 - .7939 = .2061.
(b) (i) Using the binomial distribution, $\Pr\{\text{both are} > 72\} = .2061^{2} = .0425$.
(ii) $n = 2$; $\sigma_{\bar{Y}} = \sigma/\sqrt{n} = 2.8/\sqrt{2} = 1.980$.
$z = \frac{72 - 69.7}{1.980} = 1.16$; Table 3 gives .8770.
1 - .8770 = .1230

5.45 $\mu = 800$; $\sigma = 90$.
(a) $z = \frac{850 - 800}{90} = .56$; Table 3 gives .7123.
$z = \frac{750 - 800}{90} = -.56$; Table 3 gives .2877.
.7123 - .2877 = .4246 or 42.46% of the plants.
(b) $n = 4$; $\sigma_{\bar{Y}} = \sigma/\sqrt{n} = 90/\sqrt{4} = 45$.
$z = \frac{850 - 800}{45} = 1.11$; Table 3 gives .8665.
$z = \frac{750 - 800}{45} = -1.11$; Table 3 gives .1335.
.8665 - .1335 = .7330 or 73.30% of the groups will have means in this range.

5.46 Two possible factors are: (a) environmental variation from one pot (or location) to another; (b) competition between plants in a pot (for instance, overlapping leaves).

•5.47 We are concerned with the sampling distribution of $\hat{p}$, which is governed by a binomial distribution. Letting "success" = "adult," we have $p = .2$ and $1 - p = .8$. The number of trials is $n = 20$.
(a) The event $\hat{p} = p$ occurs if there are 4 successes in the 20 trials (because 4/20 = .2). Thus, to find the probability that $\hat{p} = p$, we can use the binomial formula with $j = 4$, so $n - j = 16$:
$\Pr\{\hat{p} = p\} = {}_{20}C_{4}\,p^{4}(1 - p)^{16} = (4{,}845)(.2^{4})(.8^{16}) = .2182$.
(b) The event $p - .05 \le \hat{p} \le p + .05$ is equivalent to the event $.15 \le \hat{p} \le .25$. This event occurs if there are 3, 4, or 5 successes in the 20 trials, as follows:

Number of successes (j)    3     4     5
$\hat{p}$                  .15   .20   .25

We can calculate the probabilities of these outcomes using the binomial formula with $n = 20$ and $p = .2$:
$\Pr\{\hat{p} = .15\} = {}_{20}C_{3}\,p^{3}(1 - p)^{17} = (1{,}140)(.2^{3})(.8^{17}) = .20536$
$\Pr\{\hat{p} = .20\} = {}_{20}C_{4}\,p^{4}(1 - p)^{16} = (4{,}845)(.2^{4})(.8^{16}) = .21820$
$\Pr\{\hat{p} = .25\} = {}_{20}C_{5}\,p^{5}(1 - p)^{15} = (15{,}504)(.2^{5})(.8^{15}) = .17456$
Thus, $\Pr\{p - .05 \le \hat{p} \le p + .05\} = \Pr\{.15 \le \hat{p} \le .25\} = .20536 + .21820 + .17456 = .59812 \approx .5981$.

•5.48 We are concerned with the sampling distribution of $\hat{p}$ for $n = 20$ and $p = .2$. According to part (b) of Theorem 5.2, this sampling distribution can be approximated by a normal distribution with
mean $p = .2$ and
standard deviation = $\sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{(.2)(.8)}{20}} = .08944$.
To apply continuity correction, we first calculate the half-width of a histogram bar (on the $\hat{p}$ scale) as $\left(\tfrac{1}{2}\right)\left(\tfrac{1}{20}\right) = .025$.
(a) We wish to find $\Pr\{\hat{p} = .2\}$. Thus, we wish to find the area under the normal curve between .2 - .025 = .175 and .2 + .025 = .225. The desired area is shaded in the figure.
Using the standard deviation .08944, the boundary .225 corresponds to $z = \frac{.225 - .200}{.08944} = .28$. From Table 3, the area below .28 is .6103.
The boundary .175 corresponds to $z = \frac{.175 - .200}{.08944} = -.28$. From Table 3, the area below -.28 is .3897.
Thus, the normal approximation to the probability is
$\Pr\{\hat{p} = p\} \approx .6103 - .3897 = .2206$.
Note that this agrees well with the exact value (.2182) found in Exercise 5.47(a).
(b) The event $p - .05 \le \hat{p} \le p + .05$ is equivalent to the event $.15 \le \hat{p} \le .25$. Thus, we wish to find the area under the normal curve between .15 - .025 = .125 and .25 + .025 = .275. The desired area is shaded in the figure.
The boundary .275 corresponds to $z = \frac{.275 - .200}{.08944} = .84$. From Table 3, the area below .84 is .7995.
The boundary .125 corresponds to $z = \frac{.125 - .200}{.08944} = -.84$. From Table 3, the area below -.84 is .2005.
Thus, the normal approximation to the probability is
$\Pr\{.15 \le \hat{p} \le .25\} \approx .7995 - .2005 = .5990$.
Note that this agrees quite well with the exact value (.5981) found in Exercise 5.47(b).

5.49 For the normal approximation to the sampling distribution of $\hat{p}$, the mean is $p = .42$ and the SD is
$\sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{(.42)(.58)}{25}} = .0987$.
Continuity correction: $\left(\tfrac{1}{2}\right)\left(\tfrac{1}{25}\right) = .02$.
$z = \frac{.46 - .42}{.0987} = .405$; Table 3 gives .6590.
1 - .6590 = .3410.

5.50 $\mu = 1{,}200$; $\sigma = 35$.
For $\Pr\{1175 \le Y \le 1225\}$:
$z = \frac{1225 - 1200}{35} = .71$; Table 3 gives .7611.
$z = \frac{1175 - 1200}{35} = -.71$; Table 3 gives .2389.
.7611 - .2389 = .5222.
For $\Pr\{1175 \le \bar{Y} \le 1225\}$, $\sigma_{\bar{Y}} = \sigma/\sqrt{n} = 35/\sqrt{6} = 14.29$.
$z = \frac{1225 - 1200}{14.29} = 1.75$; Table 3 gives .9599.
$z = \frac{1175 - 1200}{14.29} = -1.75$; Table 3 gives .0401.
.9599 - .0401 = .9198.
Comparison: .9198 > .5222; this shows that the mean of 6 counts is more precise, in that it is more likely to be near the correct value (1200) than is a single count.

•5.51 $\mu = 8.3$; $\sigma = 1.7$.
If the total weight of 10 mice is 90 gm, then their mean weight is $\frac{90}{10} = 9.0$ gm. Thus, we wish to find the percentage of litters for which $\bar{y} \ge 9.0$ gm. We are concerned with the sampling distribution of $\bar{Y}$ for $n = 10$. From Theorem 5.1, the mean of the sampling distribution of $\bar{Y}$ is
$\mu_{\bar{Y}} = \mu = 8.3$,
the standard deviation is
$\sigma_{\bar{Y}} = \frac{\sigma}{\sqrt{n}} = \frac{1.7}{\sqrt{10}} = .538$,
and the shape of the distribution is normal because the population distribution is normal (part 3a of Theorem 5.1). We need to find the shaded area in the figure.
For $\bar{y} = 9.0$, $z = \frac{\bar{y} - \mu_{\bar{Y}}}{\sigma_{\bar{Y}}} = \frac{9.0 - 8.3}{.538} = 1.30$. From Table 3, the area below 1.30 is .9032.
Thus, the percentage with $\bar{y} \ge 9.0$ is 1 - .9032 = .0968, or 9.68%.

5.52 Two possible factors are: (a) environmental and genetic differences between litters; (b) competition between mice in a litter.

5.53 (a)
$\hat{p}$    Probability
.0     $.8^{5} = .3277$
.2     $(5)(.2)(.8^{4}) = .4096$
.4     $(10)(.2^{2})(.8^{3}) = .2048$
.6     $(10)(.2^{3})(.8^{2}) = .0512$
.8     $(5)(.2^{4})(.8) = .0064$
1.0    $.2^{5} = .0003$
(b) [Histogram: probability (0 to .50) against $\hat{p}$ (0, .2, .4, .6, .8, 1.0).]

5.54 The sample average, $\bar{Y}$, of the heights of the plants varies from one sample to the next. The sampling distribution of the sample average is the distribution of $\bar{Y}$ -- the average height of plants in a sample -- across repeated samples. That is, the sampling distribution of the sample average is the distribution of sample average plant heights in samples of size 28.

5.55 $\sigma_{\bar{Y}} = \frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{28}} = 0.76$.

5.56 $\sigma_{\bar{Y}} = \frac{\sigma}{\sqrt{n}} = \frac{10}{\sqrt{64}} = 1.25$.
(a) $z = \frac{52 - 50}{1.25} = 1.6$; Table 3 gives .9452.
$z = \frac{48 - 50}{1.25} = -1.6$; Table 3 gives .0548.
.9452 - .0548 = .8904.
(b) $z = \frac{102 - 100}{1.25} = 1.6$; Table 3 gives .9452.
$z = \frac{98 - 100}{1.25} = -1.6$; Table 3 gives .0548.
.9452 - .0548 = .8904.
(c) $z = \frac{\mu + 2 - \mu}{1.25} = 1.6$; Table 3 gives .9452.
$z = \frac{\mu - 2 - \mu}{1.25} = -1.6$; Table 3 gives .0548.
.9452 - .0548 = .8904.
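The calculation in Exercise 5.56, and the many similar ones in this chapter, can be checked without Table 3. The following is a minimal sketch under stated assumptions: the helper names `normal_cdf` and `prob_mean_between` are hypothetical, the standard normal area is computed from `math.erf` rather than read from a table, and the sampling distribution of the sample mean is assumed to be normal, as in Theorem 5.1. Results may differ from table-based answers in the last digit when z is rounded to two decimals before lookup.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Area under the standard normal curve below z (what Table 3 tabulates)."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def prob_mean_between(lo, hi, mu, sigma, n):
    """Pr{lo <= Ybar <= hi} for the mean of n observations.

    Assumes the sampling distribution of Ybar is normal with mean mu
    and standard deviation sigma / sqrt(n).
    (Hypothetical helper name, used only for this illustration.)
    """
    se = sigma / sqrt(n)  # standard deviation of the sample mean
    return normal_cdf((hi - mu) / se) - normal_cdf((lo - mu) / se)


# Exercise 5.56: sigma = 10 and n = 64, so the standard deviation of Ybar is 1.25.
# The probability of landing within 2 units of mu is the same for every mu.
for mu in (50, 100):
    print(mu, round(prob_mean_between(mu - 2, mu + 2, mu, 10, 64), 4))
```

Both lines print 0.8904, matching parts (a) and (b) and illustrating the point of part (c): for fixed n and σ the probability does not depend on µ.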