Statistics, Test 3. Name: KEY

1. Find the area under the standard normal distribution curve for each.
   (a) Between z = −0.19 and z = 1.23.
       normalCdf(−0.19, 1.23) ≈ 0.4660
   (b) To the left of z = −1.56.
       normalCdf(−100, −1.56) ≈ 0.0594
   (c) To the right of z = −0.38.
       normalCdf(−0.38, 100) ≈ 0.6480

2. Find each probability using the standard normal distribution.
   (a) P(−0.09 < z < 2.42) = normalCdf(−0.09, 2.42) ≈ 0.5281
   (b) P(z > −1.68) = normalCdf(−1.68, 100) ≈ 0.9535
   (c) P(z < 0.23) = normalCdf(−100, 0.23) ≈ 0.5910

3. Find the indicated z score. The graph depicts the standard normal distribution with mean 0 and standard deviation 1.
   (a) z = invnorm(0.8980) ≈ 1.27
   (b) z = invnorm(1 − 0.2873) ≈ 0.56

4. Find the critical value of z.
   (a) z_0.12 = invnorm(1 − 0.12) ≈ 1.17
   (b) z_0.08 = invnorm(1 − 0.08) ≈ 1.41

5. Find the critical value of z that represents the 45th percentile.
   z = invnorm(0.45) ≈ −0.13

6. The waiting time in line at a Starschmuchs Coffee is normally distributed with a mean of 3.2 minutes and a standard deviation of 1.3 minutes. Find the probability that a randomly selected customer has to wait
   (a) less than 1 minute.
       µ = 3.2 and σ = 1.3. Let X = the continuous random variable (CRV) representing a randomly selected wait time. We convert the distribution of X values to its distribution of z scores and find the corresponding area under the standard(ized) normal curve.
       [Figure: normal distribution of wait times, X; standard normal distribution, z]
       P(X < 1) = P(z < (X − µ)/σ)
                = P(z < (1 − 3.2)/1.3)
                ≈ P(z < −1.69)
                = normalCdf(−100, −1.69)
                ≈ 0.0455
       Alternatively, we could just find the area under the distribution of wait times, X, with
       P(X < 1) = normalCdf(−∞, X, µ, σ) = normalCdf(−10^10, 1, 3.2, 1.3) ≈ 0.0453
   (b) more than 2 minutes.
       P(X > 2) = P(z > (X − µ)/σ)
                = P(z > (2 − 3.2)/1.3)
                ≈ P(z > −0.92)
                = normalCdf(−0.92, 100)
                ≈ 0.8212
       [Figure: normal distribution of wait times, X; standard normal distribution, z]
       Alternatively, we could just find the area under the distribution of wait times, X, with
       P(X > 2) = normalCdf(X, ∞, µ, σ) = normalCdf(2, 10^10, 3.2, 1.3) ≈ 0.8220
   (c) between 0.75 minutes and 2 minutes.
       [Figure: normal distribution of wait times, X; standard normal distribution, z]
       P(0.75 < X < 2) = P((0.75 − 3.2)/1.3 < z < (2 − 3.2)/1.3)
                       ≈ P(−1.88 < z < −0.92)
                       = normalCdf(−1.88, −0.92)
                       ≈ 0.1487
       Alternatively, working directly with the distribution of wait times, X,
       P(0.75 < X < 2) = normalCdf(0.75, 2, 3.2, 1.3) ≈ 0.1482

7. The average yearly precipitation in San Diego is 9.62 inches with a standard deviation of 4.42 inches, and precipitation amounts are normally distributed.
   (a) Find the probability that a randomly selected year will have precipitation greater than 12 inches.
       µ = 9.62 in. and σ = 4.42 in. Let X = the continuous random variable (CRV) representing a randomly selected yearly precipitation amount.
       [Figure: normal distribution of precipitations, X; standard normal distribution, z]
       P(X > 12) = P(z > (X − µ)/σ)
                 = P(z > (12 − 9.62)/4.42)
                 ≈ P(z > 0.54)
                 = normalCdf(0.54, 100)
                 ≈ 0.2946
       Alternatively, we could just find the area under the distribution of precipitations, X, with
       P(X > 12) = normalCdf(X, ∞, µ, σ) = normalCdf(12, 10^10, 9.62, 4.42) ≈ 0.2951
   (b) Find the probability that five randomly selected years will have an average precipitation greater than 8 inches.
       Let X̄ = the continuous random variable (CRV) representing a randomly selected sample mean yearly precipitation amount. Then µ_X̄ = 9.62 and σ_X̄ = 4.42/√5 ≈ 1.98.
       [Figure: sampling distribution of sample means, X̄; standard normal distribution, z]
       P(X̄ > 8) = P(z > (X̄ − µ)/(σ/√n))
                 = P(z > (8 − 9.62)/(4.42/√5))
                 ≈ P(z > −0.82)
                 = normalCdf(−0.82, 100)
                 ≈ 0.7939
       Alternatively, we could just find the area under the sampling distribution of sample mean precipitations, X̄, with
       P(X̄ > 8) = normalCdf(X̄, ∞, µ_X̄, σ_X̄) = normalCdf(8, 10^10, 9.62, 4.42/√5) ≈ 0.7938
   (c)
       Find the precipitation amount from the distribution of precipitations that represents the 75th percentile.
       [Figure: normal distribution of precipitation, X]
       X = invnorm(percentile, µ, σ) = invnorm(0.75, 9.62, 4.42) ≈ 12.6 in.
       Alternatively, we first find the z value from the standard normal distribution that corresponds to the 75th percentile:
       z = invnorm(0.75) ≈ 0.6744897495
       Second, we solve z = (X − µ)/σ for X to get
       X = µ + z·σ
       Afterwards, replace µ with 9.62, z with 0.6744897495, and σ with 4.42 so that
       X = 9.62 + (0.6744897495)·(4.42) ≈ 12.6 in.

8. Some passengers died when a water taxi sank in Baltimore's inner harbor. Men are typically heavier than women and children, so when loading a water taxi, let's assume a worst-case scenario in which all passengers are men. Based on data from the National Health and Nutrition Survey, assume that weights of men are normally distributed with a mean of 172 lb. and a standard deviation of 29 lb.
   (a) Find the probability that if an individual man is randomly selected, his weight will be greater than 175 lb.
       µ = 172 lb. and σ = 29 lb. Let X = the continuous random variable (CRV) representing the weight of a randomly selected man.
       [Figure: distribution of men's weights, X; standard normal distribution, z]
       P(X > 175) = P(z > (X − µ)/σ)
                  = P(z > (175 − 172)/29)
                  ≈ P(z > 0.10)
                  = normalCdf(0.10, 100)
                  ≈ 0.4602
       Alternatively, we could just find the area under the distribution of weights, X, with
       P(X > 175) = normalCdf(X, ∞, µ, σ) = normalCdf(175, 10^10, 172, 29) ≈ 0.4588
   (b) Find the probability that 20 men will have a mean weight that is greater than 175 lb. (so that their total weight exceeds the safe capacity of 3500 lb.).
       Let X̄ = the continuous random variable (CRV) representing a randomly selected sample mean weight. Then µ_X̄ = 172 and σ_X̄ = 29/√20 ≈ 6.5.
       [Figure: sampling distribution of sample means, X̄]
       P(X̄ > 175) = P(z > (X̄ − µ)/(σ/√n))
                   = P(z > (175 − 172)/(29/√20))
                   ≈ P(z > 0.46)
                   = normalCdf(0.46, 100)
                   ≈ 0.3228
       Alternatively, we could just find the area under the sampling distribution of sample mean weights, X̄, with
       P(X̄ > 175) = normalCdf(X̄, ∞, µ_X̄, σ_X̄) = normalCdf(175, 10^10, 172, 29/√20) ≈ 0.3218

9. The average per capita spending on health care in the United States is $5274. The standard deviation is $600 and the distribution of health care spending is approximately normal. Find the limits of the middle 50% of individual health care expenditures.
   Let X be the continuous random variable (CRV) representing a randomly selected individual health care expenditure. Find the z scores from the standard normal distribution corresponding to the 25th and 75th percentiles. Then use
   X = µ + z·σ
   [Figure: standard normal distribution, z]
   z_0.75 = invnorm(0.25) ≈ −0.6744897495, and by symmetry, z_0.25 ≈ 0.6744897495.
   x_low = µ + z·σ = $5274 + (−0.6744897495)·($600) ≈ $4869.31
   x_high = µ + z·σ = $5274 + (0.6744897495)·($600) ≈ $5678.69
   Alternatively, we could bypass the standard normal distribution of z scores with:
   x_low = invnorm(percentile, µ, σ) = invnorm(0.25, 5274, 600) ≈ $4869.31
   x_high = invnorm(percentile, µ, σ) = invnorm(0.75, 5274, 600) ≈ $5678.69

10. A prestigious college decides to only take applications from students who have scored in the top 5% on the SAT test. The SAT scores are approximately normally distributed with a mean of 490 and a standard deviation of 70. Find the score that is necessary to obtain in order to qualify for applying to this college.
   Let X be the continuous random variable (CRV) representing SAT scores. A student must score at or above the 95th percentile in order to qualify. First, we find the z value from the standard normal distribution that corresponds to the 95th percentile:
   z = invnorm(0.95) ≈ 1.644853626
   Second, we solve z = (X − µ)/σ for X to get
   X = µ + z·σ
   Afterwards, replace µ with 490, z with 1.644853626, and σ with 70 so that
   X = 490 + (1.644853626)·(70) ≈ 605
   An alternate solution route is
   X = invnorm(percentile, µ, σ)
     = invnorm(0.95, 490, 70) ≈ 605

11. Americans each ate an average of 25.7 pounds of Krusty-O Cereal last year and spent an average of $61.50 per person doing so. If the standard deviation for consumption is 3.75 pounds and the standard deviation for the amount spent is $5.89, find the following:
   (a) The probability that the sample mean Krusty-O Cereal consumption for a random sample of 40 American consumers exceeded 27 pounds.
       Let X̄ = the continuous random variable (CRV) representing a randomly selected sample mean consumption amount. Then µ_X̄ = 25.7 lb. and σ_X̄ = 3.75/√40 ≈ 0.59 lb.
       [Figure: sampling distribution of sample means, X̄; standard normal distribution, z]
       P(X̄ > 27) = P(z > (X̄ − µ)/(σ/√n))
                  = P(z > (27 − 25.7)/(3.75/√40))
                  ≈ P(z > 2.19)
                  = normalCdf(2.19, 100)
                  ≈ 0.0143
       Alternatively, we could just find the area under the sampling distribution with
       P(X̄ > 27) = normalCdf(X̄, ∞, µ_X̄, σ_X̄) = normalCdf(27, 10^10, 25.7, 3.75/√40) ≈ 0.0142
   (b) The probability that, for a random sample of 50, the average yearly amount spent on Krusty-O Cereal was between $60.00 and $100.
       Let X̄ = the continuous random variable (CRV) representing a randomly selected sample mean yearly amount spent. Then µ_X̄ = $61.50 and σ_X̄ = 5.89/√50 ≈ $0.83.
       [Figure: sampling distribution of sample means, X̄; standard normal distribution, z]
       P(60 < X̄ < 100) = P((60 − 61.50)/(5.89/√50) < z < (100 − 61.50)/(5.89/√50))     (each bound standardized with z = (X̄ − µ)/(σ/√n))
                        ≈ P(−1.8 < z < 46.22)
                        = normalCdf(−1.8, 46.22)
                        ≈ 0.9641
       Alternatively, we could just find the area under the sampling distribution with
       P(60 < X̄ < 100) = normalCdf(60, 100, 61.50, 5.89/√50) ≈ 0.9641

12. Use the normal approximation of the binomial probability distribution to find the probabilities for the discrete random variable, X.
   (a) Find the probability that X is 19, assuming n = 40 and p = 0.5.
       We find that µ = n·p = 40·0.5 = 20, σ = √(npq) = √10, and both np ≥ 5 and nq ≥ 5. Thus,
       P(X = 19) = P(18.5 < X < 19.5)
                 = P((18.5 − 20)/√10 < z < (19.5 − 20)/√10)     (each bound standardized with z = (X − µ)/σ)
                 ≈ P(−0.47 < z < −0.16)
                 = normalCdf(−0.47, −0.16)
                 ≈ 0.1173
       [Figure: binomial probability distribution, X; standard normal distribution, z]
   (b) Find the probability that X is 3, assuming n = 25 and p = 0.4.
       We find that µ = n·p = 25·0.4 = 10, σ = √(npq) = √6, and both np ≥ 5 and nq ≥ 5. Thus,
       P(X = 3) = P(2.5 < X < 3.5)
                = P((2.5 − 10)/√6 < z < (3.5 − 10)/√6)
                ≈ P(−3.06 < z < −2.65)
                = normalCdf(−3.06, −2.65)
                ≈ 0.0029
       [Figure: binomial probability distribution, X; standard normal distribution, z]
   (c) Find the probability that X is at least 15, assuming n = 30 and p = 0.5.
       We find that µ = n·p = 30·0.5 = 15, σ = √(npq) = √7.5, and both np ≥ 5 and nq ≥ 5. Thus,
       P(X ≥ 15) = P(X > 14.5)
                 = P(z > (X − µ)/σ)
                 = P(z > (14.5 − 15)/√7.5)
                 ≈ P(z > −0.18)
                 = normalCdf(−0.18, 100)
                 ≈ 0.5714
       [Figure: binomial probability distribution, X; standard normal distribution, z]

13. Use the normal approximation of the binomial probability distribution to find the probabilities for the discrete random variable, X.
   (a) Find the probability that X is fewer than 5, assuming n = 300 and p = 0.07.
       We find that µ = n·p = 300·0.07 = 21, σ = √(npq) = √19.53, and both np ≥ 5 and nq ≥ 5. Thus,
       P(X < 5) = P(X < 4.5)
                = P(z < (X − µ)/σ)
                = P(z < (4.5 − 21)/√19.53)
                ≈ P(z < −3.73)
                = normalCdf(−100, −3.73)
                ≈ 0.0001
       [Figure: binomial probability distribution, X; standard normal distribution, z]
   (b) Find the probability that X is at most 8, assuming n = 100 and p = 0.13.
       We find that µ = n·p = 100·0.13 = 13, σ = √(npq) = √11.31, and both np ≥ 5 and nq ≥ 5. Thus,
       P(X ≤ 8) = P(X < 8.5)
                = P(z < (X − µ)/σ)
                = P(z < (8.5 − 13)/√11.31)
                ≈ P(z < −1.34)
                = normalCdf(−100, −1.34)
                ≈ 0.0901
       [Figure: binomial probability distribution, X; standard normal distribution, z]
   (c) Find the probability that X is more than 35, assuming n = 50 and p = 0.6.
       We find that µ = n·p = 50·0.6 = 30, σ = √(npq) = √12 = 2√3, and both np ≥ 5 and nq ≥ 5.
       Thus,
       P(X > 35) = P(X > 35.5)
                 = P(z > (X − µ)/σ)
                 = P(z > (35.5 − 30)/√12)
                 ≈ P(z > 1.59)
                 = normalCdf(1.59, 100)
                 ≈ 0.0559
       [Figure: binomial probability distribution, X; standard normal distribution, z]

14. Use the normal approximation of the binomial probability distribution. Two out of five adult smokers acquired the habit by age 14. If 400 smokers are randomly selected, find the probability that 170 or fewer acquired the habit by age 14.
   We determine that p = 2/5 = 0.4 and n = 400. Let X be the discrete random variable representing the number of smokers out of 400 who acquired the habit by age 14. Then, µ = n·p = 400·0.4 = 160, σ = √(npq) = √96 = 4√6, and both np ≥ 5 and nq ≥ 5. Thus,
   P(X ≤ 170) = P(X < 170.5)
              = P(z < (X − µ)/σ)
              = P(z < (170.5 − 160)/√96)
              ≈ P(z < 1.07)
              = normalCdf(−100, 1.07)
              ≈ 0.8577
   [Figure: binomial probability distribution, X; standard normal distribution, z]

15. Use the normal approximation of the binomial probability distribution. According to Mars (the candy company), 24% of M&Ms plain candies are blue. Assuming that the claimed blue M&Ms rate of 24% is correct, find the probability of randomly selecting 100 M&Ms and getting at most 20 that are blue.
   We determine that p = 0.24 and n = 100. Let X be the discrete random variable representing the number of blue M&Ms out of 100. Then, µ = n·p = 100·0.24 = 24, σ = √(npq) = √18.24, and both np ≥ 5 and nq ≥ 5. Thus,
   P(X ≤ 20) = P(X < 20.5)
             = P(z < (X − µ)/σ)
             = P(z < (20.5 − 24)/√18.24)
             ≈ P(z < −0.82)
             = normalCdf(−100, −0.82)
             ≈ 0.2061
   [Figure: binomial probability distribution, X; standard normal distribution, z]

16. Find the critical value z_(α/2) that corresponds to a 92% confidence interval.
   A 92% confidence interval means that 0.92 = 1 − α, so that α = 0.08. Then, α/2 = 0.08/2 = 0.04 and
   z_(α/2) = z_0.04 = invnorm(1 − α/2) = invnorm(0.96) ≈ 1.75

17. First-semester GPAs for a random selection of freshmen at a large university are shown below. Estimate the true mean GPA of the freshman class with 99% confidence.
   Assume σ = 0.62 and that the distribution of first-semester GPAs is normal.
       1.9  2.8  2.5  3.1  3.2  3.0  2.7  2.7  2.0  3.8  2.8  3.5
       2.9  2.7  3.2  3.8  2.7  2.0  3.0  3.9  3.3  1.9  3.8  2.7
   We find the sample mean is X̄ = 2.9125, and 99% confidence implies
   z_(α/2) = z_0.005 = invnorm(1 − 0.005) ≈ 2.58.
   Then,
   X̄ − z_(α/2)·σ/√n < µ < X̄ + z_(α/2)·σ/√n,
   or
   2.9125 − 2.58·(0.62/√24) < µ < 2.9125 + 2.58·(0.62/√24),
   or
   2.59 < µ < 3.24

18. Find the critical value t_(α/2) that corresponds to a 90% interval, assuming n = 10.
   A 90% confidence interval means that 0.90 = 1 − α, so that α = 0.10. Then, α/2 = 0.10/2 = 0.05 and
   t_(α/2) = t_0.05 = invT(1 − α/2, n − 1) = invT(0.95, 9) ≈ 1.833112 ≈ 1.83

19. The approximate costs (in thousands) for a 30-second spot for various cable networks in a random selection of cities are shown below. Estimate the true population mean cost for a 30-second advertisement on cable network with 90% confidence. Assume the population of costs is approximately normal.
       14  22  55  12  165  13  9  54  15  73  66  55  23  41  30  78  150
   We find the sample mean is X̄ ≈ 51.4705, the sample standard deviation is s ≈ 45.9839, and 90% confidence implies t_(α/2) = t_0.05 = invT(0.95, 16) ≈ 1.74588 ≈ 1.75. Then,
   X̄ − t_(α/2)·s/√n < µ < X̄ + t_(α/2)·s/√n,
   or
   51.4705 − 1.74588·(45.9839/√17) < µ < 51.4705 + 1.74588·(45.9839/√17),
   or
   32.0 < µ < 70.9

20. A university dean of students wishes to estimate the average number of hours students spend doing homework per week. The standard deviation from a previous study is 6.2 hours. How large a sample must be selected if he wants to be 99% confident that the sample mean is within 1.5 hours of the true mean?
   We are asked to determine the size, n, of a sample necessary for an interval estimate of the average weekly study amount. The formula is
   n = (z_(α/2)·σ / E)²
   We are told to assume σ = 6.2 and E = 1.5. If 99% confidence is desired, then 0.99 = 1 − α, or α = 0.01. This implies we should use z_(α/2) = z_0.005 = invnorm(0.995) ≈ 2.58. Then,
   n = (z_(α/2)·σ / E)² = ((2.58)·(6.2) / 1.5)² ≈ 113.7, which we round up to n = 114.

21. Thirty randomly selected students took the calculus final. If the sample mean was 95 and the standard deviation was 6.6, construct a 99% confidence interval for the mean score of all students.
   We are not given σ, so we use a t distribution. We are given n = 30, X̄ = 95 and s = 6.6. 99% confidence implies 0.99 = 1 − α, or α = 0.01, and
   t_(α/2) = t_0.005 = invT(0.995, 29) ≈ 2.75638
   Then,
   X̄ − t_(α/2)·s/√n < µ < X̄ + t_(α/2)·s/√n,
   or
   95 − 2.75638·(6.6/√30) < µ < 95 + 2.75638·(6.6/√30),
   or
   92 < µ < 98

22. A study of 35 golfers showed that their average score on a particular course was 92. The standard deviation of the population is 5. Find the 95% confidence interval of the mean score for all golfers.
   We are given σ = 5, so we use a z distribution. We are given n = 35 and X̄ = 92. 95% confidence implies 0.95 = 1 − α, or α = 0.05, and
   z_(α/2) = z_0.025 = invnorm(0.975) ≈ 1.96
   Then,
   X̄ − z_(α/2)·σ/√n < µ < X̄ + z_(α/2)·σ/√n,
   or
   92 − 1.96·(5/√35) < µ < 92 + 1.96·(5/√35),
   or
   90.3 < µ < 93.7

23. A recent study of 75 workers found that 53 people rode the bus to work each day. Find the 95% confidence interval of the proportion of all workers who rode the bus to work.
   We are given n = 75 and X = 53, so the sample proportion is p̂ = 53/75. 95% confidence implies 0.95 = 1 − α, or α = 0.05, and
   z_(α/2) = z_0.025 = invnorm(0.975) ≈ 1.96
   Then,
   p̂ − z_(α/2)·√(p̂·q̂/n) < p < p̂ + z_(α/2)·√(p̂·q̂/n),
   or
   53/75 − 1.96·√((53/75)·(1 − 53/75)/75) < p < 53/75 + 1.96·√((53/75)·(1 − 53/75)/75),
   or
   0.60 < p < 0.81

24. It is believed that 25% of U.S. homes have a direct satellite television receiver. How large a sample is necessary to estimate the true proportion of homes that do, with 95% confidence and within 3 percentage points? How large a sample is necessary if nothing is known about the proportion?
   n = p̂·q̂·(z_(α/2))² / E² = 0.25·0.75·(1.96)² / (0.03)² ≈ 801
   Assuming nothing is known about the proportion,
   n = (0.5)²·(z_(α/2))² / E² = (0.5)²·(1.96)² / (0.03)² ≈ 1068

25.
   A recent poll showed results from 2000 professionals who interview job applicants. 26% of them said the biggest interview turnoff is that the applicant did not make an effort to learn about the job or the company. A 95% confidence interval estimate was used and the margin of error was ±3 percentage points. Describe what is meant by the statement "the margin of error was ±3 percentage points."
   When using 26% to estimate the value of the population percentage, the maximum likely difference between 26% and the true population percentage is three percentage points, so the interval from 23% to 29% is likely to contain the true population percentage.
   [Figure: the sampling distribution of sample proportions, p̂]
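
The calculator functions normalCdf and invnorm used throughout this key can be checked without a calculator. The following is a minimal Python sketch, not part of the original key, that assumes Python 3.8 or later for the standard-library class statistics.NormalDist. The helper names normal_cdf, inv_norm, z_interval, and sample_size_for_mean are hypothetical names chosen for illustration. invT is not covered here because the standard library has no t distribution; a third-party routine such as scipy.stats.t.ppf could play that role.

```python
from math import ceil, sqrt
from statistics import NormalDist


def normal_cdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the normal curve N(mu, sigma) between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)


def inv_norm(area, mu=0.0, sigma=1.0):
    """Value of X (or z, by default) with the given area to its left."""
    return NormalDist(mu, sigma).inv_cdf(area)


def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for a mean when sigma is known."""
    z = inv_norm(1 - (1 - confidence) / 2)  # critical value z_(alpha/2)
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin


def sample_size_for_mean(sigma, error, confidence):
    """Smallest n with margin of error at most `error` (always rounded up)."""
    z = inv_norm(1 - (1 - confidence) / 2)  # unrounded, unlike the key's 2.58
    return ceil((z * sigma / error) ** 2)


if __name__ == "__main__":
    # Problem 1(a): the key reports 0.4660.
    print(round(normal_cdf(-0.19, 1.23), 4))
    # Problem 6(a), direct method with -10^10 standing in for -infinity:
    # the key reports 0.0453.
    print(round(normal_cdf(-1e10, 1, 3.2, 1.3), 4))
    # Problem 7(b), sampling distribution of the mean: the key reports 0.7938.
    print(round(normal_cdf(8, 1e10, 9.62, 4.42 / sqrt(5)), 4))
    # Problem 7(c), 75th percentile: the key reports 12.6 in.
    print(round(inv_norm(0.75, 9.62, 4.42), 1))
    # Problem 22: the key reports 90.3 < mu < 93.7.
    print(tuple(round(v, 1) for v in z_interval(92, 5, 35, 0.95)))
    # Problem 20: the key reports n = 114.
    print(sample_size_for_mean(6.2, 1.5, 0.99))
```

The sketch uses unrounded critical values, so results can differ in the last digit from answers in the key that were computed from z scores rounded to two decimal places.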