HCMC University of Technology
Dung Nguyen
Probability and Statistics
Confidence Intervals

Outline
1 Point estimation and Interval estimation
2 Confidence Intervals for Parameters of Normal Distribution
3 Confidence Intervals for Other Distributions
4 Summary

1 Point estimation and Interval estimation
Point estimation
Interval estimation

Population vs. Sample
A population is a collection of objects, items, humans, or animals about which information is sought.
A sample is a part of the population that is observed.
A parameter is a numerical characteristic of a population.
A statistic is a numerical function of the sampled data, used to estimate an unknown parameter.

Some characteristics of samples
Sample mean
Sample variance/standard deviation
Sample median
Sample interquartile range
Sample proportion

The Sample Proportion
The relative frequency estimate of p is k/n, where k is the number of successes observed in n trials.
The estimated value of p lies in [0, 1].

Example (1)
5023 Heads are observed on 10000 tosses.
The relative frequency estimate of p is 0.5023.
Is it possible that actually p = 0.5 instead?
Is it possible that actually p = 0.51?

Interval Estimates
An interval estimate estimates the value of p as being in an interval (a, b) or [a, b].

Example (2)
5023 Heads are observed on 10000 tosses. An interval estimate is of the form
0.4973 < p < 0.5073
0.5013 ≤ p ≤ 0.5033
The length of the interval is a crucial parameter of the estimate.

Confidence Interval
How sure are we that the unknown value of p actually is in the interval specified?
[0, 1]: 100% confident.
Smaller intervals: lesser degree of confidence.
"0.4973 < p < 0.5073" vs. "0.5013 ≤ p ≤ 0.5033".

Confidence Interval and Level
(X1, ..., Xn) is a random sample from a distribution depending on a parameter θ.
A confidence interval for θ: S1 ≤ θ ≤ S2, where S1 and S2 are computed from the sample data. They are called the lower- and upper-confidence limits.
The confidence level: $\gamma = P_\theta(S_1 \le \theta \le S_2)$.
Wider interval ⟺ higher confidence level.

Confidence level and Significance level
A confidence level (γ) is a measure of the degree of reliability of the interval.
A significance level (α) is the probability we allow ourselves to be wrong when we are estimating a parameter with a confidence interval.
$$\gamma + \alpha = 1$$

One-Sided Confidence Intervals
Definition. Let S1 be a statistic such that, for all values of θ, $P(S_1 < \theta) = \gamma$.
(S1, ∞) is called a left-sided 100γ percent CI for θ.
S1 is called a 100γ percent lower confidence limit for θ.

Definition. Let S2 be a statistic such that, for all values of θ, $P(\theta < S_2) = \gamma$.
(−∞, S2) is called a right-sided 100γ percent CI for θ.
S2 is called a 100γ percent upper confidence limit for θ.

Cases covered
X1, ..., Xn ∼ N(µ, σ²) are iid:
  Known σ
  Unknown σ
X1, ..., Xn are iid & n ≫ 1:
  Arbitrary distribution
  Bernoulli distribution -> Proportion

2 Confidence Intervals for Parameters of Normal Distribution
Normal Population + Known σ
Normal Population + Unknown σ

Normal Population + Known σ
Theorem. If X1, ..., Xn are iid ∼ N(µ, σ²) and $\hat{\mu}$ denotes the sample mean, then
$$\frac{\hat{\mu} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$

CI of population mean. If X1, ..., Xn are iid ∼ N(µ, σ²), then
$$\mu = \hat{\mu} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}.$$

Sample size. Let $\mathrm{MOE} = \frac{\sigma}{\sqrt{n}} \cdot z_{\alpha/2}$. Then
$$\mathrm{MOE} \le \epsilon_0 \iff n \ge \left(\frac{\sigma \cdot z_{\alpha/2}}{\epsilon_0}\right)^2.$$

Example 3 - Pit Stop
In auto racing, a pit stop is where a racing vehicle stops for new tires, fuel, repairs, and other mechanical adjustments. The efficiency of a pit crew that makes these adjustments can affect the outcome of a race. A random sample of 32 pit stop times has a sample mean of 12.9 seconds. Assume that the population distribution is normal and the population standard deviation is 0.19 second.
(a) Construct a 99% confidence interval for the mean pit stop time.
(b) How many observations must be collected to ensure that the radius of the 99% CI is at most 0.01?

Solution
(a) $12.9 \pm 2.58 \cdot \frac{0.19}{\sqrt{32}} = 12.9 \pm 0.087$.
(b) With the rounded value $z_{0.005} \approx 2.58$, $n \ge 2402.96$, so $n \ge 2403$. With the unrounded $z_{0.005} \approx 2.5758$, $n \ge 2395.198$, so $n \ge 2396$.
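
The two formulas used in Example 3 (known-σ interval and sample size) can be checked numerically. The following is a minimal sketch using only Python's standard library; the function names `z_interval` and `sample_size` are illustrative and not part of the slides. It uses the unrounded quantile $z_{0.005} \approx 2.5758$ rather than the table value 2.58, which is why it reports 2396 rather than 2403.

```python
import math
from statistics import NormalDist


def z_interval(mean, sigma, n, level):
    """Two-sided CI for the mean of a normal population with known sigma."""
    alpha = 1.0 - level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}
    moe = z * sigma / math.sqrt(n)               # margin of error
    return mean - moe, mean + moe


def sample_size(sigma, eps0, level):
    """Smallest integer n with z_{alpha/2} * sigma / sqrt(n) <= eps0."""
    alpha = 1.0 - level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.ceil((sigma * z / eps0) ** 2)


# Example 3: n = 32, sample mean 12.9 s, sigma = 0.19 s, 99% level.
lo, hi = z_interval(mean=12.9, sigma=0.19, n=32, level=0.99)
n_needed = sample_size(sigma=0.19, eps0=0.01, level=0.99)

print(f"99% CI: ({lo:.3f}, {hi:.3f})")
print("required n:", n_needed)
```

Rounding the required sample size up with `math.ceil` keeps the margin of error at or below the target; rounding down would slightly exceed it.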

One-Sided Confidence Interval (Normal Population + Known σ)
A γ upper-confidence bound (aka right-sided confidence interval) for µ is
$$\mu \le \hat{\mu} + z_{\alpha} \cdot \frac{\sigma}{\sqrt{n}}.$$
A γ lower-confidence bound (aka left-sided confidence interval) for µ is
$$\mu \ge \hat{\mu} - z_{\alpha} \cdot \frac{\sigma}{\sqrt{n}}.$$

Example 4 - Pit Stop
In auto racing, a pit stop is where a racing vehicle stops for new tires, fuel, repairs, and other mechanical adjustments. A random sample of 32 pit stop times has a sample mean of 12.9 seconds. Assume that the population distribution is normal and the population standard deviation is 0.19 second. Construct a right-sided 95% CI for the population mean.

One-sided vs Two-sided
[Figure: two-sided CI, with cut-offs at $-z_{\alpha/2}$ and $z_{\alpha/2}$ around $\hat{\mu}$, compared with a one-sided CI, with cut-off at $z_{\alpha}$]

Normal Population + Unknown σ
[Figure: Pdf of N(0, 1) and t(df), shown for t(15) and t(2)]

Theorem. If X1, ..., Xn are i.i.d. ∼ N(µ, σ²), then
$$\frac{\hat{\mu} - \mu}{s/\sqrt{n}} \sim t_{n-1} \quad\text{and}\quad \frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}.$$

CI of the population mean. If X1, ..., Xn are i.i.d. ∼ N(µ, σ²), then
$$\mu = \hat{\mu} \pm t_{n-1,\alpha/2} \cdot \frac{s}{\sqrt{n}}.$$

Example 5 - Tread Depth
11 randomly selected automobiles were stopped, and the tread depth of the right front tire was measured.
The sample mean was 0.32 inch, and the sample standard deviation was 0.08 inch. Find the 95% confidence interval of the mean depth. Assume that the variable is approximately normally distributed.

Solution
$$\mu = 0.32 \pm 2.228 \cdot \frac{0.08}{\sqrt{11}} \implies \mu = 0.32 \pm 0.05.$$

Example 6 - Point of inflammation of Diesel oil
Five independent measurements of the point of inflammation of Diesel oil gave the values (in °F)
144 147 146 144 142
Assuming normality, determine a 99% confidence interval for the mean.

Solution
Required values: $\hat{\mu} = 144.6$, $s = 1.949$, $t_{4,\,0.005} = 4.604$. Thus
$$\mu = 144.6 \pm 4.604 \cdot \frac{1.949}{\sqrt{5}} = 144.6 \pm 4.014.$$

CI of the population variance
Choose $c_1 < c_2$ so that the area in each tail of the $\chi^2_{n-1}$ distribution is α/2. Then the γ-confidence interval for the unknown variance σ² is
$$\frac{(n-1)s^2}{c_2} \le \sigma^2 \le \frac{(n-1)s^2}{c_1}.$$

One-sided bounds. Choose $c_1 < c_2$ so that the area in each tail of the $\chi^2_{n-1}$ distribution is α. The γ lower and upper confidence bounds on σ² are
$$\sigma^2 \ge \frac{(n-1)s^2}{c_2} \quad\text{and}\quad \sigma^2 \le \frac{(n-1)s^2}{c_1}.$$

Example 7
An automatic filling machine is used to fill bottles with liquid detergent. A random sample of 20 bottles results in a sample variance of fill volume of s² = 0.0153 (in squared volume units). Assume that the fill volume is approximately normal. Compute a 95% upper confidence bound on σ².

Solution
$$\sigma^2 \le \frac{(20-1) \cdot 0.0153}{10.117} = 0.0287, \quad\text{and}\quad \sigma \le 0.17.$$

3 Confidence Intervals for Other Distributions
Large Sample CIs for Population Means
Large-Sample CIs for Population Proportions

Large Sample Size
Theorem. If X1, ..., Xn are i.i.d. and n is large, then
$$\frac{\hat{\mu} - \mu}{s/\sqrt{n}} \approx \frac{\hat{\mu} - \mu}{\sigma/\sqrt{n}} \simeq N(0, 1).$$

CI of population mean - Large sample size. If X1, ..., Xn are i.i.d. and n is large, then
$$\mu \approx \hat{\mu} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}.$$
Moreover, if σ is unknown, then we estimate σ ≈ s and
$$\mu \approx \hat{\mu} \pm z_{\alpha/2} \cdot \frac{s}{\sqrt{n}}.$$

Example 8
A random sample of 110 lightning flashes in a region resulted in a sample average radar echo duration of 0.81 s and a sample standard deviation of 0.34 s. Calculate a 99% (two-sided) CI for the true average echo duration.

Example 9
A sample of fish was selected from Florida lakes, and mercury concentration in the muscle tissue was measured (ppm).
1.230 1.330 0.040 0.044 0.490 0.190 0.830 0.810 0.490
1.160 0.050 0.150 1.080 0.980 0.630 0.560 0.590 0.340
0.340 0.840 0.280 0.340 0.750 0.870 0.180 0.190 0.040
0.490 0.100 0.210 0.860 0.520 0.940 0.400 0.430 0.250
Find an approximate 95% CI on µ.

Solution
n = 36, $\hat{\mu} = 0.5284$, s² = 0.1361, s = 0.3690, $z_{0.025} = 1.96$. Then the CI is
$$0.5284 \pm 1.96 \cdot \frac{0.3690}{\sqrt{36}} = 0.5284 \pm 0.1205 = [0.4079, 0.6490].$$

Population Proportion
Corollary. Let X ∼ B(n, p) with q = 1 − p, and assume np ≥ 10, nq ≥ 10. Then
$$\frac{\hat{p} - p}{\sqrt{pq/n}} \simeq N(0, 1).$$

An approximate 100γ% CI for p is
$$p \approx \hat{p} \pm z_{\alpha/2} \cdot \frac{\sqrt{\hat{p}\hat{q}}}{\sqrt{n}}.$$
The approximate 100γ% lower and upper confidence bounds are
$$p \gtrsim \hat{p} - z_{\alpha} \cdot \frac{\sqrt{\hat{p}\hat{q}}}{\sqrt{n}} \quad\text{and}\quad p \lesssim \hat{p} + z_{\alpha} \cdot \frac{\sqrt{\hat{p}\hat{q}}}{\sqrt{n}},$$
respectively.

Example 10 - Population Proportion
An article reported that in n = 45 trials in a particular laboratory, 16 resulted in ignition of a particular type of substrate by a lighted cigarette. Let p denote the long-run proportion of all such trials that would result in ignition. Find a confidence interval for p with a confidence level of about 95%.

Solution
A point estimate for p is $\hat{p} = 16/45 \approx 0.36$. A confidence interval for p is
$$0.36 \pm 1.96 \sqrt{0.36 \cdot 0.64/45} = 0.36 \pm 0.14.$$
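
The large-sample intervals of Examples 9 and 10 can be reproduced with a short script. This is a minimal sketch using only Python's standard library; the names `large_sample_mean_ci` and `proportion_ci` are illustrative, not from the slides. It works with unrounded values of $z_{0.025}$ and $\hat{p}$, so the proportion interval differs slightly in the third decimal from the hand calculation, which rounds $\hat{p}$ to 0.36.

```python
import math
from statistics import NormalDist, mean, stdev

# Mercury concentrations (ppm) from Example 9.
mercury = [
    1.230, 1.330, 0.040, 0.044, 0.490, 0.190, 0.830, 0.810, 0.490,
    1.160, 0.050, 0.150, 1.080, 0.980, 0.630, 0.560, 0.590, 0.340,
    0.340, 0.840, 0.280, 0.340, 0.750, 0.870, 0.180, 0.190, 0.040,
    0.490, 0.100, 0.210, 0.860, 0.520, 0.940, 0.400, 0.430, 0.250,
]


def z_crit(level):
    """Two-sided critical value z_{alpha/2} for the given confidence level."""
    return NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)


def large_sample_mean_ci(data, level=0.95):
    """Approximate CI for the mean: mean +/- z * s / sqrt(n), n large."""
    n = len(data)
    moe = z_crit(level) * stdev(data) / math.sqrt(n)  # stdev uses n - 1
    return mean(data) - moe, mean(data) + moe


def proportion_ci(k, n, level=0.95):
    """Approximate CI for p: p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    p_hat = k / n
    moe = z_crit(level) * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - moe, p_hat + moe


mean_lo, mean_hi = large_sample_mean_ci(mercury)   # Example 9
prop_lo, prop_hi = proportion_ci(k=16, n=45)       # Example 10

print(f"mean:       [{mean_lo:.4f}, {mean_hi:.4f}]")
print(f"proportion: [{prop_lo:.4f}, {prop_hi:.4f}]")
```

Both functions rely on the normal approximation, so they are only appropriate under the conditions stated above (large n; np ≥ 10 and nq ≥ 10 for proportions).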

Find the sample size
Let $\mathrm{MOE} = z_{\alpha/2} \cdot \frac{\sqrt{\hat{p}\hat{q}}}{\sqrt{n}}$. Since $\hat{p}\hat{q} \le 0.25$ for every $\hat{p}$ in [0, 1],
$$\mathrm{MOE} \le \epsilon_0 \impliedby n \ge 0.25 \left(\frac{z_{\alpha/2}}{\epsilon_0}\right)^2.$$

Example
How many people do you need to survey so that the margin of error (95%) is plus or minus 3 percentage points? This means that 95% of the time, the survey estimate should be within 3 percentage points of the true answer.

4 Summary

Example 11 - z vs t
A random sample of 32 pit stop times has a sample mean of 12.9 seconds and a sample standard deviation of 0.20 seconds. Assume that the population distribution is normal and the population standard deviation is 0.19 second. Construct a CI.
1 $\mu = \hat{\mu} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$. (exact CI)
2 $\mu = \hat{\mu} \pm t_{n-1,\alpha/2} \cdot \frac{s}{\sqrt{n}}$. (exact CI)
3 $\mu = \hat{\mu} \pm z_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$. (approximate CI)

$z_{\alpha/2}$ vs $t_{\alpha/2}$
[Figure: pdfs of N(0, 1) and t(df) with the critical values $z_{\alpha/2}$ and $t_{\alpha/2}$ marked]

Example 12 - Which case?
x: 9.62 4.09 1.70 10.62 4.73 2.40 4.05 8.41 6.77 4.16
y: 9.18 4.70 2.57 0.22 1.82 0.82 3.98 6.06 0.24 0.21
1 $\mu = \hat{\mu} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$. (exact CI)
2 $\mu = \hat{\mu} \pm t_{n-1,\alpha/2} \cdot \frac{s}{\sqrt{n}}$. (exact CI)
3 $\mu = \hat{\mu} \pm z_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$. (approximate CI)
[Figure: plots of the x sample and the y sample]

Summary
X1, ..., Xn ∼ N(µ, σ²) are iid:
  Known σ: $\mu = \hat{\mu} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
  Unknown σ: $\mu = \hat{\mu} \pm t_{n-1,\alpha/2} \frac{s}{\sqrt{n}}$
X1, ..., Xn are iid & n ≫ 1:
  Arbitrary distribution: $\mu \approx \hat{\mu} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \approx \hat{\mu} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}$
  Bernoulli distribution: $p \approx \hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}\hat{q}/n}$
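
Two of the summary formulas can be exercised on numbers that appear earlier in the deck: the t-based interval on the Diesel oil data of Example 6, and the conservative sample-size bound for a proportion on the 3-percentage-point survey question. The following is a minimal sketch with hypothetical helper names (`t_interval`, `proportion_sample_size`). Python's standard library has no Student t quantile function, so the critical value $t_{4,\,0.005} = 4.604$ is passed in as the table value quoted in the Example 6 solution; this is an assumption of the sketch, not a computed quantity.

```python
import math
from statistics import NormalDist, mean, stdev


def t_interval(data, t_crit):
    """Center and margin of error of mean +/- t_crit * s / sqrt(n).

    t_crit must be looked up by the caller (t_{n-1, alpha/2} from a table).
    """
    n = len(data)
    moe = t_crit * stdev(data) / math.sqrt(n)  # stdev divides by n - 1
    return mean(data), moe


def proportion_sample_size(eps0, level=0.95):
    """Smallest n satisfying n >= 0.25 * (z_{alpha/2} / eps0)^2."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    return math.ceil(0.25 * (z / eps0) ** 2)


# Example 6: five measurements, 99% level, table value t_{4, 0.005} = 4.604.
center, moe = t_interval([144, 147, 146, 144, 142], t_crit=4.604)

# Survey question: 95% margin of error of at most 0.03.
n_survey = proportion_sample_size(0.03)

print(f"Example 6: {center:.1f} +/- {moe:.3f}")
print("survey sample size:", n_survey)
```

The sample-size bound uses the worst case $\hat{p}\hat{q} = 0.25$, so it is sufficient for any true proportion but may be larger than necessary when p is far from 0.5.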