155S7.5_3 Estimating a Population Variance
March 29, 2011
MAT 155 Statistical Analysis
Dr. Claude Moore
Cape Fear Community College
Key Concept
In this section we introduce the chi-square probability distribution so that we can construct confidence interval estimates of a population standard deviation or variance. We also present a method for determining the sample size required to estimate a population standard deviation or variance.
Chapter 7
Estimates and Sample Sizes
7-1 Review and Preview
7-2 Estimating a Population Proportion
7-3 Estimating a Population Mean: σ Known
7-4 Estimating a Population Mean: σ Not Known
7-5 Estimating a Population Variance
Excel program for chi-square critical values and confidence intervals (standard deviation and variance): http://cfcc.edu/faculty/cmoore/chi-squared.xls
Chi-Square Distribution
In a normally distributed population with variance σ², assume that we randomly select independent samples of size n and, for each sample, compute the sample variance s² (which is the square of the sample standard deviation s). The sample statistic χ² = (n – 1)s²/σ² (pronounced chi-square) has a sampling distribution called the chi-square distribution.
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$
where
n = sample size
s² = sample variance
σ² = population variance
degrees of freedom = n – 1
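The definition above can be checked with a small simulation. This is a minimal sketch, not part of the course material: the population mean of 100 and standard deviation of 15 are illustrative assumptions, and the function name is hypothetical. It draws many samples of size n = 10 from a normal population, computes (n – 1)s²/σ² for each, and reports the average of those statistics, which should be close to the degrees of freedom (the mean of a chi-square distribution equals its degrees of freedom).

```python
import random
import statistics

def simulate_chi_square_statistics(n, sigma, trials, seed=0, mu=100.0):
    """Return `trials` values of (n - 1) * s^2 / sigma^2 from normal samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    values = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        s_squared = statistics.variance(sample)  # sample variance, divisor n - 1
        values.append((n - 1) * s_squared / sigma ** 2)
    return values

# mu = 100 and sigma = 15 are illustrative assumptions, not course data.
chi_square_values = simulate_chi_square_statistics(n=10, sigma=15.0, trials=5000)
simulated_mean = statistics.fmean(chi_square_values)
print(f"mean of simulated statistics: {simulated_mean:.2f} (df = 9)")
print(f"smallest simulated statistic: {min(chi_square_values):.3f}")
```

Because s² is never negative, none of the simulated statistics can be negative either.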
Properties of the Distribution of the Chi-Square Statistic
1. The chi-square distribution is not symmetric, unlike the normal and Student t distributions. As the number of degrees of freedom increases, the distribution becomes more symmetric.
2. The values of chi-square can be zero or positive, but they cannot be negative.
3. The chi-square distribution is different for each number of degrees of freedom, which is df = n – 1. As the number of degrees of freedom increases, the chi-square distribution approaches a normal distribution.
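The trend toward symmetry can be quantified with standard results for the chi-square distribution that are not stated in these notes: its mean is df, its variance is 2·df, and its skewness is √(8/df). The sketch below tabulates these for a few example values of df; the function name is hypothetical.

```python
import math

def chi_square_summary(df):
    """Mean, standard deviation and skewness of a chi-square distribution."""
    return {
        "mean": float(df),
        "sd": math.sqrt(2.0 * df),
        "skewness": math.sqrt(8.0 / df),  # 0 would mean perfectly symmetric
    }

for df in (10, 20, 100):
    summary = chi_square_summary(df)
    print(f"df = {df}: mean = {summary['mean']:.0f}, "
          f"sd = {summary['sd']:.2f}, skewness = {summary['skewness']:.3f}")
```

The skewness shrinks toward 0 as df grows, which is the sense in which the distribution becomes more symmetric.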
In Table A-4, each critical value of χ² corresponds to an area given in the top row of the table, and that area represents the cumulative area located to the right of the critical value.
[Figures: Chi-Square Distribution; Chi-Square Distribution for df = 10 and df = 20]
Example: Critical Values of Chi-Square
A simple random sample of ten voltage levels is obtained. Construction of a confidence interval for the population standard deviation σ requires the left and right critical values of χ² corresponding to a confidence level of 95% and a sample size of n = 10. Find the critical value of χ² separating an area of 0.025 in the left tail, and find the critical value of χ² separating an area of 0.025 in the right tail.
With df = n – 1 = 9, Table A-4 gives a left critical value of 2.700 (area 0.975 to its right) and a right critical value of 19.023 (area 0.025 to its right). For a sample of 10 values taken from a normally distributed population, the chi-square statistic χ² = (n – 1)s²/σ² has a 0.95 probability of falling between the chi-square critical values of 2.700 and 19.023.
Instead of using Table A-4, technology (such as STATDISK, Excel, and Minitab) can be used to find critical values of χ². A major advantage of technology is that it can be used for any number of degrees of freedom and any confidence level, not just the limited choices included in Table A-4.
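As one illustration of the technology route, the following is a minimal Python sketch that uses only the standard library. It is not the method used by STATDISK, Excel, or Minitab: the helper names `chi2_cdf` and `chi2_critical` are hypothetical, the cumulative area is computed from a series for the incomplete gamma function, and the critical value is found by bisection. In practice a statistics library would normally be used instead.

```python
import math

def chi2_cdf(x, df):
    """Area to the LEFT of x under the chi-square curve with df degrees of freedom."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    # Series for the lower incomplete gamma function:
    # gamma(a, z) = z^a * e^(-z) * sum over k of z^k / (a (a+1) ... (a+k))
    term = 1.0 / a
    total = term
    k = 0
    while term > 1e-15 * total:
        k += 1
        term *= z / (a + k)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_critical(area_right, df):
    """Critical value with `area_right` to its right (the Table A-4 convention)."""
    if not 0.0 < area_right < 1.0:
        raise ValueError("area_right must be strictly between 0 and 1")
    target = 1.0 - area_right  # required area to the left
    low, high = 0.0, df + 20.0 * math.sqrt(2.0 * df) + 20.0
    for _ in range(200):  # bisection on the increasing CDF
        middle = (low + high) / 2.0
        if chi2_cdf(middle, df) < target:
            low = middle
        else:
            high = middle
    return (low + high) / 2.0

df = 10 - 1
chi2_left = chi2_critical(0.975, df)   # 0.025 in the left tail = 0.975 to the right
chi2_right = chi2_critical(0.025, df)  # 0.025 in the right tail
print(f"chi2_L = {chi2_left:.3f}, chi2_R = {chi2_right:.3f}")
```

Because the function takes the area to the right, the left critical value for 95% confidence is requested with 0.975, matching how Table A-4 is read.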
Estimators of σ²
The sample variance s² is the best point estimate of the population variance σ².
Estimators of σ
The sample standard deviation s is a commonly used point estimate of σ (even though it is a biased estimate).
Confidence Interval for Estimating a Population Standard Deviation or Variance
σ = population standard deviation
σ² = population variance
s = sample standard deviation
s² = sample variance
n = number of sample values
E = margin of error
$\chi^2_L$ = left-tailed critical value of χ²
$\chi^2_R$ = right-tailed critical value of χ²
Requirements:
1. The sample is a simple random sample.
2. The population must have normally distributed values (even if the sample is large).
Confidence Interval for the Population Variance σ²
$$\frac{(n-1)s^2}{\chi^2_R} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_L}$$
Confidence Interval for the Population Standard Deviation σ
$$\sqrt{\frac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_L}}$$
Procedure for Constructing a Confidence Interval for σ or σ²
1. Verify that the required assumptions are satisfied.
2. Using n – 1 degrees of freedom, refer to Table A-4 or use technology to find the critical values $\chi^2_L$ and $\chi^2_R$ that correspond to the desired confidence level.
3. Evaluate the upper and lower confidence interval limits using this format of the confidence interval:
$$\frac{(n-1)s^2}{\chi^2_R} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_L}$$
4. If a confidence interval estimate of σ is desired, take the square root of the upper and lower confidence interval limits and change σ² to σ.
5. Round the resulting confidence interval limits. If using the original set of data to construct a confidence interval, round the confidence interval limits to one more decimal place than is used for the original set of data. If using the sample standard deviation or variance, round the confidence interval limits to the same number of decimal places.
Confidence Intervals for Comparing Data
Caution: Confidence intervals can be used informally to compare the variation in different data sets, but the overlapping of confidence intervals should not be used for making formal and final conclusions about equality of variances or standard deviations.
Example:
The proper operation of typical home appliances requires voltage levels that do not vary much. Listed below are ten voltage levels (in volts) recorded in the author’s home on ten different days. These ten values have a standard deviation of s = 0.15 volt. Use the sample data to construct a 95% confidence interval estimate of the standard deviation of all voltage levels.
123.3 123.5 123.7 123.4 123.6 123.5 123.5 123.4 123.6 123.8
Requirements are satisfied: simple random sample and normality.
n = 10, so df = 10 – 1 = 9. Use Table A-4 to find:
$\chi^2_L = 2.700$ and $\chi^2_R = 19.023$
Construct the confidence interval with n = 10 and s = 0.15:
$$\frac{(10-1)(0.15)^2}{19.023} < \sigma^2 < \frac{(10-1)(0.15)^2}{2.700}$$
Evaluating the preceding expression yields:
$$0.010645 < \sigma^2 < 0.075000$$
Finding the square root of each part (before rounding), then rounding to two decimal places, yields this 95% confidence interval estimate of the population standard deviation:
$$0.10 \text{ volt} < \sigma < 0.27 \text{ volt}$$
Based on this result, we have 95% confidence that the limits of 0.10 volt and 0.27 volt contain the true value of σ.
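The calculation in this example can be sketched in Python. The function name `sigma_confidence_interval` is hypothetical; the inputs are the values from the example (n = 10, s = 0.15) and the Table A-4 critical values 2.700 and 19.023.

```python
import math

def sigma_confidence_interval(n, s, chi2_left, chi2_right):
    """Confidence interval for sigma from n, s and the two critical values."""
    numerator = (n - 1) * s ** 2
    variance_low = numerator / chi2_right   # larger critical value gives the lower limit
    variance_high = numerator / chi2_left   # smaller critical value gives the upper limit
    # Take square roots before rounding, as in step 4 of the procedure.
    return math.sqrt(variance_low), math.sqrt(variance_high)

low, high = sigma_confidence_interval(n=10, s=0.15, chi2_left=2.700, chi2_right=19.023)
print(f"{low:.2f} volt < sigma < {high:.2f} volt")
```

Note that the right-tailed critical value goes in the denominator of the lower limit: dividing by the larger number produces the smaller bound.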
Determining Sample Sizes
The procedures for finding the sample size necessary to estimate σ² are much more complex than the procedures given earlier for means and proportions. Instead of using very complicated procedures, we will use Table 7-2.
STATDISK also provides sample sizes. With STATDISK, select Analysis, Sample Size Determination, and then Estimate St Dev.
Minitab, Excel, and the TI-83/84 Plus calculator do not provide such sample sizes.
Example: We want to estimate the standard deviation σ of all voltage levels in a home. We want to be 95% confident that our estimate is within 20% of the true value of σ. How large should the sample be? Assume that the population is normally distributed.
From Table 7-2, we can see that 95% confidence and an error of 20% for σ correspond to a sample of size 48. We should obtain a simple random sample of 48 voltage levels from the population of voltage levels.
Recap
In this section we have discussed:
• The chi-square distribution.
• Using Table A-4.
• Confidence intervals for the population variance and standard deviation.
• Determining sample sizes.
384/6. Find the critical values $\chi^2_L$ and $\chi^2_R$ that correspond to the given confidence level and sample size: 95%; n = 20
384/8. Find the critical values $\chi^2_L$ and $\chi^2_R$ that correspond to the given confidence level and sample size: 90%; n = 51
You may use the Excel program to find Critical Values or Confidence Intervals for the Chi-Squared Distribution. It is available at http://cfcc.edu/faculty/cmoore/ChiSquared.xls
Use the given confidence level and sample data to find a confidence interval for the population standard deviation σ. In each case, assume that a simple random sample has been selected from a population that has a normal distribution.
384/10. Speeds of Drivers Ticketed in a 65 mi/h Zone on the Massachusetts Turnpike: 95% confidence; n = 25, x̄ = 81.0 mi/h, s = 2.3 mi/h.
384/12. Reaction Times of NASCAR Drivers: 99% confidence; n = 8, x̄ = 1.24 sec, s = 0.12 sec.
Assume that each sample is a simple random sample obtained from a normally distributed population. Use Table 7-2 on page 382 to find the indicated sample size.
384/14. Find the minimum sample size needed to be 95% confident that the sample standard deviation s is within 30% of σ. Is this sample size practical in most applications?
Sample size is n = 21.
384/16. Find the minimum sample size needed to be 95% confident that the sample variance is within 20% of the population variance.
Sample size is n = 211.
385/20. Pulse Rates of Men and Women Data Set 1 in Appendix B includes 40 pulse rates of men, and those pulse rates have a mean of 69.4 beats per minute and a standard deviation of 11.3 beats per minute. That data set also includes 40 pulse rates of women, and those pulse rates have a mean of 76.3 beats per minute and a standard deviation of 12.5 beats per minute.
a. Construct a 99% confidence interval estimate of the standard deviation of the pulse rates of men.
b. Construct a 99% confidence interval estimate of the standard deviation of the pulse rates of women.
c. Compare the variation of the pulse rates of men and women. Does there appear to be a difference?
(a), (b) [Excel output not reproduced.]
(c) The two intervals are similar. No, there does not appear to be a difference in the variation of pulse rates of men and women.
385/22. Designing Theater Seats In the course of designing theater seats, the sitting heights (in mm) of a simple random sample of adult women are obtained, and the results are listed below (based on anthropometric survey data from Gordon, Churchill, et al.). Use the sample data to construct a 95% confidence interval estimate of σ, the standard deviation of sitting heights of all women. Does the confidence interval contain the value of 35 mm, which is believed to be the standard deviation of sitting heights of women?
849 807 821 859 864 877 772 848 802 807 887 815
If you work this by formula, use the TI calculator to find the standard deviation, s = 34.981. Find the critical values for the 95% confidence level. Use the appropriate formula.
You may also use the Excel program available at http://cfcc.edu/faculty/cmoore/chi-squared.xls
Yes, the interval contains the traditionally believed value of 35 mm.
386/24. Comparing Waiting Lines
a. The listed values are waiting times (in minutes) of customers at the Jefferson Valley Bank, where customers enter a single waiting line that feeds three teller windows. Construct a 95% confidence interval for the population standard deviation σ.
6.5 6.6 6.7 6.8 7.1 7.3 7.4 7.7 7.7 7.7
b. The listed values are waiting times (in minutes) of customers at the Bank of Providence, where customers may enter any one of three different lines that have formed at three teller windows. Construct a 95% confidence interval for the population standard deviation σ.
4.2 5.4 5.8 6.2 6.7 7.7 7.7 8.5 9.3 10.0
c. Interpret the results found in parts (a) and (b). Do the confidence intervals suggest a difference in the variation among waiting times? Which arrangement seems better: the single-line system or the multiple-line system?
The variation is considerably higher in part (b). Yes; since the intervals do not overlap, there appears to be a significant difference in the variability of the two systems. The single-line system in part (a) is better for the customers because it eliminates the long wait endured by some customers when one of the lines is slow.
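The comparison in parts (a) and (b) can be reproduced with a short Python sketch instead of the Excel program. The function and variable names are hypothetical. Both samples have n = 10, so both use df = 9 and the critical values 2.700 and 19.023 from the earlier voltage example.

```python
import math
import statistics

CHI2_LEFT, CHI2_RIGHT = 2.700, 19.023  # 95% confidence, df = 9

def sigma_interval_from_data(data, chi2_left, chi2_right):
    """Confidence interval for sigma computed directly from the sample values."""
    n = len(data)
    numerator = (n - 1) * statistics.variance(data)  # (n - 1) * s^2
    return math.sqrt(numerator / chi2_right), math.sqrt(numerator / chi2_left)

single_line = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
multiple_lines = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]

single_ci = sigma_interval_from_data(single_line, CHI2_LEFT, CHI2_RIGHT)
multiple_ci = sigma_interval_from_data(multiple_lines, CHI2_LEFT, CHI2_RIGHT)

# Two intervals overlap when each one starts before the other one ends.
intervals_overlap = single_ci[0] <= multiple_ci[1] and multiple_ci[0] <= single_ci[1]

print(f"(a) {single_ci[0]:.2f} min < sigma < {single_ci[1]:.2f} min")
print(f"(b) {multiple_ci[0]:.2f} min < sigma < {multiple_ci[1]:.2f} min")
print("intervals overlap:", intervals_overlap)
```

The limits are rounded to two decimal places, one more than the original data, following step 5 of the procedure.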
The confidence intervals in parts (a) and (b) were generated using the Excel program available at http://cfcc.edu/faculty/cmoore/chi-squared.xls