SEC. 8.2: A SINGLE POPULATION MEAN USING THE NORMAL DISTRIBUTION

Definition: A point estimate is a statistic that provides an estimate of a population parameter (often the mean). Since this is a “best guess,” there is going to be some level of error. In this chapter, we’ll learn one method of statistical inference, confidence intervals, so that we can estimate the value of a parameter from a sample statistic.

THE IDEA OF A CONFIDENCE INTERVAL

The big idea: The sampling distribution of x̄ tells us how close to μ the sample mean x̄ is likely to be.

All confidence intervals we construct will have a form similar to this:

Sample mean (x̄) ± error bound (EBM)

To get a 90% confidence interval, we must include the central 90% of the probability of the normal distribution. If we include the central 90%, we leave out a total of α = 10% across both tails, or 5% in each tail, of the normal distribution.

The confidence level is the overall capture rate if the method is used many times. The sample mean will vary from sample to sample, but when we use the method estimate ± margin of error to get an interval based on each sample at a 95% confidence level, 95% of these intervals capture the unknown population mean μ.

α is the probability that the confidence interval does NOT contain μ.
C.L. + α = 1

INTERPRETING CONFIDENCE LEVEL AND CONFIDENCE INTERVALS

Confidence level: To say that we are 95% confident is shorthand for “95% of all possible samples of a given size from this population will result in an interval that captures the unknown parameter.”

Confidence interval: To interpret a C% confidence interval for an unknown parameter, say, “We are C% confident that the interval from _____ to _____ captures the actual value of the [population parameter in context].”

90% AND 95% CONFIDENCE LEVELS

Which would have a smaller (narrower) confidence interval? Why?
How would we find the bounds of the shaded region? Find percentiles.
For 90% confidence, the bounds are the 5th percentile and the 95th percentile.
For 95% confidence, the bounds are the 2.5th percentile and the 97.5th percentile.

EXAMPLE

Administrators at a school want to construct a 90% confidence interval for the length of time students spend on homework per week. It is known that the standard deviation for all high school students is 17 minutes. The school surveys 100 students at random and finds that they spend, on average, 154 minutes per week on homework.
a) Define x̄, σ, n.
b) In words, define X and X̄.
c) Sketch the graph of a 90% confidence interval and find the bounds.
d) Find the error bound for the mean, EBM (how far each end is from the mean).
e) Interpret the meaning of the confidence interval.

CALCULATING USING YOUR CALCULATOR

When we know the population standard deviation, we can use a calculator to find the confidence interval.
Go to STAT and scroll over to TESTS.
Choose 7:ZInterval.
If we have summary statistics (the sample mean) rather than raw data, highlight Stats and enter σ, x̄, n, and the desired confidence level.

ANOTHER EXAMPLE

The standard deviation of systolic blood pressure is known to be 9.3. A sample of 50 people is taken, and their average blood pressure is found to be 114.9 with a sample standard deviation of 8.
a) Define x̄, σ, n.
b) In words, define X and X̄.
c) Sketch the graph of a 95% confidence interval and find the confidence interval.
d) Find the error bound for the mean, EBM (how far each end is from the mean).

PRACTICE

Delays for a train are known to have a standard deviation of 3 minutes. Forty trains are randomly sampled and their delays recorded. The sample mean delay was 12 minutes with a sample standard deviation of 2.5 minutes.
a) Define x̄, σ, n.
b) In words, define X and X̄.
c) Sketch the graph of a 90% confidence interval and find the confidence interval.
d) Sketch the graph of a 95% confidence interval and find the confidence interval.
e) How are the two confidence intervals different? Which is larger? Why?
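For readers without a graphing calculator, the ZInterval computation can be sketched in Python. This is only an illustration and not part of the original slides: the function name `z_interval` is hypothetical, and the sketch assumes the usual error bound EBM = z · σ/√n with the population standard deviation σ known. The numbers are taken from the homework example (x̄ = 154, σ = 17, n = 100, 90% confidence).

```python
from math import sqrt
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence):
    """Return (lower, upper, ebm) for a z confidence interval with known sigma.

    Hypothetical helper for illustration; assumes EBM = z * sigma / sqrt(n).
    """
    alpha = 1 - confidence
    # z-score with area alpha/2 to its right under the standard normal curve
    z = NormalDist().inv_cdf(1 - alpha / 2)
    ebm = z * sigma / sqrt(n)
    return sample_mean - ebm, sample_mean + ebm, ebm


# Homework example: 100 students, mean 154 minutes, sigma = 17 minutes
lower, upper, ebm = z_interval(sample_mean=154, sigma=17, n=100, confidence=0.90)
print(f"EBM = {ebm:.2f}")                      # EBM = 2.80
print(f"90% CI = ({lower:.2f}, {upper:.2f})")  # 90% CI = (151.20, 156.80)
```

Changing `confidence` to 0.95 with the same inputs gives a wider interval, which is one way to check answers to the "which is larger?" questions.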
THE STANDARD NORMAL DISTRIBUTION (Z-DISTRIBUTION)

This is the same as the normal distribution we have seen before, but it has a mean of 0 and a standard deviation of 1, so the z-score matches the x-value.

ERROR BOUND FORMULA

The error bound formula for the population mean μ when the population standard deviation σ is known is:

$EBM = z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$

where n is the sample size and $z_{\alpha/2}$ is the z-score with the property that the area to the right of the z-score is $\alpha/2$.

What if we know what error bound we want, but not how many things we need to sample?

FINDING A SAMPLE SIZE FOR A PARTICULAR CONFIDENCE LEVEL

Use:

$n = \dfrac{z^2 \sigma^2}{EBM^2}$

where z is the same as $z_{\alpha/2}$. Always round n up to the next whole number.

Example: High school students who take the SAT Math exam a second time generally score higher than on their first try. Past data suggest that the score increase has a standard deviation of about 50 points. How large a sample of students would be needed to estimate the mean change in SAT score to within 2 points with 95% confidence?

PROPERTIES OF CONFIDENCE INTERVALS:

• The user chooses the confidence level, and the error bound follows from this choice.
• The error bound depends on the confidence level and the sampling distribution of the statistic.
• Greater confidence requires a larger error bound.
• The standard deviation of the statistic depends on the sample size n.

The margin of error gets smaller when:
✓ The confidence level decreases
✓ The sample size n increases

SEC. 8.3: A SINGLE POPULATION MEAN USING THE STUDENT T DISTRIBUTION

Yesterday we used the Z-distribution (normal) to find confidence intervals. To use this, we need to know the population standard deviation or have a large sample size (30 or more) so that the sample standard deviation approximates the population standard deviation.
What if we don’t have this?

THE STUDENT T-DISTRIBUTION

If we have a roughly normal distribution but don’t know σ, we can use the Student t-distribution to calculate confidence intervals.
The shape of the curve depends on n; the larger n is, the closer the curve is to the normal distribution.

DEGREES OF FREEDOM

When using the Student t-distribution, we use the degrees of freedom, n – 1, to characterize the shape.

EXAMPLE:

A study of commuting times reports the travel times to work of a random sample of 20 employed adults in New York State. The mean is 31.25 minutes and the standard deviation is 21.88 minutes.
a) x̄ = __________ sx = __________ n = __________ n – 1 = __________
b) Define the random variables X and X̄ in words.
c) Which distribution should you use for this problem?
d) Construct a 95% confidence interval for the population mean. State the confidence interval.
e) Sketch the graph.
f) Calculate the error bound.
g) Explain in a complete sentence what the confidence interval means.

ANOTHER EXAMPLE (FUEL EFFICIENCY)

Computers on cars track their fuel efficiency, or miles per gallon (mpg). Following are the mpg values for 20 different cars.

15.8 13.6 19.1 22.4 15.6 22.5 17.2 19.4 22.6 15.6
19.4 18.0 14.6 19.7 21.0 14.8 22.6 21.5 14.3 20.9

a) x̄ = ______ sx = _____ n = ______ n – 1 = ______
b) Define the random variables X and X̄ in words.
c) Which distribution should you use for this problem?
d) Construct a 95% confidence interval for the population mean. State the confidence interval.
e) Sketch the graph.
f) Calculate the error bound.
g) Explain in a complete sentence what the confidence interval means.

PRACTICE

The body temperature of 130 patients is recorded, and the mean of the sample is 98.429 with a standard deviation of 0.733.
a) x̄ = __________ sx = __________ n = __________ n – 1 = __________
b) Define the random variables X and X̄ in words.
c) Which distribution should you use for this problem?
d) Construct a 98% confidence interval for the population mean. State the confidence interval.
e) Sketch the graph.
f) Calculate the error bound.
g) Explain in a complete sentence what the confidence interval means.

SEC.
8.4: A POPULATION PROPORTION

During an election year, we see articles in the newspaper that state confidence intervals in terms of proportions or percentages.
The procedure to find the confidence interval, the sample size, the error bound, and the confidence level for a proportion is similar to that for the population mean, but the formulas are different.

HOW DO YOU KNOW YOU ARE DEALING WITH A PROPORTION PROBLEM?

THE UNDERLYING DISTRIBUTION IS A BINOMIAL DISTRIBUTION.

If X is a binomial random variable, then X ~ B(n, p), where n is the number of trials and p is the probability of a success.
To form a proportion, take X, the random variable for the number of successes, and divide it by n, the number of trials (or the sample size).
The random variable P′ (read "P prime") is that proportion: P′ = X/n.

p′ = x/n
p′ = the estimated proportion of successes (p′ is a point estimate for p, the true proportion)
x = the number of successes
n = the size of the sample

USING YOUR CALCULATOR

In the STAT menu under TESTS is the option 1-PropZInt (the one-proportion z-interval), which uses the normal distribution to estimate the confidence interval from sample data.
Enter:
x = number of successes (make sure it is a whole number)
n = number of trials
C-Level = desired confidence level

EXAMPLE: VOTERS

A random poll of 600 registered voters asked whether they had voted in the last election. Of these, 330 responded that they had voted. Using a 95% confidence level, compute a confidence interval estimate for the true proportion of people who voted in the last election and find the error bound.

x = 330
n = 600
C-Level = 0.95
Confidence interval = (0.51, 0.59)
p′ = 0.55 (the point estimate)
Error bound = 0.59 – 0.55 = 0.04

EXAMPLE: OWNING CARS

A poll of city residents finds that 87% of those asked own a car. The poll asked 990 residents. Create a 90% confidence interval for the true percentage of residents that own cars and find the error bound.
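The voters example can be checked without a calculator. The sketch below is an illustration rather than the calculator's actual routine: the function name `proportion_interval` is hypothetical, and it assumes the normal-approximation error bound EBP = z · √(p′q′/n) with q′ = 1 – p′.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(x, n, confidence):
    """Return (p_prime, ebp, lower, upper) for a one-proportion z-interval.

    Hypothetical helper; assumes EBP = z * sqrt(p' * q' / n).
    """
    p_prime = x / n        # point estimate of the true proportion p
    q_prime = 1 - p_prime  # estimated proportion of failures
    alpha = 1 - confidence
    # z-score with area alpha/2 to its right under the standard normal curve
    z = NormalDist().inv_cdf(1 - alpha / 2)
    ebp = z * sqrt(p_prime * q_prime / n)
    return p_prime, ebp, p_prime - ebp, p_prime + ebp


# Voters example: 330 of 600 polled voters said they had voted
p_prime, ebp, lower, upper = proportion_interval(x=330, n=600, confidence=0.95)
print(f"p' = {p_prime:.2f}")                   # p' = 0.55
print(f"EBP = {ebp:.2f}")                      # EBP = 0.04
print(f"95% CI = ({lower:.2f}, {upper:.2f})")  # 95% CI = (0.51, 0.59)
```

The rounded results agree with the interval and error bound stated in the voters example.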
ESTIMATING THE SAMPLE SIZE NEEDED

The formula:

$n = \dfrac{z^2 \, p' q'}{EBP^2}$

where z is $z_{\alpha/2}$ and q′ = 1 – p′, provides the number of participants needed to estimate the population proportion with confidence 1 – α and margin of error EBP. When no prior estimate of p′ is available, use p′ = q′ = 0.5, which gives the largest (most conservative) sample size.

Example: How many people should be sampled to determine, within 2% and with 95% confidence, the true proportion of voters who will vote for a particular candidate in the election?

This PowerPoint file is copyright 2011-2015, Rice University. All Rights Reserved.