Last Name___________________ First Name _________________Class Time________

Chapter 8: Confidence Intervals

Parameters are values calculated using data from the entire population. As a result, it is RARE to be able to calculate a parameter. When we don't know the value of a parameter, we need an approach to estimate it. The best single-number estimate for a parameter is its corresponding sample statistic. We can ALWAYS calculate values for statistics using sample data.

Parameter                        Symbol    Statistic (best single point estimate)    Symbol
Population Mean                  µ         Sample Mean                               $\bar{x}$
Population Standard Deviation    σ         Sample Standard Deviation                 s
Population Proportion            p         Sample Proportion                         p'

Note: ALL of the symbols for both population parameters and sample statistics are lower-case letters. Each one stands for a single number. Upper-case letters are used to represent random variables.

But, rather than hope that the single value of a sample statistic is a good estimate of a population parameter, it is safer to establish a range of values to estimate a population parameter. This range of values is called a confidence interval for the parameter.

A CONFIDENCE INTERVAL IS A RANGE OF VALUES THAT IS LIKELY TO CONTAIN THE TRUE VALUE OF THE POPULATION PARAMETER.
THE LEVEL OF CONFIDENCE IS THE PROBABILITY 1 – α (OR, EXPRESSED AS A PERCENT, 100(1 – α)%) THAT A CONFIDENCE INTERVAL CONTAINS THE POPULATION PARAMETER.

We calculate CONFIDENCE INTERVALS ONLY for PARAMETERS, the values that we can’t calculate because we don’t have access to all of the data. In particular, in this course, we will only calculate confidence intervals for 2 parameters: population means and population proportions.

A Confidence Interval (CI) consists of a range of numerical values which we believe will include the true value of a population parameter with a specified level of confidence.
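The statement that a confidence interval contains the parameter with probability 1 – α can be checked by simulation. The sketch below is only an illustration under stated assumptions: the population values (mean 50, standard deviation 10), the sample size 25, and the function name `coverage_rate` are hypothetical and do not come from this handout. It uses the known-σ interval for a mean that this chapter develops in Case II, and Python's `statistics.NormalDist().inv_cdf` stands in for the calculator's invnorm.

```python
import random
from statistics import NormalDist, mean


def coverage_rate(mu, sigma, n, confidence_level, trials, seed=0):
    """Return the fraction of simulated intervals that contain mu.

    Each interval is (x_bar - EBM, x_bar + EBM) with sigma treated as known.
    All argument values used below are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)
    alpha = 1 - confidence_level
    # Upper cut-off of the middle area; same role as invnorm(1 - alpha/2, 0, 1).
    z = NormalDist().inv_cdf(1 - alpha / 2)
    ebm = z * sigma / n ** 0.5  # error bound for the mean
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - ebm < mu < x_bar + ebm:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    rate = coverage_rate(mu=50.0, sigma=10.0, n=25,
                         confidence_level=0.90, trials=2000)
    print(f"Share of simulated intervals that contain mu: {rate:.3f}")
```

With many trials, the printed share should settle near the chosen confidence level, although any single run will differ slightly because the samples are random.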
The confidence interval for a parameter is:
(Best Point Estimate – Error Bound, Best Point Estimate + Error Bound)

The confidence interval for µ is ($\bar{x}$ – EBM, $\bar{x}$ + EBM).
The confidence interval for p is (p' – EBP, p' + EBP).

In Chapter 8, we will construct confidence intervals for parameters in three cases:
Case I. For the population proportion p.
Case II. For the population mean µ when the population standard deviation σ is known.
Case III. For the population mean µ when the population standard deviation σ is unknown.

Confidence Intervals are created ONLY for population parameters such as µ or p. Typically, we do not know the value of a population parameter because it is almost always impossible to obtain accurate data from an entire population.

Confidence Intervals are NEVER created for sample statistics such as $\bar{x}$ or p'. We don’t need a confidence interval for a statistic. We can always calculate the exact value of a sample statistic by using sample data.

Case I: Construct A Confidence Interval For p:

We can construct a confidence interval for the population proportion, p, by building on our knowledge of the random variable P' for sample proportions. The random variable P' is the basis for constructing a confidence interval for p.

p' = the sample proportion = x / n = (number of successes in sample) / (total number in sample)

RECALL THAT: X is a binomial random variable with X ~ B(n, p). But if n is large enough, then we can use a normal distribution to approximate the distribution of X (with q = 1 – p):
$X \sim N\left(np, \sqrt{npq}\right)$
We can divide by n and use algebra to find the distribution of P':
$P' \sim N\left(p, \sqrt{\frac{pq}{n}}\right)$
But we don’t know p. Our best point estimate for p is p'. So we use:
$P' \sim N\left(p', \sqrt{\frac{p'q'}{n}}\right)$

Confidence Interval for a Population Proportion
We want to estimate the population proportion p.
The confidence interval for p is:
(Best Point Estimate – Error Bound For The Proportion, Best Point Estimate + Error Bound For The Proportion) = (p' – EBP, p' + EBP).
(The Error Bound For The Proportion is also called the Margin Of Error.)
(The confidence interval for the population proportion is based on the binomial probability of success.)

Error Bound For The Proportion: $EBP = z_{\alpha/2}\sqrt{\frac{p'q'}{n}}$, where q' = 1 – p' and n = sample size.

$\sqrt{\frac{p'q'}{n}}$ = the standard deviation for the sample proportion P'.

$z_{\alpha/2}$ = the upper Z value that bounds the middle area equal to the confidence level (1 – α).
OR
$z_{\alpha/2}$ is the lower cut-off value of the top α/2 area of the Z distribution.

Confidence Level = 1 – α (In Chapter 9, we focus on α, the level of significance.)

Confidence Interval for p = (p' – EBP, p' + EBP) = $\left(p' - z_{\alpha/2}\sqrt{\frac{p'q'}{n}},\; p' + z_{\alpha/2}\sqrt{\frac{p'q'}{n}}\right)$

Case II: Construct A Confidence Interval For µ When σ Is Known:

We can construct a confidence interval for the population average µ, when we know the population standard deviation σ, by building on our knowledge of the distribution of the random variable $\bar{X}$ for sample averages. The random variable $\bar{X}$ is the basis for constructing a confidence interval for µ when σ is known.

From Chapter 7, we know that if the sample is large enough, then
$\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$
But we don’t know µ. Our best point estimate for µ is $\bar{x}$. So we use:
$\bar{X} \sim N\left(\bar{x}, \frac{\sigma}{\sqrt{n}}\right)$

Confidence Interval For The Population Mean when σ is known:
We want to estimate the population average µ and we know the population standard deviation σ.
The confidence interval for µ is:
(Best Point Estimate – Error Bound For The Mean, Best Point Estimate + Error Bound For The Mean) = ($\bar{x}$ – EBM, $\bar{x}$ + EBM).
(The Error Bound For The Mean is also called the Margin Of Error.)

$\bar{x}$ = the average of data from the sample = (∑ x’s from the sample) / n

Error Bound For The Mean: $EBM = z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}$, where n = sample size.
$\frac{\sigma}{\sqrt{n}}$ = the standard deviation (or standard error) of $\bar{X}$.

$z_{\alpha/2}$ = the upper Z value that bounds the middle area under the Z distribution equal to the confidence level (CL).
OR
$z_{\alpha/2}$ is the lower cut-off value of the top α/2 area of the Z distribution.

Confidence Level = 1 – α (In Chapter 9, we will focus on α, the level of significance.)

Confidence Interval for µ = ($\bar{x}$ – EBM, $\bar{x}$ + EBM) = $\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$

Case III: Construct A Confidence Interval For µ When σ Is Unknown:

We can construct a confidence interval for the population average µ, when we do not know the population standard deviation σ, by introducing a new distribution, called the t-distribution, for the random variable $\bar{X}$ for sample averages.

The t-distribution (or Student-t distribution) has several specific characteristics:
The mean value is zero (like the standard normal random variable, Z).
The t random variable can have any value between –∞ and +∞ (like Z).
The distribution is bell-shaped and symmetric about the value zero on the horizontal axis (like Z).
The t-distribution has a lower central peak and higher tails than the standard normal distribution.
The parameter of the t-distribution is its degrees of freedom (df = n – 1).
The larger the sample size, the larger the degrees of freedom and the more the particular t-distribution will be like the Z distribution.

Confidence Interval for a Mean when σ is NOT known
We want to estimate the population average µ and we do not know the population standard deviation σ.
We use the sample standard deviation, s, to estimate the population standard deviation σ.
The confidence interval for µ is:
(Best Point Estimate – Error Bound For The Mean, Best Point Estimate + Error Bound For The Mean) = ($\bar{x}$ – EBM, $\bar{x}$ + EBM).
(The Error Bound For The Mean is also called the Margin Of Error.)
$\bar{x}$ = the average of data from the sample = (∑ x’s from the sample) / n

Error Bound For The Mean: $EBM = t_{\alpha/2}\cdot\frac{s}{\sqrt{n}}$, where n = sample size.

$\frac{s}{\sqrt{n}}$ = the estimated standard deviation (or standard error) of $\bar{X}$.

$t_{\alpha/2}$ = the upper t value that bounds the middle area equal to the confidence level under the Student t-distribution with n – 1 degrees of freedom (df).
OR
$t_{\alpha/2}$ is the lower cut-off value of the top α/2 area of the t-distribution.

Confidence Level = 1 – α

Confidence Interval for µ = ($\bar{x}$ – EBM, $\bar{x}$ + EBM) = $\left(\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}},\; \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}\right)$

***********Interpreting the Confidence Interval************

For a proportion, p: 2 types of interpretation:

First Interpretation (2 ways):
We are ______% confident that the true proportion of the population (describe the population parameter in the problem) is between _____________ and _______________.
OR
We estimate with _____% confidence that between ___________% and ___________% of the population are successes (describe the population parameter and what a success is in the problem).

Second Interpretation:
If we calculate confidence intervals based on repeated sampling of size ___ (fill in the value of n) in the same way, then we expect that ________% of the confidence intervals calculated will contain the true population proportion (describe the population parameter in the problem).

For a mean, µ: 2 types of interpretation:

First Interpretation (2 ways):
We are _____% confident that the true population average (or mean) (describe the population parameter in the situation of this problem) is between _______ and _______ (include the units).
OR
We estimate with _____% confidence that the true population average (or mean) (describe the population parameter in the situation of this problem) is between ___________ and ___________ (include the units).

Second Interpretation:
If we calculate confidence intervals based on repeated sampling of size ___ (fill in the value of n) in the same way, then we expect that ________% of the confidence intervals calculated will contain the true population mean (describe the population parameter in the problem).

Visual Look At The Second Interpretation:
For example, a confidence level of 90% means that, on average, 90% of all possible confidence intervals for a population mean µ, constructed in the same way from repeated samples, are expected to contain the true population mean, µ.
In the following diagram, the ten confidence intervals, constructed by using ten random samples of the same size, n, all have the same width. We expect, on average, that 9 out of 10 such confidence intervals will contain the true population mean, µ.
[Diagram: ten confidence intervals of equal width, compared against the true population mean µ]

Finding the Point Estimate and Error Bound if we know the Confidence Interval:
If we know that the confidence interval is (lower bound, upper bound), then:
• Point Estimate = (lower bound + upper bound) / 2, the average of the upper and lower bounds.
• Error Bound = Margin of Error = (upper bound – lower bound) / 2, or half of the confidence interval length.

What does it mean to be CL% confident?
If we were to take repeated samples and calculate many confidence interval estimates based on those samples, then we would expect that CL% of the confidence interval estimates would be “good estimates” that would enclose (capture) the true value of the population parameter we are trying to estimate.
If we were to take repeated samples and calculate many confidence interval estimates based on those samples, then we would expect that 100% − CL% of the confidence interval estimates would be “bad estimates” that would NOT enclose (capture) the true value of the population parameter we are trying to estimate.

Question for discussion: Can you ever know which confidence intervals actually do contain the true population parameter and which intervals do not?

Note that the confidence interval is about population proportions or population averages, parameters that are calculated from all of the data in the population. The confidence interval is not about individual data values. It does not mean that CL% of the data lie within the confidence interval.

To find the $z_{\alpha/2}$ that puts the area equal to the confidence level “in the middle”:
CL is the area in the middle.
α = 1 − CL is the combined area of both tails.
α/2 is the area of one tail.
To find $z_{\alpha/2}$: $z_{\alpha/2}$ = invnorm(1 − α/2, 0, 1) OR $z_{\alpha/2}$ = −invnorm(α/2, 0, 1). ($z_{\alpha/2}$ is always positive.)

EXAMPLE 1: CONFIDENCE INTERVAL ESTIMATE for an unknown POPULATION PROPORTION p
A city government needs to determine the percent of its residents that do not have health insurance. The city health department randomly surveys 1600 city residents and finds that 15.25% of the 1600 residents do not have health insurance. Construct and interpret a 95% confidence interval for the true population proportion of all city residents who do not have health insurance.

a. Define the population parameter: p =
b. Define the random variable: P' =

We are using sample data to estimate an unknown proportion for the whole population.
Point Estimate = p'
Confidence Level CL is the area in the middle.
α = 1 – Confidence Level, so α/2 = area in one tail.
$z_{\alpha/2}$ = invnorm(1 − α/2, 0, 1), that is, invnorm(area to left of $z_{\alpha/2}$, 0, 1)
$EBP = z_{\alpha/2}\sqrt{\frac{p'q'}{n}}$
Confidence Interval = (p' – EBP, p' + EBP)

c.
Find the confidence interval by hand, filling in the steps for the following calculations. Round answers to 4 decimal places.
p' = _____________ = ______ , 1 – α = ________ so α = ________ and α/2 = __________
$\sqrt{\frac{p'q'}{n}}$ = ______________ = ________ , $z_{\alpha/2}$ = invNorm( ______ , ______ , ______ ) = _______
$EBP = z_{\alpha/2}\sqrt{\frac{p'q'}{n}}$ = ___________________ = _________
p' – EBP = _______ – ________ = ________ , p' + EBP = _______ + ________ = ________
So the 95% confidence interval is: ___________________________________________________

d. Find the confidence interval using your calculator command, 1-PropZInterval. Key in the following sequence: STAT → TESTS → 1-PropZInterval. Fill in the appropriate values for x, n, and the confidence level in your calculator.
Using 1-PropZInterval: the 95% confidence interval is _____________________________________

e. Find the Error Bound For the Proportion using your confidence interval. Show your work.

f. Sketch a graph of your confidence interval results. Draw the appropriate curve shape. Label the axis. Label the mean and key points on the axis. Shade the area corresponding to the confidence interval. Label the size of all shaded and unshaded areas. Draw a second axis, label it Z, and label the mean. Calculate and label z-scores corresponding to the upper and lower bounds of your CI.

g. Interpret your confidence interval in two ways, in context of the problem.
First Interpretation:
Second Interpretation:

h. The probability distribution $N\left(p', \sqrt{\frac{p'q'}{n}}\right)$ can also be used to find the confidence interval.
Using: invnorm(1 – α/2, p', $\sqrt{\frac{p'q'}{n}}$) = upper bound and invnorm(α/2, p', $\sqrt{\frac{p'q'}{n}}$) = lower bound
will give the right and left bounds of the same confidence interval.
This is because of the property we learned in Chapter 6: for a normal distribution, the probabilities found using the given distribution are the same as the probabilities found using z-scores with the standard normal distribution Z ~ N(0, 1).
Fill in the blanks below:
invnorm(______ , ______ , _______) = ______ and invnorm(______ , ______ , _______) = _______
Confidence Interval = ( ______________, ______________ )

k. Using your calculator software to find the confidence interval:
STAT → TESTS → 1-PropZInterval → ENTER
x: (Fill in the number of successes directly if given, or calculate it as a percent of the total sample size.)
n: (Fill in the sample size.)
C-Level: (Fill in the level of confidence as a decimal.)
Calculate → ENTER
Fill in the blanks, showing any necessary calculations:
x: _____________________ and n: _____________ C-Level: ______________
the Confidence Interval = ( ______________, ______________ )

i. It has been estimated that for the state in which that city is located, approximately 20% of residents do not have health insurance. Based on the confidence interval you found above (in part h), can we conclude that the proportion of city residents who lack health insurance is lower than the proportion of state residents? Explain.

j. It has been estimated that nationally, approximately 16% of residents do not have health insurance. Based on this confidence interval, can we conclude that the proportion of city residents who lack health insurance is lower than the proportion of U.S. residents? Explain.

EXAMPLE 2: CONFIDENCE INTERVAL ESTIMATE for an unknown POPULATION MEAN when the POPULATION STANDARD DEVIATION is KNOWN
A soda bottling plant fills 12-ounce cans with soda. The filling machine varies and does not fill each can with exactly 12 ounces.
To determine if the filling machine needs adjustment, each day the quality control manager measures the amount of soda per can for a random sample of 50 cans. Experience shows that the plant's filling machines have a known (population) standard deviation of 0.35 ounces. In today's sample of 50 cans of soda, the average amount of soda per can is 12.1 ounces with a standard deviation of 0.42 ounces. Construct and interpret a 95% confidence interval estimate for the true population average amount of soda contained in all cans filled today at this bottling plant.

a. X =
b. population parameter: µ =
c. random variable $\bar{X}$ =

We are using sample data to estimate an unknown mean (average) for the whole population.
Point Estimate = $\bar{x}$
Confidence Level CL is the area in the middle.
α = 1 – Confidence Level, so α/2 = area in one tail.
$z_{\alpha/2}$ = invnorm(1 − α/2, 0, 1), that is, invnorm(area to left of $z_{\alpha/2}$, 0, 1)
$EBM = z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}$
Confidence Interval = ($\bar{x}$ – EBM, $\bar{x}$ + EBM)

d. Find the confidence interval by hand, filling in the steps for the following calculations. Round answers to 4 decimal places. Show ALL WORK below. Include both symbols and numerical values of all pieces of your work.
$\bar{x}$ = _____________ = ______ , 1 – α = ________ so α = ________ and α/2 = __________
$\frac{\sigma}{\sqrt{n}}$ = ______________ = ________ , $z_{\alpha/2}$ = invNorm( ______ , ______ , ______ ) = _______
$EBM = z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}$ = ___________________ = _________
$\bar{x}$ – EBM = _______ – ________ = ________ , $\bar{x}$ + EBM = _______ + ________ = ________
So the 95% confidence interval is: ___________________________________________________

e. Sketch a graph of your confidence interval results. Draw the appropriate curve shape. Label the axis. Label the mean and key points on the axis. Shade the area corresponding to the confidence interval. Label the size of all shaded and unshaded areas. Draw a second axis, label it Z, and label the mean. Calculate and label z-scores corresponding to the upper and lower bounds of your CI.

f.
As in the previous example, the probability distribution $N\left(\bar{x}, \frac{\sigma}{\sqrt{n}}\right)$ can also be used to find the confidence interval.
Using: invnorm(1 – α/2, $\bar{x}$, $\frac{\sigma}{\sqrt{n}}$) = upper bound and invnorm(α/2, $\bar{x}$, $\frac{\sigma}{\sqrt{n}}$) = lower bound
will give the right and left bounds of the same confidence interval.
invnorm(____ , _____ , _______ ) = _______ and invnorm(____ , _____ , _______ ) = _______
Confidence Interval = ( ________, ________ )

g. Interpret your confidence interval in two ways, in context of the problem.
First Interpretation:
Second Interpretation:

h. Using your calculator software to find the confidence interval:
STAT → TESTS → ZInterval → ENTER
Input: Data Stats
Choose Data if you are providing a list of data values (L1) with a frequency (L2 or 1). The calculator will use your data to find the sample mean.
Choose Stats if you know $\bar{x}$ and n.
σ:    $\bar{x}$:    n:    C-Level:
Fill in the necessary values.
Fill in the blanks, showing any necessary calculations:
σ: _________ and $\bar{x}$ = _________ n: _____________ C-Level: ______________
the Confidence Interval = ( ______________, ______________ )

EXAMPLE 3: CONFIDENCE INTERVAL ESTIMATE for an unknown POPULATION MEAN when the POPULATION STANDARD DEVIATION is NOT KNOWN

a. The speeds of 20 vehicles are observed by radar on a particular road. For the vehicles in the sample, the average speed is 31.3 miles per hour with a sample standard deviation of 7.0 miles per hour. Construct and interpret a confidence interval estimate of the true population average speed of all vehicles traveling on this road. Use a 90% confidence level.
X = ________________________________________________________________________
population parameter: µ = ________________________________________________________________
random variable $\bar{X}$ = ____________________________________________________________________

b.
Using your calculator software to find the confidence interval:
STAT → TESTS → TInterval → ENTER
Input: Data Stats
Choose Data if you are providing a list of data values (L1) with a frequency (L2 or 1). The calculator will use your data to find the sample mean.
Choose Stats if you know $\bar{x}$, sx, and n.
$\bar{x}$:    sx:    n:    C-Level:
Fill in the necessary values.
Fill in the blanks, showing any necessary calculations:
sx: _________ and $\bar{x}$ = _________ n: _____________ C-Level: ______________
the Confidence Interval = ( ______________, ______________ )

c. Sketch a graph of your confidence interval results. Draw the appropriate curve shape. Label the axis. Label the mean and key points on the axis. Shade the area corresponding to the confidence interval. Label the size of all shaded and unshaded areas. Calculate and label t-scores corresponding to the upper and lower bounds of your CI.

d. Interpret your confidence interval in two ways, in context of the problem.
First Interpretation:
Second Interpretation:

e. Find the Error Bound for the mean, showing all calculations.

f. What is the best point estimate of the true average speed of all vehicles on this road?

g. In Example 3, suppose that you were not given the sample mean and sample standard deviation and instead you were given a list of data for the speeds (in miles per hour) of the 20 vehicles.
19 19 22 24 25 27 28 37 35 30 37 36 39 40 43 30 31 36 33 35
Use these data to find the 90% confidence interval. Did you find exactly the same interval as in your answer to part b above?

NOTE: Using the t-distribution requires that the underlying population of individual values is approximately normally distributed. This assumption is somewhat "robust", meaning it can be violated to some degree.
But if the underlying population of individual values has a distribution that differs too much from the normal distribution, then this confidence interval method would not be appropriate, and statisticians would use other techniques that we do not study in Math 10.

EXAMPLE 4: If we know the confidence interval: Working Backwards
The average nightly costs of hotel rooms for two resort areas are compared. Large random samples of hotel room costs are collected for each city. The resulting confidence intervals are reported in a hotel industry journal.
The 90% confidence interval estimate for the true population average nightly cost of a hotel room in Surf City is $134 to $159 per night.
The 90% confidence interval estimate for the true population average nightly cost of a hotel room in Ski Village is $123 to $141 per night.

a. Find the point estimate for the true average nightly cost of a hotel room in each city. Which city has a higher point estimate?
Surf City:
Ski Village:
Circle the city with the higher point estimate. Surf City Ski Village

b. Find the error bound for each city. Which city has a smaller margin of error?
Surf City:
Ski Village:
Circle the city with the smaller error bound. Surf City Ski Village

c. Based on the confidence intervals only, would it be reasonable to conclude that the true average nightly costs of hotel rooms are different in Surf City and in Ski Village? Answer Yes or No and explain why your answer is reasonable.

d. Would it be true that 90% of hotel rooms cost between $134 and $159 per night in Surf City and that 90% of hotel rooms cost between $123 and $141 per night in Ski Village? Why or why not? Explain!

Exploring Confidence Intervals
Suppose a representative of the dairy industry is interested in the proportion of adults in a certain city who drink milk.
He randomly surveys 650 adults in the city and finds that 28% drink milk.

a. Write the symbol for and define the parameter.
symbol:_____ = _________________________________________________________________

b. Write the symbol for and define the random variable.
symbol:_____ = _________________________________________________________________

c. Write the distribution of the random variable: ______ ~ ______________________________

d. Find a 96% confidence interval for the true proportion of all adults in the city who drink milk when 650 adults are included in the survey. Show your work by hand. Find the error bound for the confidence interval. Draw a well-labeled and shaded graph of the results. Include a second scale for z-scores.
Error Bound:
CI by hand:
Sketch:

e. Find a 92% confidence interval for the true proportion of all adults in the city who drink milk if 650 adults were included in the survey. Show your work by hand. Find the error bound for the confidence interval. Draw a well-labeled and shaded graph of the results. Include a second scale for z-scores.
Error Bound:
CI by hand:
Sketch:

f. Find a 99% confidence interval for the true proportion of all adults in the city who drink milk if 650 adults were included in the survey. Show your work by hand and show the steps when using calculator software. Find the error bound for the confidence interval.
CI using calculator:
Error Bound:

g. Find a 96% confidence interval for the true proportion of all adults in the city who drink milk if 100 adults were included in the survey. Show your work by hand and show the steps when using calculator software. Find the error bound for the confidence interval.
CI using calculator:
Error Bound:

h. Find a 96% confidence interval for the true proportion of all adults in the city who drink milk if 50 adults were included in the survey. Show your work by hand and show the steps when using calculator software. Find the error bound for the confidence interval.
CI using calculator:
Error Bound:

Based on the intervals that you calculated above, circle the correct answers to the following questions:

i. For a constant sample size, if the confidence level is increased,
the confidence interval becomes: wider narrower no change
the error bound becomes: larger smaller no change

j. For a constant sample size, if the confidence level is decreased,
the confidence interval becomes: wider narrower no change
the error bound becomes: larger smaller no change

k. If the confidence level is held constant and the sample size is increased,
the confidence interval becomes: wider narrower no change
the error bound becomes: larger smaller no change

l. If the confidence level is held constant and the sample size is decreased,
the confidence interval becomes: wider narrower no change
the error bound becomes: larger smaller no change
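For checking hand and calculator work in the exploration above, the proportion interval can be sketched in a few lines of Python. This is a minimal sketch, not part of the course materials: the function names `z_critical` and `proportion_interval` are illustrative assumptions, and `statistics.NormalDist().inv_cdf` stands in for the calculator's invnorm. The sample proportion 0.28 and the confidence level and sample size pairs are the ones used in parts d through h.

```python
from statistics import NormalDist


def z_critical(confidence_level):
    """Upper z value bounding the middle area equal to the confidence level."""
    alpha = 1 - confidence_level
    # Same role as invnorm(1 - alpha/2, 0, 1) on the calculator.
    return NormalDist().inv_cdf(1 - alpha / 2)


def proportion_interval(p_prime, n, confidence_level):
    """Return (lower bound, upper bound, EBP) for a population proportion."""
    q_prime = 1 - p_prime
    ebp = z_critical(confidence_level) * (p_prime * q_prime / n) ** 0.5
    return p_prime - ebp, p_prime + ebp, ebp


if __name__ == "__main__":
    # (confidence level, sample size) pairs from parts d through h.
    settings = [(0.96, 650), (0.92, 650), (0.99, 650), (0.96, 100), (0.96, 50)]
    for cl, n in settings:
        lower, upper, ebp = proportion_interval(0.28, n, cl)
        print(f"CL={cl:.2f}  n={n:4d}  EBP={ebp:.4f}  CI=({lower:.4f}, {upper:.4f})")
```

Comparing the printed error bounds row by row is one way to confirm the circled answers in parts i through l after working them by hand.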