Sections 7.1 and 7.2

This chapter presents the beginning of inferential statistics. The two major applications of inferential statistics are:

Estimate a population parameter: proportion, mean.
Test some claim (or hypothesis) about a population.

Point Estimate and Confidence Interval

Point estimate: a single number.
Interval estimate: an interval of numbers.

Why use an interval? A point estimate is not reliable under re-sampling. A confidence interval (CI) is an interval of values used to estimate the true population parameter.

$p$ = population proportion
$\hat{p} = x/n$ (pronounced 'p-hat') = sample proportion of $x$ successes in a sample of size $n$. This is the unbiased estimate (best estimate) of $p$.
$\hat{q} = 1 - \hat{p}$ = sample proportion of failures in a sample of size $n$

Example: Photo-Cop Survey Responses

829 adult Minnesotans were surveyed, and 51% of them are opposed to the use of the photo-cop for issuing traffic tickets. Using these survey results, find the best estimate of the proportion of all adult Minnesotans opposed to photo-cop use.

Best point estimate = sample proportion = 51%.

Confidence Level

$\alpha$: a number between 0 and 1.
A confidence level is $1 - \alpha$, or $100(1 - \alpha)\%$. E.g. 95%.
This is the proportion of times that the confidence interval actually does contain the population parameter, assuming that the estimation process is repeated a large number of times.
Other names: degree of confidence or the confidence coefficient.

The Critical Value (z-score)

Finding $z_{\alpha/2}$ for a $100(1 - \alpha)\%$ confidence level: given $\alpha = 5\%$, $\alpha/2 = 2.5\% = 0.025$, so $z_{\alpha/2} = z_{0.025} = 1.96$.

Sampling Distribution of $\hat{p}$

The sampling distribution of the sample proportion can be approximated by a normal distribution if $np \ge 15$ and $nq \ge 15$: $\hat{p}$ is approximately $N(p, pq/n)$, where $q = 1 - p$. The corresponding z-score is

$z = \dfrac{\hat{p} - p}{\sqrt{\hat{p}\hat{q}/n}}$

Margin of Error of $\hat{p}$

The margin of error $E$ is the maximum likely (with probability $1 - \alpha$) difference between the observed proportion $\hat{p}$ and the true population proportion $p$.
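As a rough illustration of the critical value and the normal-approximation condition described above, here is a minimal Python sketch. The function names `critical_value` and `normal_approx_ok` are hypothetical and chosen only for this example. It uses `statistics.NormalDist` from the Python standard library (3.8+) in place of a z-table, and it assumes that the sample proportion can stand in for the unknown $p$ when checking the $np \ge 15$, $nq \ge 15$ rule.

```python
from statistics import NormalDist


def critical_value(confidence_level):
    """Return z_{alpha/2} for a confidence level such as 0.95."""
    alpha = 1 - confidence_level
    # z_{alpha/2} leaves an area of alpha/2 in the upper tail of N(0, 1),
    # so it is the (1 - alpha/2) quantile of the standard normal.
    return NormalDist().inv_cdf(1 - alpha / 2)


def normal_approx_ok(n, p):
    """Check the np >= 15 and nq >= 15 rule (assumption: p is replaced
    by the sample proportion when the true p is unknown)."""
    q = 1 - p
    return n * p >= 15 and n * q >= 15


print(round(critical_value(0.95), 2))   # 1.96
print(normal_approx_ok(829, 0.51))      # True for the photo-cop survey
```

The 95% case reproduces the familiar value 1.96. Other confidence levels can be passed in the same way.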
The margin of error is computed as

$E = z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}$

Standard Error of $\hat{p}$

$se = \sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}$

Finding the 95% Confidence Interval for a Population Proportion

A 95% confidence interval for a population proportion $p$ is $\hat{p} \pm 1.96(se)$, with $se = \sqrt{\hat{p}(1 - \hat{p})/n}$.

In general, a $100(1 - \alpha)\%$ confidence interval for $p$ is $\hat{p} \pm z_{\alpha/2}(se)$.

Example: Would You Pay Higher Prices to Protect the Environment?

In 2000, the GSS asked: "Are you willing to pay much higher prices in order to protect the environment?" Of $n = 1154$ respondents, 518 were willing to do so. Find and interpret a 95% confidence interval for the population proportion of adult Americans willing to do so at the time of the survey.

$\hat{p} = \dfrac{518}{1154} = 0.45$

$se = \sqrt{\dfrac{(0.45)(0.55)}{1154}} = 0.015$

$E = 1.96(se) = 1.96(0.015) = 0.03$

$\hat{p} \pm E = 0.45 \pm 0.03 = (0.42, 0.48)$

Interpretation: we can be 95% confident that between 42% and 48% of adult Americans were willing to pay much higher prices to protect the environment at the time of the survey.

What is the Error Probability for the Confidence Interval Method?

The error probability is $\alpha = 1 - $ (confidence level). For a 95% confidence interval it is 0.05.

Summary: Effects of Confidence Level and Sample Size on Margin of Error

The margin of error for a confidence interval:
Increases as the confidence level increases.
Decreases as the sample size increases.

Determining Sample Size

Recall: $E = z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}$. Solving for $n$ by algebra gives

$n = \dfrac{(z_{\alpha/2})^2\,\hat{p}\hat{q}}{E^2}$

Sample Size for Estimating Proportion p

When an estimate $\hat{p}$ of $p$ is known:

$n = \dfrac{(z_{\alpha/2})^2\,\hat{p}\hat{q}}{E^2}$

When no estimate of $p$ is known:

$n = \dfrac{(z_{\alpha/2})^2 \cdot 0.25}{E^2}$

Example: Suppose a sociologist wants to determine the current percentage of U.S. households using e-mail. How many households must be surveyed in order to be 95% confident that the sample percentage is in error by no more than four percentage points?

a) Use this result from an earlier study: In 1997, 16.9% of U.S. households used e-mail (based on data from The World Almanac and Book of Facts).
b) Assume that we have no prior information suggesting a possible value of p.
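The confidence interval calculation in the GSS example above can be sketched in a few lines of Python. This is a minimal sketch, not a definitive implementation: the function name `proportion_ci` and its return layout are assumptions made for illustration, and the critical value comes from `statistics.NormalDist` rather than the rounded 1.96 used on the slides.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(x, n, confidence_level=0.95):
    """Return (p_hat, se, E, (lower, upper)) for x successes in n trials."""
    p_hat = x / n                          # sample proportion
    se = sqrt(p_hat * (1 - p_hat) / n)     # standard error of p_hat
    alpha = 1 - confidence_level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    margin = z * se                        # margin of error E
    return p_hat, se, margin, (p_hat - margin, p_hat + margin)


# GSS example: 518 of 1154 respondents were willing to pay higher prices.
p_hat, se, margin, (lower, upper) = proportion_ci(518, 1154)
print(f"{p_hat:.2f} {se:.3f} {margin:.2f} ({lower:.2f}, {upper:.2f})")
# 0.45 0.015 0.03 (0.42, 0.48)
```

The unrounded intermediate values differ slightly from the hand calculation, which rounds at each step, but they round to the same interval.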
a) Use this result from an earlier study: In 1997, 16.9% of U.S. households used e-mail (based on data from The World Almanac and Book of Facts).

$n = \dfrac{(z_{\alpha/2})^2\,\hat{p}\hat{q}}{E^2} = \dfrac{(1.96)^2 (0.169)(0.831)}{0.04^2} = 337.194$, which rounds up to 338 households.

To be 95% confident that our sample percentage is within four percentage points of the true percentage for all households, we should randomly select and survey 338 households.

b) With no prior information suggesting a possible value of p:

$n = \dfrac{(z_{\alpha/2})^2 \cdot 0.25}{E^2} = \dfrac{(1.96)^2 (0.25)}{0.04^2} = 600.25$, which rounds up to 601 households.

With no prior information, we need a larger sample to achieve the same results with 95% confidence and an error of no more than 4%.

Finding the Point Estimate and E from a Confidence Interval

Point estimate of p:

$\hat{p} = \dfrac{(\text{upper confidence limit}) + (\text{lower confidence limit})}{2}$

Margin of error:

$E = \dfrac{(\text{upper confidence limit}) - (\text{lower confidence limit})}{2}$
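The sample-size formulas and the recovery of $\hat{p}$ and $E$ from an interval can be checked with a short Python sketch. The names `sample_size` and `interval_to_estimate` are hypothetical helpers introduced here for illustration. The sketch assumes the usual convention of rounding the required sample size up to the next whole number, and it uses the exact normal quantile instead of the rounded 1.96, which does not change the answers in this example.

```python
from math import ceil
from statistics import NormalDist


def sample_size(margin, confidence_level=0.95, p_hat=None):
    """Sample size needed to estimate p within `margin`.

    If no prior estimate p_hat is given, use p_hat * q_hat = 0.25,
    the largest possible value of the product.
    """
    alpha = 1 - confidence_level
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value z_{alpha/2}
    pq = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    return ceil(z ** 2 * pq / margin ** 2)    # always round up


def interval_to_estimate(lower, upper):
    """Recover the point estimate and margin of error from a CI."""
    p_hat = (upper + lower) / 2
    margin = (upper - lower) / 2
    return p_hat, margin


print(sample_size(0.04, p_hat=0.169))  # 338, part a)
print(sample_size(0.04))               # 601, part b)

p_hat, margin = interval_to_estimate(0.42, 0.48)
print(round(p_hat, 2), round(margin, 2))  # 0.45 0.03
```

Using 0.25 when no estimate is available is conservative: it can only overstate the sample size needed, never understate it.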