Mr. Mark Anthony Garcia, M.S.
Mathematics Department
De La Salle University

Recall: Parameter
A parameter is a numerical value associated with a population. Some examples are the population mean $\mu$, population standard deviation $\sigma$, population variance $\sigma^2$, and population proportion $p$.

Recall: Statistic
A statistic is a numerical value associated with a sample. Some examples are the sample mean $\bar{x}$, sample standard deviation $s$, sample variance $s^2$, and sample proportion $\hat{p}$.

Estimation
In this chapter, we are concerned with estimating the value of a population parameter using the value of a sample statistic. In particular, we estimate the value of the population proportion $p$ using the value of the sample proportion $\hat{p}$.

Estimation is a procedure by which a numerical value or values are assigned to a population parameter based on the information collected from a sample.

Examples of using Estimation
1. A political organization may want to estimate the proportion of voters that will vote for their presidential candidate.
2. A TV network may want to estimate the proportion of TV viewers who watch a particular primetime show.

Situation: Estimation
Suppose that a political analyst wants to estimate the percentage or proportion of voters that will vote for senatorial candidate A in city X in the upcoming senatorial elections. The population under study is the set of all voters in city X. Suppose that the population size is $N = 1000$ voters.

The political analyst used simple random sampling to obtain a sample of $n = 50$ voters and asked these voters whether or not they will vote for senatorial candidate A. The survey showed that 34 of the 50 voters will vote for senatorial candidate A.

The result gives a sample proportion of $\hat{p} = 34/50 = 0.68$; that is, 68% of the sample will vote for senatorial candidate A. However, this result is obtained from a single sample of $n = 50$ voters.
The sample that the political analyst obtained is only one of many possible samples of 50 voters.

If the political analyst takes another sample of 50 voters (voters from the previous sample may be included in the new sample), he will generally get a different sample proportion or percentage.

The problem of the political analyst is to estimate the value of the population proportion $p$ when different samples yield different sample proportions. The question is: what percentage of the $N = 1000$ voters in city X will vote for senatorial candidate A in the upcoming elections?

To answer the problem of the political analyst, we use interval estimation of parameters. Before we discuss this topic, let us differentiate point estimation from interval estimation.

Point Estimation
Point estimation is the procedure of assigning a single value to the population parameter. Normally, we use the value of the sample statistic as the point estimate for the population parameter.
Disadvantage: There can be many different point estimates, since different samples may be obtained from the population.

Example: Point Estimation
In the situation of the political analyst, we can use the sample proportion $\hat{p} = 34/50 = 0.68$, or 68%, as a point estimate for the population proportion $p$.

Interval Estimation
Interval estimation is the procedure of assigning a range of values to a population parameter, that is, of obtaining an interval in which the population parameter is expected to lie.

The interval depends on the point estimate, the standard error, and the maximum error. Each interval is constructed with regard to a given confidence level and is called a confidence interval.

Confidence Level
The confidence level attached to a confidence interval is viewed as the probability that the population parameter lies inside the confidence interval. It is of the form $(1 - \alpha)100\%$.
If the confidence level is 95%, then it is of the form $(1 - 0.05)100\% = (0.95)100\% = 95\%$. This means that $\alpha = 0.05$.

The value of $\alpha$ is viewed as the probability that the population parameter is not inside the confidence interval. Now, let us interpret the 95% confidence level using the political analyst situation.

Suppose that the political analyst obtained 100 samples of 50 voters each and constructed a 95% confidence interval from each sample. Then we expect about 95 of the 100 intervals to contain the population proportion $p$.

Interval Estimation of Proportion
The sample proportion is given by $\hat{p} = x/n$, where $x$ is the number of successes in the sample and $n$ is the sample size. The standard error is given by the formula $SE = \sqrt{\hat{p}\hat{q}/n}$, where $\hat{q} = 1 - \hat{p}$. This can be obtained from the PhStat output.

The formula for the confidence interval estimate for the population proportion $p$ is

sample proportion ± maximum error,

where the maximum error is $|z_{\alpha/2}| \cdot SE$. We use minus for the lower bound and plus for the upper bound.

For the political analyst's sample, $SE = \sqrt{(0.68)(0.32)/50} \approx 0.06597$. At the 95% confidence level, $|z_{\alpha/2}| = 1.96$, so the maximum error is $1.96 \times 0.06597 \approx 0.1293$ and the interval is $0.68 \pm 0.1293$.

Thus, we are 95% confident that the proportion of voters that will vote for senatorial candidate A lies in the interval [0.5507, 0.8093], that is, between 55.07% and 80.93%.

Example 1: Interval Estimation
In a random sample of 500 people eating lunch at a hospital cafeteria on various Fridays, it was found that 160 preferred seafood.
A. Find a 95% confidence interval for the actual proportion of people who prefer seafood on Fridays at this cafeteria.
B. How large a sample is required if we want to be 95% confident that our estimate is within 0.02?

Example 1: Interval Estimation (Part A)
Given: $\hat{p} = 160/500 = 0.32$; the confidence level is 95%.

PhStat output:
Data
  Sample Size                         500
  Number of Successes                 160
  Confidence Level                    95%
Intermediate Calculations
  Sample Proportion                   0.32
  Z Value                             -1.959963985   (|z_{α/2}| = 1.96)
  Standard Error of the Proportion    0.020861448
  Interval Half Width                 0.040887686    (maximum error)
Confidence Interval
  Interval Lower Limit                0.279112314
  Interval Upper Limit                0.360887686

We are 95% confident that the proportion of people eating lunch at the hospital cafeteria on Fridays who prefer seafood lies in the interval [0.2791, 0.3609], that is, between 27.91% and 36.09%.

Sample Size Determination
If the sample proportion $\hat{p}$ is used as an estimate for $p$, then we can be $(1 - \alpha)100\%$ confident that the error will not exceed a specified maximum error $e$ when the sample size is

$n = \dfrac{z_{\alpha/2}^2 \, \hat{p}\hat{q}}{e^2}$.

Example 1: Interval Estimation (Part B)
How large a sample is required if we want to be 95% confident that our estimate is within 0.02?
The maximum error $e = 0.02$ is given. The problem asks for the sample size needed so that the error will not exceed 0.02.

PhStat output:
Data
  Estimate of True Proportion         0.32
  Sampling Error                      0.02
  Confidence Level                    95%
Intermediate Calculations
  Z Value                             -1.95996398
  Calculated Sample Size              2089.753598
Result
  Sample Size Needed                  2090

Example 2: Interval Estimation
In a random sample of 1000 homes in a certain city, it is found that 228 are heated by oil. Find a 99% confidence interval for the proportion of homes in this city that are heated by oil.

PhStat output:
Data
  Sample Size                         1000
  Number of Successes                 228
  Confidence Level                    99%
Intermediate Calculations
  Sample Proportion                   0.228
  Z Value                             -2.5758293
  Standard Error of the Proportion    0.013267102
  Interval Half Width                 0.034173791

The lower and upper bounds are not given in this output. Thus, we use the formula sample proportion ± maximum error. In the output, the maximum error is the interval half width. Hence, we have sample proportion ± interval half width.

Thus, we get 0.228 ± 0.034173791.
Therefore, we are 99% confident that the proportion of all homes in that city that are heated by oil lies in the interval [0.1938, 0.2622], that is, between 19.38% and 26.22%.
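
The slides read these values off PhStat output. As a cross-check, the same arithmetic can be sketched in a few lines of Python. This is only an illustrative sketch, not the PhStat procedure itself: the function names `z_critical`, `proportion_ci`, and `required_sample_size` are hypothetical, and the critical value is taken from the standard library's `statistics.NormalDist` instead of a spreadsheet.

```python
import math
from statistics import NormalDist


def z_critical(confidence):
    """Return |z_{alpha/2}| for a two-sided confidence level such as 0.95."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def proportion_ci(successes, n, confidence=0.95):
    """Return (lower, upper) computed as sample proportion +/- maximum error."""
    p_hat = successes / n
    q_hat = 1.0 - p_hat
    standard_error = math.sqrt(p_hat * q_hat / n)
    max_error = z_critical(confidence) * standard_error  # interval half width
    return p_hat - max_error, p_hat + max_error


def required_sample_size(p_hat, max_error, confidence=0.95):
    """Return n = z^2 * p_hat * q_hat / e^2, rounded up to a whole number."""
    z = z_critical(confidence)
    return math.ceil(z ** 2 * p_hat * (1.0 - p_hat) / max_error ** 2)


if __name__ == "__main__":
    # (label, number of successes, sample size, confidence level)
    examples = [
        ("Voters for candidate A", 34, 50, 0.95),
        ("Seafood on Fridays", 160, 500, 0.95),
        ("Homes heated by oil", 228, 1000, 0.99),
    ]
    for label, x, n, conf in examples:
        lower, upper = proportion_ci(x, n, conf)
        print(f"{label}: [{lower:.4f}, {upper:.4f}]")

    print("Sample size for e = 0.02:", required_sample_size(0.32, 0.02, 0.95))
```

Run as a script, this should reproduce the intervals quoted in the examples, [0.5507, 0.8093], [0.2791, 0.3609], and [0.1938, 0.2622], together with the sample size of 2090 from Example 1 B. Rounding the sample size up rather than to the nearest integer keeps the error at or below the requested maximum.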