251param 04/28/06 (Open this document in 'Outline' view!)

O. Estimation of Parameters.

1. Point and Interval Estimation. Properties of Estimators.
Let $\hat{\theta}$ be an estimator for $\theta$.
a. Unbiasedness: $E(\hat{\theta}) = \theta$.
b. Consistency (As sample size gets larger, the estimate gets better.)
c. Efficiency ($\hat{\theta}$ has a small variance). Define BLUE.
d. Maximum Likelihood ($\hat{\theta}$ is the value of $\theta$ that is most likely to have produced the observed data)

2. A Confidence Interval for $\mu$ When $\sigma$ is Known. $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$
You can only use this when you know the population variance. Don't forget that there are two formulas for the standard error $\sigma_{\bar{x}}$ depending on sample size!
An interval of this type is used in two situations: (i) where the population variance, $\sigma^2$, is, in fact, known and the sample size is relatively large; or (ii) where the variance is not known and the sample variance, $s^2$, is used to replace $\sigma^2$, but the degrees of freedom $n-1$ are so large that the appropriate value of $t^{(n-1)}$ is not very different from $z$. The first of these situations is not very realistic, but serves as a good introduction to confidence intervals.
The formula for this type of confidence interval for the mean is $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$, where the standard deviation of the sample mean, called the standard error, is $\sigma_{\bar{x}} = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}}$.
Note: If $n > .05N$, use $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$ ($n$ is sample size and $N$ is population size).

Example 1: Assume that a population is Normally distributed with an unknown mean and a population standard deviation of 36. $x \sim N(?, 36)$ From a random sample of size $n = 9$, we get a sample mean of 62. (Because the population variance is known we can ignore any sample variance we might compute from the data.) Find a 95% confidence interval for the mean.
Step 1: State the confidence level and significance level. The given confidence level of 95% represents the probability that the interval actually contains the mean and is stated as $1 - \alpha = .95$. The significance level of 5% represents the probability of being wrong and is $\alpha = .05$.
Step 2: Find the appropriate value of z.
Use the last line of Table 18 (or Table 17) in the Syllabus Supplement to find $z_{\alpha/2} = z_{.025} = 1.960$ (the bottom number in the .025 column). Note that higher confidence levels give larger values of z, and thus larger confidence intervals.
Step 3: Find the standard error. $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{36}{\sqrt{9}} = 12$. Note that larger values of $n$ make the standard error and the confidence interval smaller. Note: If $n > .05N$, use the finite-population formula for $\sigma_{\bar{x}}$ instead.
Step 4: Put it together. $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 62 \pm 1.960(12) = 62 \pm 23.52$. The last part of this expression means that the interval extends from 62 - 23.52 = 38.48 to 62 + 23.52 = 85.52. The result can be written $P(38.48 \le \mu \le 85.52) = .95$.

Example 2: Assume that a population is Normally distributed with an unknown mean and a population standard deviation of 36. $x \sim N(?, 36)$ From a random sample of size $n = 9$, we get a sample mean of 62. (Because the population variance is known we can ignore any sample variance we might compute from the data.) This time find a 99% confidence interval for the mean.
Step 1: State the confidence level and significance level. The given confidence level of 99% represents the probability that the interval actually contains the mean and is stated as $1 - \alpha = .99$. The significance level of 1% represents the probability of being wrong and is $\alpha = .01$.
Step 2: Find the appropriate value of z. Use the last line of Table 18 (or Table 17) in the Syllabus Supplement to find $z_{\alpha/2} = z_{.005} = 2.576$ (the bottom number in the .005 column). Note that higher confidence levels give larger values of z, and thus larger confidence intervals.
Step 3: Find the standard error. $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{36}{\sqrt{9}} = 12$. No change from Example 1.
Step 4: Put it together. $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 62 \pm 2.576(12) = 62 \pm 30.91$. The result can be written $P(31.09 \le \mu \le 92.91) = .99$. Or make a Normal curve with 62 in the middle and 31.09 and 92.91 on the sides. Label the area between 31.09 and 92.91 with 99%, the area below 31.09 with 0.5% and the area above 92.91 with 0.5%.
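The four steps used in the two examples above can be checked with a short script. This is only a sketch: the function name `z_interval` and its parameters are made up for illustration, and the value of z comes from Python's standard `statistics.NormalDist` rather than from the table, so it carries a few more decimal places than 1.960 or 2.576. No finite population correction is applied.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence):
    """Two-sided confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence  # Step 1: significance level
    # Step 2: z_{alpha/2}, the point with area alpha/2 above it
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 3: standard error of the sample mean (no finite population correction)
    std_error = sigma / sqrt(n)
    # Step 4: put it together
    margin = z * std_error
    return xbar - margin, xbar + margin


for confidence in (0.95, 0.99):
    low, high = z_interval(xbar=62, sigma=36, n=9, confidence=confidence)
    print(f"{confidence:.0%} interval: {low:.2f} to {high:.2f}")
```

Rounded to two decimals, the limits should agree with the intervals worked out by hand in Examples 1 and 2.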

Definitions: Note that if we are considering the possibility that the population mean is 50, we can now say that, since this value is on the confidence interval, the mean is not significantly different from 50. However, the mean is significantly different from 20 or 100, since both values are outside the interval.
Remember that a confidence level is the probability that a given confidence interval is correct. The usual interpretation of a confidence level of 95% is that if we take samples of $n$ items and use the methods given here many times, 95% of the time the interval will contain the population mean. The significance level is the probability that the interval will not contain the population mean. If we say that the population mean is significantly different from 20 and our significance level is 5%, we are saying that there is a probability of 5% or less that the observed data could have been generated by a distribution with a population mean of 20.

3. A Confidence Interval for $\mu$ When $\sigma$ is not known. $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}}$
This is what you actually use most of the time! All that "$\sigma$ unknown" means is that we do not have a value of the population variance. If you only have the sample variance, use the t table.
Finding degrees of freedom is easy. In most of the problems that we do, the number of degrees of freedom is one less than the sample size, or $n-1$. The value of t that you need should be in Table 18 in the Syllabus Supplement. Each row represents the number of degrees of freedom given by the 'df' column. It is a good idea to take a ruler and put a line across the table after every 10th row. Note that the table skips values after 100 degrees of freedom, but a good guess is always possible, for example $t_{.05}^{(110)} = 1.659$.
"The variance is not known" implies no previous knowledge or assumption about the value of the population variance, $\sigma^2$. Knowing the sample variance, $s^2$, is having a good guess as to what the variance is; it is not the same as knowing the variance.
If the population distribution is normal or approximately normal, the formula for a two-sided confidence interval for the mean is $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}}$, where $s_{\bar{x}} = \frac{s}{\sqrt{n}}$.
Note: If $n > .05N$, use $s_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$. Be careful: it is a common error to think that a new population size is actually a sample size.
Note: this is the more common case. If you do not know the population variance and the sample size is not very large, using z instead of t is a very bad idea.

Example 1: We have a random sample of 10 homes. The sample mean of expenditures on maintenance is $838 with a sample standard deviation of $110. Construct a 95% confidence interval for the mean.
Step 1: State the confidence level and significance level. The given confidence level of 95% represents the probability that the interval actually contains the mean and is stated as $1 - \alpha = .95$. The significance level of 5% represents the probability of being wrong and is $\alpha = .05$.
Step 2: Find the appropriate value of t. Use Table 18 in the Syllabus Supplement to find $t_{\alpha/2}^{(n-1)} = t_{.025}^{(9)} = 2.262$ (the number in the .025 column and the 9th row). Note that higher confidence levels and lower numbers of degrees of freedom give larger values of t, and thus larger confidence intervals.
Step 3: Find the standard error. $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{110}{\sqrt{10}} = 34.79$.
Step 4: Put it together. $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}} = 838 \pm 2.262(34.79) = 838 \pm 78.7$. The result can be written $P(759.3 \le \mu \le 916.7) = .95$. Or make a 'Normal' curve with 838 in the middle and 759.3 and 916.7 on the sides. Label the area between 759.3 and 916.7 with 95%, the area below 759.3 with 2.5% and the area above 916.7 with 2.5%.

Example 2: Find a 98% confidence interval for the mean when $\bar{x} = 22$, $n = 100$, $s_x = 11$.
Step 1: Confidence level is 98%, so that the significance level is .02.
Step 2: Since $s_x$ is a sample standard deviation, use $t_{\alpha/2}^{(n-1)} = t_{.01}^{(99)} = 2.364$.
Step 3: $s_{\bar{x}} = \frac{s_x}{\sqrt{n}} = \frac{11}{\sqrt{100}} = 1.1$.
Step 4: $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}} = 22 \pm 2.364(1.1) = 22 \pm 2.60$. You should express this as an interval.

Example 3: We visit a town of 5000 families.
We take a sample of 900 families and find a sample mean of $8536 and a sample standard deviation of $436. Find a 90% confidence interval for the mean. $\bar{x} = 8536$, $s_x = 436$, $n = 900$ and $N = 5000$.
Step 1: Confidence level is 90%, so that the significance level is .10.
Step 2: Since the degrees of freedom are $n - 1 = 899$, we run off the table. If the degrees of freedom are much over 200, use the value from the infinity line. $t_{\alpha/2}^{(n-1)} = t_{.05}^{(899)} = 1.645$.
Step 3: This is the big change. Since the sample is more than 5% of the population, use the finite population correction. $s_{\bar{x}} = \frac{s_x}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{436}{\sqrt{900}}\sqrt{\frac{4100}{4999}} = 14.5333(0.9056) = 13.16$. Note that the smaller the population, the more the finite population correction will shrink the standard error.
Step 4: $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}} = 8536 \pm 1.645(13.16) = 8536 \pm 21.65$. You should express this as an interval.

4. A Confidence Interval for a Proportion
See 251 proport.

For other confidence intervals see Table 3, "Formulas for Confidence Intervals and Hypothesis Tests", at http://courses.wcupa.edu/rbove/eco252/252form.doc.

©2002 Roger Even Bove
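
The arithmetic in the Section 3 examples, including the finite population correction used in Example 3, can be checked with a short script. This is a sketch under stated assumptions: the function names `standard_error` and `t_interval` are made up for illustration, and the value of t is not computed but passed in exactly as read from the t table, as in the worked steps.

```python
from math import sqrt


def standard_error(s, n, N=None):
    """Estimated standard error of the sample mean.

    The finite population correction is applied only when a population
    size N is supplied and the sample is more than 5% of it.
    """
    se = s / sqrt(n)
    if N is not None and n > 0.05 * N:
        se *= sqrt((N - n) / (N - 1))
    return se


def t_interval(xbar, s, n, t_value, N=None):
    """Two-sided interval xbar +/- t * s_xbar; t_value is read from a t table."""
    margin = t_value * standard_error(s, n, N)
    return xbar - margin, xbar + margin


# Example 1: 10 homes, 9 degrees of freedom, t = 2.262 from the .025 column
print(t_interval(xbar=838, s=110, n=10, t_value=2.262))

# Example 3: 900 of 5000 families, t = 1.645 from the infinity line
print(t_interval(xbar=8536, s=436, n=900, t_value=1.645, N=5000))
```

Each call returns the lower and upper limits as a pair, which is one way of expressing the result as an interval. Leaving `N` out gives the uncorrected standard error used in Examples 1 and 2.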