251param 04/28/06
(Open this document in 'Outline' view!)
O. Estimation of Parameters.
1. Point and Interval Estimation. Properties of Estimators.

Let $\hat{\theta}$ be an estimator for $\theta$.
a. Unbiasedness: $E(\hat{\theta}) = \theta$.
b. Consistency (as sample size gets larger, the estimate gets better).
c. Efficiency ($\hat{\theta}$ has a small variance). Define BLUE (Best Linear Unbiased Estimator).
d. Maximum Likelihood ($\hat{\theta}$ is the value of $\theta$ that is most likely to have produced the observed data).
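The first two properties can be seen in a small simulation. This is only a sketch: the population (Normal with mean 50 and standard deviation 10), the sample sizes, the number of replications and the function name `sample_mean_distribution` are all assumptions made for illustration, and the estimator used is the sample mean.

```python
import random
import statistics

def sample_mean_distribution(n, reps, mu=50.0, sigma=10.0, seed=0):
    """Draw `reps` samples of size n from N(mu, sigma); return their sample means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
            for _ in range(reps)]

small = sample_mean_distribution(n=10, reps=2000)
large = sample_mean_distribution(n=100, reps=2000)

# Unbiasedness: the average of many sample means sits close to mu = 50.
print(round(statistics.fmean(small), 2), round(statistics.fmean(large), 2))

# Consistency: the spread of the estimator shrinks as n grows.
print(round(statistics.pvariance(small), 2), round(statistics.pvariance(large), 2))
```

The second line of output should show a much smaller variance for n = 100 than for n = 10.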
2. A Confidence Interval for $\mu$ When $\sigma$ is Known.
$\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$. You can only use this when you know the population variance. Don't forget that there are two formulas for the standard error $\sigma_{\bar{x}}$, depending on sample size!
An interval of this type is used in two situations: (i) where the population variance, $\sigma^2$, is, in fact, known and the sample size is relatively large; or (ii) where the variance is not known and the sample variance, $s^2$, is used to replace $\sigma^2$, but the degrees of freedom $n - 1$ are so large that the appropriate value of $t^{(n-1)}$ is not very different from $z$. The first of these situations is not very realistic, but serves as a good introduction to confidence intervals. The formula for this type of confidence interval for the mean is $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$, where the standard deviation of the sample mean, called the standard error, is $\sigma_{\bar{x}} = \sqrt{\sigma^2 / n} = \sigma / \sqrt{n}$.
Note: If $n > .05N$, use $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$ ($n$ is sample size and $N$ is population size).
Example 1: Assume that a population is Normally distributed with an unknown mean and a population standard deviation of 36, $x \sim N(?, 36)$. From a random sample of size $n = 9$, we get a sample mean of 62. (Because the population variance is known, we can ignore any sample variance we might compute from the data.) Find a 95% confidence interval for the mean.
Step 1: State the confidence level and significance level. The given confidence level of 95% represents the probability that the interval actually contains the mean and is stated as $1 - \alpha = .95$. The significance level of 5% represents the probability of being wrong and is $\alpha = .05$.
Step 2: Find the appropriate value of z. Use the last line of Table 18 (or Table 17) in the Syllabus Supplement to find $z_{\alpha/2} = z_{.025} = 1.960$ (the bottom number in the .025 column). Note that higher confidence levels give larger values of $z$, and thus larger confidence intervals.
Step 3: Find the standard error. $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{36}{\sqrt{9}} = 12$. Note that larger values of $n$ make the standard error and the confidence interval smaller.
Step 4: Put it together. $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 62 \pm 1.960(12) = 62 \pm 23.52$. The last part of this expression means that the interval extends from 62 - 23.52 = 38.48 to 62 + 23.52 = 85.52. The result can be written $P(38.48 \le \mu \le 85.52) = .95$.
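The four steps of Example 1 can be checked with a short script. This is a sketch, not part of the original notes: the function name `z_confidence_interval` is made up for illustration, and the z value comes from Python's `statistics.NormalDist` rather than from Table 18.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence):
    """Two-sided interval for the mean when the population sigma is known."""
    alpha = 1.0 - confidence                       # Step 1: significance level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)    # Step 2: z_{alpha/2}
    std_error = sigma / sqrt(n)                    # Step 3: standard error
    margin = z * std_error                         # Step 4: put it together
    return xbar - margin, xbar + margin

low, high = z_confidence_interval(xbar=62, sigma=36, n=9, confidence=0.95)
print(f"{low:.2f} to {high:.2f}")   # 38.48 to 85.52
```

The script does not apply the finite population correction, since Example 1 gives no population size.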


Example 2: Assume that a population is Normally distributed with an unknown mean and a population standard deviation of 36, $x \sim N(?, 36)$. From a random sample of size $n = 9$, we get a sample mean of 62. (Because the population variance is known, we can ignore any sample variance we might compute from the data.) This time find a 99% confidence interval for the mean.
Step 1: State the confidence level and significance level. The given confidence level of 99% represents the probability that the interval actually contains the mean and is stated as $1 - \alpha = .99$. The significance level of 1% represents the probability of being wrong and is $\alpha = .01$.
Step 2: Find the appropriate value of z. Use the last line of Table 18 (or Table 17) in the Syllabus Supplement to find $z_{\alpha/2} = z_{.005} = 2.576$ (the bottom number in the .005 column). Note that higher confidence levels give larger values of $z$, and thus larger confidence intervals.
Step 3: Find the standard error. $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{36}{\sqrt{9}} = 12$. No change from Example 1.
Step 4: Put it together. $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 62 \pm 2.576(12) = 62 \pm 30.91$. The result can be written $P(31.09 \le \mu \le 92.91) = .99$. Or make a Normal curve with 62 in the middle and 31.09 and 92.91 on the sides. Label the area between 31.09 and 92.91 with 99%, the area below 31.09 with 0.5% and the area above 92.91 with 0.5%.
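To see how the interval widens as the confidence level rises, the same data can be run through several levels at once. A minimal sketch: the 90% level is added only for comparison and does not appear in the examples, and the z values are computed with `statistics.NormalDist` instead of being read from the table.

```python
from math import sqrt
from statistics import NormalDist

sigma, n, xbar = 36, 9, 62
std_error = sigma / sqrt(n)          # 12, the same for every confidence level

margins = {}
for confidence in (0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    margins[confidence] = z * std_error
    low, high = xbar - margins[confidence], xbar + margins[confidence]
    print(f"{confidence:.0%}: z = {z:.3f}, interval = {low:.2f} to {high:.2f}")
```

Only the z value changes from line to line; the standard error stays at 12.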
Definitions:
Note that if we are considering the possibility that the population mean is 50, we can now say that, since this value is on the confidence interval, the mean is not significantly different from 50.
However, the mean is significantly different from 20 or 100, since both of these values lie outside the interval.
Remember that a confidence level is the probability that a given confidence interval is correct. The usual
interpretation of a confidence level of 95% is that if we take samples of n items and use the methods given
here many times, 95% of the time the interval will contain the population mean. The significance level is
the probability that the interval will not contain the population mean. If we say that the population mean is
significantly different from 20 and our significance level is 5%, we are saying that there is a probability of
5% or less that the observed data could have been generated by a distribution with a population mean of 20.
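The "many samples" interpretation of the confidence level can be tried out by simulation. The following is a sketch under stated assumptions: it pretends the true mean is 62 (in practice it is unknown), reuses the known standard deviation of 36 and the sample size of 9 from the examples, and the function name `coverage`, the seed and the number of replications are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, confidence, reps, seed=1):
    """Share of simulated z intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(reps):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / reps

rate = coverage(mu=62, sigma=36, n=9, confidence=0.95, reps=5000)
print(f"share of intervals containing the mean: {rate:.3f}")
```

The printed share should come out close to .95, and one minus that share corresponds to the significance level.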
3. A Confidence Interval for $\mu$ When $\sigma$ is not known.
$\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\, s_{\bar{x}}$. This is what you actually use most of the time! All that "$\sigma$ unknown" means is that we do not have a value of the population variance. If you only have the sample variance, use the t table.
Finding degrees of freedom is easy. In most of the problems that we do, the number of degrees of freedom is one less than the sample size, or $n - 1$. The value of $t$ that you need should be in Table 18 in the Syllabus Supplement. Each row represents the number of degrees of freedom given by the 'df' column. It is a good idea to take a ruler and put a line across the table after every 10th row. Note that the table skips values after 100 degrees of freedom, but a good guess is always possible, for example $t^{(110)}_{.05} = 1.659$.
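One way to make the "good guess" is linear interpolation between the two nearest rows of the table. This is a sketch only: it assumes the neighbouring rows are 100 and 120 degrees of freedom with the commonly tabulated .05 values of 1.660 and 1.658, which may not match the layout of Table 18, and the function name `interpolate_t` is made up.

```python
def interpolate_t(df, df_low, t_low, df_high, t_high):
    """Linear interpolation between two neighbouring rows of a t table."""
    weight = (df - df_low) / (df_high - df_low)
    return t_low + weight * (t_high - t_low)

# Assumed table rows for the .05 column: 100 df -> 1.660, 120 df -> 1.658.
guess = interpolate_t(110, 100, 1.660, 120, 1.658)
print(round(guess, 3))   # 1.659
```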
"The variance is not known " implies no previous knowledge or assumption about the value of the
population variance, $\sigma^2$. Knowing the sample variance, $s^2$, is having a good guess as to what the variance is; it is not the same as knowing the variance. If the population distribution is normal or approximately normal, the formula for a two-sided confidence interval for the mean is $\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\, s_{\bar{x}}$, where $s_{\bar{x}} = \frac{s}{\sqrt{n}}$. Note: If $n > .05N$, use $s_{\bar{x}} = \frac{s}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$. Be careful: it is a common error to think that a new population size is actually a sample size.
Note: this is the more common case. If you do not know the population variance and the sample size is not very large, using $z$ instead of $t$ is a very bad idea.
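The choice between the two standard error formulas can be written as a small helper. This is a sketch: the function name `standard_error` and the numbers in the calls (s = 20, n = 25, N = 200 or 10000) are invented for illustration and do not come from the notes.

```python
from math import sqrt

def standard_error(s, n, N=None):
    """s / sqrt(n), times the finite population correction when n > .05 N."""
    se = s / sqrt(n)
    if N is not None and n > 0.05 * N:
        se *= sqrt((N - n) / (N - 1))    # finite population correction
    return se

print(round(standard_error(20, 25), 3))            # no population size given
print(round(standard_error(20, 25, N=200), 3))     # 25 > .05(200): correction applies
print(round(standard_error(20, 25, N=10000), 3))   # 25 < .05(10000): no correction
```

Note that `N` is the population size and `n` the sample size; swapping them is exactly the error warned about above.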
Example 1: We have a random sample of 10 homes. The sample mean of expenditures on maintenance
is $838 with a sample standard deviation of $110. Construct a 95% confidence interval for the mean.
Step 1: State the confidence level and significance level. The given confidence level of 95% represents the probability that the interval actually contains the mean and is stated as $1 - \alpha = .95$. The significance level of 5% represents the probability of being wrong and is $\alpha = .05$.
Step 2: Find the appropriate value of t. Use Table 18 in the Syllabus Supplement to find $t^{(n-1)}_{\alpha/2} = t^{(9)}_{.025} = 2.262$ (the number in the .025 column and the 9th row). Note that higher confidence levels and lower numbers of degrees of freedom give larger values of $t$, and thus larger confidence intervals.
Step 3: Find the standard error. $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{110}{\sqrt{10}} = 34.79$.
Step 4: Put it together. $\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\, s_{\bar{x}} = 838 \pm 2.262(34.79) = 838 \pm 78.7$. The result can be written $P(759.3 \le \mu \le 916.7) = .95$. Or make a 'Normal' curve with 838 in the middle and 759.3 and 916.7 on the sides. Label the area between 759.3 and 916.7 with 95%, the area below 759.3 with 2.5% and the area above 916.7 with 2.5%.
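Example 1 can be verified in a few lines. A sketch only: the function name `t_confidence_interval` is made up, and the t value is passed in by hand from the table (2.262 for 9 degrees of freedom) because the Python standard library has no t distribution. The second call uses z = 1.960 on purpose to show what goes wrong when z replaces t in a small sample.

```python
from math import sqrt

def t_confidence_interval(xbar, s, n, t_value):
    """Two-sided interval for the mean; t_value is read from a t table (n - 1 df)."""
    std_error = s / sqrt(n)
    margin = t_value * std_error
    return xbar - margin, xbar + margin

low, high = t_confidence_interval(xbar=838, s=110, n=10, t_value=2.262)
print(f"{low:.1f} to {high:.1f}")   # 759.3 to 916.7

# Same data with z = 1.960 used by mistake: the interval comes out too narrow.
z_low, z_high = t_confidence_interval(xbar=838, s=110, n=10, t_value=1.960)
print(f"{z_low:.1f} to {z_high:.1f}")
```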
Example 2: Find a 98% confidence interval for the mean when $\bar{x} = 22$, $n = 100$ and $s_x = 11$.
Step 1: Confidence level is 98%, so that the significance level is $\alpha = .02$.
Step 2: Since $s_x$ is a sample standard deviation, use $t^{(n-1)}_{\alpha/2} = t^{(99)}_{.01} = 2.364$.
Step 3: $s_{\bar{x}} = \frac{s_x}{\sqrt{n}} = \frac{11}{\sqrt{100}} = 1.1$.
Step 4: $\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\, s_{\bar{x}} = 22 \pm 2.364(1.1) = 22 \pm 2.60$. You should express this as an interval.
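Turning the "plus or minus" form into an interval is just one subtraction and one addition. A minimal sketch, with a helper name (`as_interval`) that is not from the notes:

```python
def as_interval(center, margin):
    """Convert 'center plus or minus margin' into (lower, upper) endpoints."""
    return center - margin, center + margin

low, high = as_interval(22, 2.364 * 1.1)
print(f"P({low:.2f} <= mu <= {high:.2f}) = .98")
```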
Example 3: We visit a town of 5000 families. We take a sample of 900 families and find a sample mean of $8536 and a sample standard deviation of $436. Find a 90% confidence interval for the mean. $\bar{x} = 8536$, $s_x = 436$, $n = 900$ and $N = 5000$.
Step 1: Confidence level is 90%, so that the significance level is $\alpha = .10$.
Step 2: Since the degrees of freedom are $n - 1 = 899$, we run off the table. If the degrees of freedom are much over 200, use the value from the infinity line. $t^{(n-1)}_{\alpha/2} = t^{(899)}_{.05} = 1.645$.
Step 3: This is the big change. Since the sample is more than 5% of the population, use the finite population correction. $s_{\bar{x}} = \frac{s_x}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} = \frac{436}{\sqrt{900}} \sqrt{\frac{4100}{4999}} = 14.5333(0.9056) = 13.16$. Note that the smaller the population, the more the finite population correction will shrink the standard error.
Step 4: $\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\, s_{\bar{x}} = 8536 \pm 1.645(13.16) = 8536 \pm 21.65$. You should express this as an interval.
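The arithmetic of Example 3 can be reproduced as follows. This is a sketch with made-up variable names; the t value of 1.645 is typed in from the infinity line of the table rather than computed.

```python
from math import sqrt

s, n, N = 436, 900, 5000
xbar, t_value = 8536, 1.645          # t from the infinity line of the t table

uncorrected = s / sqrt(n)            # standard error before the correction
fpc = sqrt((N - n) / (N - 1))        # finite population correction, since n > .05 N
std_error = uncorrected * fpc

margin = t_value * std_error
print(f"uncorrected {uncorrected:.4f}, fpc {fpc:.4f}, corrected {std_error:.2f}")
print(f"{xbar - margin:.2f} to {xbar + margin:.2f}")
```

Trying a smaller `N` (with `n` unchanged) shows the correction factor moving further below 1, which is the shrinking effect noted in Step 3.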
4. A Confidence Interval for a Proportion
See 251 proport. For other confidence intervals see Table 3. “Formulas for Confidence Intervals and
Hypothesis Tests” at http://courses.wcupa.edu/rbove/eco252/252form.doc.
©2002 Roger Even Bove