Estimation of µ

In order to facilitate our discussion of using sample information to estimate µ, we first need to determine the distribution of the random variables

\[
X_1 = \text{the first value sampled},\quad X_2 = \text{the second value sampled},\quad \dots,\quad X_n = \text{the $n$th value sampled}.
\]

To illustrate, suppose we have the following population:

    x      0    1    2
    p(x)   .3   .4   .3

You should confirm that µ = 1 and σ² = 0.6.

Describe the distribution of $X_1$. What values can it assume? What are the probabilities associated with these values? What is $E[X_1]$? $\mathrm{Var}(X_1)$?

Describe the distribution of $X_8$. What values can it assume? What are the probabilities associated with these values? What is $E[X_8]$? $\mathrm{Var}(X_8)$?

Estimators and Estimates

An estimator is a rule which gives an estimate of a parameter. (What is a parameter?) An estimate is the actual number you get by using your estimator. Parameters and estimates are numbers. Estimators are not numbers; they are rules (functions of the sample, and hence random variables).

Example: Two possible estimators for µ when taking a sample of size 4 from an infinite population are

\[
\bar{X} = \frac{X_1 + X_2 + X_3 + X_4}{4}
\qquad\text{and}\qquad
Y = \frac{X_1 + 2X_2 + X_3}{4},
\]

where $X_i$ = the $i$th value sampled. Which is better? How can we decide?

We already know that $E[\bar{X}] = \mu$. In addition,

\begin{align*}
E[Y] &= E\!\left[\frac{X_1 + 2X_2 + X_3}{4}\right] \\
     &= \tfrac{1}{4}\,E[X_1 + 2X_2 + X_3] \\
     &= \tfrac{1}{4}\,\{E[X_1] + E[2X_2] + E[X_3]\} \\
     &= \tfrac{1}{4}\,\{E[X_1] + 2E[X_2] + E[X_3]\} \\
     &= \tfrac{1}{4}\,\{\mu + 2\mu + \mu\} \\
     &= \tfrac{1}{4}\,(4\mu) = \mu.
\end{align*}

Both estimators are said to be unbiased.

Definition: An estimator $\hat{\theta}$ of a parameter θ is said to be unbiased if and only if $E[\hat{\theta}] = \theta$.

Returning to our example: if the population we are sampling from has values that follow a normal distribution, then both of our estimators $\bar{X}$ and $Y$ have normal distributions under repeated sampling. Stated another way, their sampling distributions will be normal. How, then, should we decide between these estimators? Consider the following diagram.

[Diagram: two normal sampling distributions with the same mean µ but different standard deviations.]

We have two normal distributions, each of which is completely characterized by its mean and variance (or standard deviation). Both distributions have the same mean, so the only possible difference is their standard deviations. Imagine for a moment that the distributions are accurately graphed. Which estimator would you use? Why?

We already know that $\mathrm{Var}(\bar{X}) = \sigma^2/n = \sigma^2/4$. (What is σ²?) In addition,

\begin{align*}
\mathrm{Var}(Y) &= \mathrm{Var}\!\left(\frac{X_1 + 2X_2 + X_3}{4}\right) \\
  &= \tfrac{1}{16}\,\mathrm{Var}(X_1 + 2X_2 + X_3) \\
  &= \tfrac{1}{16}\,\{\mathrm{Var}(X_1) + \mathrm{Var}(2X_2) + \mathrm{Var}(X_3)\} \\
  &= \tfrac{1}{16}\,\{\mathrm{Var}(X_1) + 4\,\mathrm{Var}(X_2) + \mathrm{Var}(X_3)\} \\
  &= \tfrac{1}{16}\,\{\sigma^2 + 4\sigma^2 + \sigma^2\} \\
  &= \tfrac{6\sigma^2}{16} = \tfrac{3\sigma^2}{8},
\end{align*}

which is larger than $\mathrm{Var}(\bar{X}) = \sigma^2/4$.

It is easy to cook up estimators with smaller variances than $\bar{X}$. For example,

\[
W = \frac{X_1 + X_2 + X_3 + X_4}{5}
\]

also has a normal distribution if the underlying population is normal, and

\[
\mathrm{Var}(W) = \frac{4\sigma^2}{25},
\]

which is smaller than σ²/4. But what is "wrong" with W? (You should confirm that $E[W] = 4\mu/5 \neq \mu$.) It turns out that among all unbiased estimators that take a weighted average of the sample values, $\bar{X}$ is the best in the sense that it has the smallest variance.

Remark: Recall that we computed the sample variance using the formula

\[
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}.
\]

There are several reasons why one might use $n-1$ in the denominator of $s^2$, not the least of which is that

\[
E[s^2] = \sigma^2;
\]

that is, $s^2$ becomes an unbiased estimator of σ² when we divide by $n-1$.
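The claims in this section are easy to check by simulation. As a first check, consider the exercise about the individual draws: a minimal sketch, not part of the original notes, assuming sampling with replacement from the 0/1/2 population above and the numpy library. Every draw $X_i$, whether the first or the eighth, has the population's own distribution:

```python
# Sketch: each sampled value X_i follows the population distribution itself.
import numpy as np

rng = np.random.default_rng(0)
values, probs = np.array([0, 1, 2]), np.array([0.3, 0.4, 0.3])

samples = rng.choice(values, size=(200_000, 8), p=probs)  # many samples of size 8
for i in (0, 7):                                          # look at X1 and X8
    draw = samples[:, i]
    freqs = [(draw == v).mean() for v in values]
    print(f"X{i+1}: freqs ~ {np.round(freqs, 3)}, "
          f"mean ~ {draw.mean():.3f}, var ~ {draw.var():.3f}")
# Expected: freqs ~ [.3 .4 .3], mean ~ 1 (= mu), var ~ 0.6 (= sigma^2) for both draws
```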
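Next, the comparison of $\bar{X}$, $Y$, and $W$. This sketch, under the same assumptions (the variable names xbar, y, and w are illustrative only), estimates each estimator's mean and variance under repeated sampling of size 4:

```python
# Sketch: sampling distributions of the three estimators from the example.
import numpy as np

rng = np.random.default_rng(0)
values, probs = np.array([0, 1, 2]), np.array([0.3, 0.4, 0.3])
X = rng.choice(values, size=(200_000, 4), p=probs)  # columns are X1..X4

xbar = X.mean(axis=1)                     # (X1 + X2 + X3 + X4) / 4
y = (X[:, 0] + 2 * X[:, 1] + X[:, 2]) / 4
w = X.sum(axis=1) / 5

# Means: xbar and y ~ 1 = mu (unbiased); w ~ 0.8 = 4*mu/5 (biased)
print(xbar.mean(), y.mean(), w.mean())
# Variances: sigma^2/4 = 0.15, 3*sigma^2/8 = 0.225, 4*sigma^2/25 = 0.096
print(xbar.var(), y.var(), w.var())
```

Both $\bar{X}$ and $Y$ center on µ = 1, but $Y$ is more spread out; $W$ has the smallest spread yet centers on 0.8 rather than µ, which is exactly what is "wrong" with it.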
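Finally, the remark about the $n-1$ divisor can be checked the same way. A minimal sketch under the same assumptions, comparing the $n-1$ divisor with the naive $n$ divisor:

```python
# Sketch: dividing by n-1 makes s^2 unbiased for sigma^2; dividing by n does not.
import numpy as np

rng = np.random.default_rng(0)
values, probs = np.array([0, 1, 2]), np.array([0.3, 0.4, 0.3])
samples = rng.choice(values, size=(200_000, 4), p=probs)  # samples of size n = 4

s2_unbiased = samples.var(axis=1, ddof=1)  # divide by n-1
s2_biased = samples.var(axis=1, ddof=0)    # divide by n

print(s2_unbiased.mean())  # ~ 0.60 = sigma^2
print(s2_biased.mean())    # ~ 0.45 = (n-1)/n * sigma^2, systematically too small
```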