2WS30 Mathematical Statistics – Parameter Estimation (Performance Characteristics)

Criteria for Evaluating Estimators

So far we have proposed a number of ways of constructing estimators, but have not yet decided what makes a good estimator. Intuitively, a good estimator θ̂ is one that is close to the true parameter θ in some sense. To formalize this notion we need to state a number of properties that might be desirable for an estimator to have, such as unbiasedness, small mean squared error, low variance, consistency, and asymptotic efficiency. In what follows we mostly assume θ is an unknown scalar parameter and θ̂ = θ̂(X_1, ..., X_n) is an estimator of θ.

Measures of Error

Definition (Mean Squared Error): MSE(θ̂) = E[(θ̂ − θ)²].

Clearly, if this is zero then the estimator is perfect. There are of course other ways of measuring the error of an estimator, e.g. the mean absolute error E[|θ̂ − θ|]. We will focus mainly on the MSE because it makes our life easy, but it is worth pointing out that sometimes it is not the most adequate error metric!

Bias-Variance Decomposition

Definition (Bias and Variance): bias(θ̂) = E[θ̂] − θ and var(θ̂) = E[(θ̂ − E[θ̂])²].

Definition (Unbiased Estimator): θ̂ is unbiased if E[θ̂] = θ for every value of θ, that is, if bias(θ̂) = 0.

For unbiased estimators the value of the estimate is "centered" around the true unknown parameter.

The Importance/Irrelevance of the Bias

Bottom line: unbiasedness is a desirable, but not an essential, property for a good estimator to have. It turns out the mean squared error can easily be written in terms of bias and variance.

Theorem (Bias-Variance Decomposition): MSE(θ̂) = bias(θ̂)² + var(θ̂). (In the proof, note that E[θ̂] − θ is not random, and the cross term has expectation 0.)

Example – Sample Mean

For X_1, ..., X_n i.i.d. with mean μ and variance σ², the sample mean X̄ = (1/n) Σ X_i satisfies E[X̄] = μ, therefore this estimator is unbiased! Moreover var(X̄) = σ²/n, so the variance gets smaller and smaller as we have more data.

Example – Variance Estimator

Consider the estimator σ̂² = (1/n) Σ (X_i − X̄)². One can show that E[σ̂²] = ((n−1)/n) σ², so this estimator is biased. However, the bias, −σ²/n, gets smaller as the sample size increases. We can easily remove the bias simply by multiplying the estimator by n/(n−1), giving rise to our familiar

Definition (Sample Variance): S² = (1/(n−1)) Σ (X_i − X̄)².

Biased vs. Unbiased Estimators

So far we have derived two estimators for the variance – one biased and one unbiased. Which one should we prefer? To answer this question we might look at the MSE of each estimator, and for that we need to make further assumptions. Suppose the data is actually a sample from a normal distribution. Then, in terms of MSE, the biased estimator is better (but not by much)! In other examples, however, and unlike the normal data case, correcting for the bias can be very advantageous. So it all depends on the balance between bias and variance (see the simulation sketch below).

What's the Best Possible Estimator?

So far we have characterized the MSE performance of concrete estimators and saw that some are better than others, so it makes sense to talk about the "best" estimator. As discussed before, unbiasedness might not be an essential property, but for now let's focus only on unbiased estimators, so that the MSE equals the variance.

Definition (Uniformly Minimum Variance Unbiased Estimator): an unbiased estimator θ̂ is a UMVUE if var(θ̂) ≤ var(θ̃) for every unbiased estimator θ̃ and every value of θ. In other words, a UMVUE is the best among the class of all unbiased estimators.

Why did we not consider uniformly minimum MSE estimators? Let's see why this is NOT a good idea: for any fixed value θ_0, the constant estimator θ̂ ≡ θ_0 has zero MSE when θ = θ_0, so a uniformly minimum MSE estimator would need zero MSE for every value of θ simultaneously. Clearly this is not possible, and so this definition is rather useless. This is why we will focus on unbiased estimators for now.

How can we find a UMVUE, or prove that a certain estimator is a UMVUE?
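Returning to the biased vs. unbiased variance estimators: below is a minimal Monte Carlo sketch (not part of the original slides) that compares the two for normal data, assuming for illustration μ = 0, σ² = 1, a sample size of n = 10 and 200,000 repetitions. It estimates the bias, variance and MSE of each estimator and checks the bias-variance decomposition numerically.

```python
import numpy as np

rng = np.random.default_rng(0)

n, sigma2, n_reps = 10, 1.0, 200_000  # sample size, true variance, Monte Carlo repetitions

# n_reps samples of size n from a normal distribution with mean 0 and variance sigma2
x = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(n_reps, n))

# Biased estimator (divide by n) and the unbiased sample variance S^2 (divide by n-1)
var_biased = x.var(axis=1, ddof=0)
var_unbiased = x.var(axis=1, ddof=1)

for name, est in [("biased, 1/n", var_biased), ("unbiased S^2, 1/(n-1)", var_unbiased)]:
    bias = est.mean() - sigma2
    variance = est.var()
    mse = np.mean((est - sigma2) ** 2)
    # Bias-variance decomposition: MSE = bias^2 + variance
    print(f"{name:22s}  bias={bias:+.4f}  var={variance:.4f}  "
          f"MSE={mse:.4f}  bias^2+var={bias**2 + variance:.4f}")
```

Under these assumptions the theory gives MSE = (2n−1)σ⁴/n² = 0.19 for the biased estimator and 2σ⁴/(n−1) ≈ 0.22 for S², so the output should show the biased estimator winning by a small margin, in line with the comparison above.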
Finding a UMVUE seems difficult, but fortunately we can obtain a lower bound on the variance of any unbiased estimator. If this lower bound is attained, then we know we have found a UMVUE.

Cramér-Rao Lower Bound

Theorem (Cramér-Rao Lower Bound): if θ̂ is an unbiased estimator of θ, then var(θ̂) ≥ 1/I(θ), where I(θ) is the Fisher information of the sample.*

* There are also some mild technical assumptions that are needed, but let's skip those for now.

Fisher Information

Definition (Fisher Information): I(θ) = E[(∂/∂θ log f(X; θ))²] = −E[∂²/∂θ² log f(X; θ)].

The second expression has an appealing geometric interpretation: the Fisher information is related to the curvature of the log-likelihood function around the true parameter value. If the log-likelihood is very "flat", it is hard to estimate the parameter from data.

The Fisher information has a nice form in the random-sample scenario: for X_1, ..., X_n i.i.d., the information in the whole sample is I_n(θ) = n I_1(θ), where I_1(θ) is the information in a single observation, so the Cramér-Rao bound becomes 1/(n I_1(θ)).

Meaning of the Cramér-Rao Bound

Is the Cramér-Rao lower bound always achievable? NO. Are there estimators with lower MSE than that of the best unbiased estimator? YES. In a nutshell: for unbiased estimators the bound might not be achievable, and even if it is achieved there might exist (biased) estimators with lower MSE than what the bound predicts. Let's see examples of those situations, as well as examples where the Cramér-Rao bound is achieved (a numerical sketch of a case where the bound is attained appears at the end of this section).

Examples

Is S² the UMVUE of σ²? At this point we cannot say. The examples also show that, in terms of MSE performance, the choice of parameterization makes a difference in regard to what a "good" estimator is.

Cramér-Rao Lower Bound – Proof

Theorem (Cramér-Rao Lower Bound): if θ̂ is an unbiased estimator of θ, then var(θ̂) ≥ 1/I(θ).

What's Next?

From the previous theorem it is clear that the likelihood function plays a crucial role in the theory of parameter estimation. Furthermore, it seems that all you need to build good estimators is an adequate "summary" statistic (e.g., in the election example you don't need to know the voting intention of every surveyed individual, but rather how many of them would vote for one of the candidates). We will see that these ideas can be formalized, and that they give constructive ways to develop good estimators.
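As a concrete illustration of the Cramér-Rao bound (this example is not from the slides; the normal-mean setup, the parameter values and the helper function loglik are assumptions chosen for illustration), here is a short Python sketch for estimating the mean μ of a normal distribution with known variance σ². Here I_1(μ) = 1/σ², the bound is σ²/n, and the sample mean attains it; the last few lines also check the curvature interpretation of the Fisher information numerically.

```python
import numpy as np

rng = np.random.default_rng(1)

mu, sigma2, n, n_reps = 2.0, 4.0, 25, 200_000  # true mean, known variance, sample size, MC reps

# For X_1, ..., X_n i.i.d. N(mu, sigma2) with sigma2 known, the per-observation Fisher
# information is I_1(mu) = 1/sigma2, so I_n(mu) = n/sigma2 and the Cramer-Rao bound is sigma2/n.
crlb = sigma2 / n

# Monte Carlo variance of the sample mean, an unbiased estimator of mu
x = rng.normal(mu, np.sqrt(sigma2), size=(n_reps, n))
xbar = x.mean(axis=1)
print(f"Cramer-Rao lower bound : {crlb:.4f}")
print(f"variance of sample mean: {xbar.var():.4f}")  # should match the bound: X-bar attains it

# Curvature interpretation: I(mu) = -E[d^2/dmu^2 log f(X; mu)].  For the normal log-likelihood
# the second derivative is exactly -n/sigma2, so a numerical second difference of the
# log-likelihood of one sample, divided by -n, should be close to 1/sigma2 = I_1(mu).
def loglik(m, sample):
    return -0.5 * np.sum((sample - m) ** 2) / sigma2 - 0.5 * len(sample) * np.log(2 * np.pi * sigma2)

h = 1e-4
s = x[0]
curvature = (loglik(mu + h, s) - 2 * loglik(mu, s) + loglik(mu - h, s)) / h**2
print(f"numerical I_1(mu): {-curvature / n:.4f}   (theory: {1 / sigma2:.4f})")
```

This is one of the situations mentioned above where the bound is achieved by an unbiased estimator; whether a given estimator attains it in other problems (such as S² for σ²) is exactly the kind of question the examples in the slides address.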