Statistics 512 Notes 16: Efficiency of Estimators and the Asymptotic Efficiency of the MLE

Method of moments estimator: Let $X_1, \ldots, X_n$ be iid $f(x; \theta)$, $\theta \in \Omega$. Find $E(X_i) = h(\theta)$. The method of moments estimator is $\hat{\theta}_{MOM} = h^{-1}(\bar{X})$.

Examples:
(1) $X_1, \ldots, X_n$ iid uniform$(0, \theta)$. $E(X_i) = \theta/2$, so $\hat{\theta}_{MOM} = 2\bar{X}$.
(2) $X_1, \ldots, X_n$ iid logistic distribution,
$$f(x; \theta) = \frac{\exp\{-(x-\theta)\}}{(1 + \exp\{-(x-\theta)\})^2}, \quad -\infty < x < \infty, \ -\infty < \theta < \infty.$$
$E(X_i) = \theta$, so $\hat{\theta}_{MOM} = \bar{X}$. The maximum likelihood estimator $\hat{\theta}_{MLE}$ solves
$$\sum_{i=1}^n \frac{\exp\{-(X_i - \theta)\}}{1 + \exp\{-(X_i - \theta)\}} = \frac{n}{2}.$$

Efficiency of estimators: A good criterion for comparing estimators is the mean squared error:
$$MSE(\hat{\theta}) = E(\hat{\theta} - \theta)^2 = \{Bias(\hat{\theta})\}^2 + Var(\hat{\theta}).$$
For unbiased estimators, $MSE(\hat{\theta}) = Var(\hat{\theta})$.

Relative efficiency of two unbiased estimators: Let $W_1$ and $W_2$ be two unbiased estimators of $\theta$ with variances $Var(W_1)$ and $Var(W_2)$ respectively. We call $W_1$ more efficient than $W_2$ if $Var(W_1) < Var(W_2)$. The relative efficiency of $W_1$ with respect to $W_2$ is $Var(W_2)/Var(W_1)$.

Rao-Cramer Lower Bound: The concept of relative efficiency provides a working criterion for choosing between two competing estimators, but it does not give us any assurance that even the better of $W_1$ and $W_2$ is any good. How do we know that there isn't an unbiased estimator $W_3$ which is better than both $W_1$ and $W_2$? The Rao-Cramer lower bound provides a partial answer to this question in the form of a lower bound.

Theorem 6.2.1 (Rao-Cramer Lower Bound): Let $X_1, \ldots, X_n$ be iid with pdf $f(x; \theta)$ for $\theta \in \Omega$. Assume that the regularity conditions (R0)-(R4) hold. Let $Y = u(X_1, \ldots, X_n)$ be a statistic with mean $E(Y) = E[u(X_1, \ldots, X_n)] = k(\theta)$. Then
$$Var(Y) \geq \frac{[k'(\theta)]^2}{nI(\theta)}.$$

Note (Corollary 6.2.1): If $Y = u(X_1, \ldots, X_n)$ is an unbiased estimator of $\theta$, then $E(Y) = E[u(X_1, \ldots, X_n)] = k(\theta) = \theta$, so that $k'(\theta) = 1$. Thus for unbiased estimators $Y = u(X_1, \ldots, X_n)$, there is a lower bound on the variance:
$$Var(Y) \geq \frac{1}{nI(\theta)}.$$

Proof: The proof of this theorem is a clever application of the Cauchy-Schwarz inequality or, stated statistically, the fact that for any two random variables $V$ and $W$,
$$[Cov(V, W)]^2 \leq (Var\, V)(Var\, W). \quad (*)$$
If we rearrange (*), we get a lower bound on the variance of $V$:
$$Var\, V \geq \frac{[Cov(V, W)]^2}{Var\, W}. \quad (**)$$
The cleverness in this theorem follows from choosing $V$ to be the estimator $Y = u(X_1, \ldots, X_n)$ and $W$ to be the quantity $\frac{\partial}{\partial \theta} \log f(X_1, \ldots, X_n; \theta)$, and applying the Cauchy-Schwarz inequality.

First, we calculate $Cov\left(u(X_1, \ldots, X_n), \frac{\partial}{\partial \theta} \log f(X_1, \ldots, X_n; \theta)\right)$. We have
$$
\begin{aligned}
E\left[u(X_1, \ldots, X_n)\, \frac{\partial}{\partial \theta} \log f(X_1, \ldots, X_n; \theta)\right]
&= \int \cdots \int u(x_1, \ldots, x_n)\, \frac{\frac{\partial}{\partial \theta} f(x_1, \ldots, x_n; \theta)}{f(x_1, \ldots, x_n; \theta)}\, f(x_1; \theta) \cdots f(x_n; \theta)\, dx_1 \cdots dx_n \\
&= \int \cdots \int u(x_1, \ldots, x_n)\, \frac{\partial}{\partial \theta}\left[f(x_1; \theta) \cdots f(x_n; \theta)\right] dx_1 \cdots dx_n \\
&= \frac{\partial}{\partial \theta} \int \cdots \int u(x_1, \ldots, x_n)\, f(x_1; \theta) \cdots f(x_n; \theta)\, dx_1 \cdots dx_n \\
&= \frac{\partial}{\partial \theta} E[u(X_1, \ldots, X_n)] = k'(\theta).
\end{aligned}
$$
Also we have
$$
\begin{aligned}
E\left[\frac{\partial}{\partial \theta} \log f(X_1, \ldots, X_n; \theta)\right]
&= \int \cdots \int \frac{\partial}{\partial \theta} \log f(x_1, \ldots, x_n; \theta)\, f(x_1, \ldots, x_n; \theta)\, dx_1 \cdots dx_n \\
&= \int \cdots \int \frac{\frac{\partial}{\partial \theta} f(x_1, \ldots, x_n; \theta)}{f(x_1, \ldots, x_n; \theta)}\, f(x_1, \ldots, x_n; \theta)\, dx_1 \cdots dx_n \\
&= \frac{\partial}{\partial \theta} \int \cdots \int f(x_1, \ldots, x_n; \theta)\, dx_1 \cdots dx_n = 0.
\end{aligned}
$$
Thus,
$$Cov\left(u(X_1, \ldots, X_n), \frac{\partial}{\partial \theta} \log f(X_1, \ldots, X_n; \theta)\right) = k'(\theta).$$

Finally, we calculate
$$
\begin{aligned}
Var\left[\frac{\partial}{\partial \theta} \log f(X_1, \ldots, X_n; \theta)\right]
&= Var\left[\sum_{i=1}^n \frac{\partial}{\partial \theta} \log f(X_i; \theta)\right]
= \sum_{i=1}^n Var\left[\frac{\partial}{\partial \theta} \log f(X_i; \theta)\right] \\
&= n\left( E\left[\left(\frac{\partial}{\partial \theta} \log f(X_1; \theta)\right)^2\right] - \left\{E\left[\frac{\partial}{\partial \theta} \log f(X_1; \theta)\right]\right\}^2 \right)
= n(I(\theta) - 0) = nI(\theta).
\end{aligned}
$$
Thus, using (**), we conclude that
$$Var(Y) \geq \frac{[k'(\theta)]^2}{nI(\theta)}.$$

Example: Let $X_1, \ldots, X_n$ be iid Poisson($\theta$). On your homework, you should have found that $\hat{\theta}_{MLE} = \bar{X}$ and $I(\theta) = \frac{1}{\theta}$. From the properties of the Poisson distribution, we know that $E(\bar{X}) = \theta$ and $Var(\bar{X}) = \frac{\theta}{n}$. The Rao-Cramer lower bound for the variance of an unbiased estimator is
$$Var(Y) \geq \frac{1}{nI(\theta)} = \frac{1}{n \cdot \frac{1}{\theta}} = \frac{\theta}{n}.$$
Thus, the maximum likelihood estimator is efficient.
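As a quick numerical check of the Poisson example (not part of the original notes), the following Python sketch simulates repeated Poisson samples, computes the MLE $\bar{X}$ in each replication, and compares its Monte Carlo variance to the Rao-Cramer lower bound $\theta/n$. The values of $\theta$, the sample size, and the number of replications are arbitrary illustrative choices.

```python
import numpy as np

# Illustrative check: for Poisson(theta), the MLE is Xbar and its variance
# theta/n attains the Rao-Cramer lower bound 1/(n*I(theta)) = theta/n.
# The values of theta, n, and reps below are arbitrary choices.
rng = np.random.default_rng(0)
theta, n, reps = 3.0, 50, 100_000

samples = rng.poisson(theta, size=(reps, n))  # reps independent samples of size n
mle = samples.mean(axis=1)                    # MLE of theta in each replication

print("Monte Carlo variance of the MLE:", mle.var())
print("Rao-Cramer lower bound theta/n: ", theta / n)
```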
The Rao-Cramer lower bound might not be achieved by any unbiased estimator.

Asymptotic Optimality of MLE
The maximum likelihood estimator is consistent, so its bias converges to 0 as $n \rightarrow \infty$. Example 6.2.4 shows that the maximum likelihood estimator may not achieve the Rao-Cramer lower bound for finite samples. Under the regularity conditions assumed in Theorem 6.2.2,
$$\sqrt{n}\left(\hat{\theta}_{MLE} - \theta_0\right) \xrightarrow{D} N\left(0, \frac{1}{I(\theta_0)}\right).$$
Informally, Theorem 6.2.2 and its corollary say that the distribution of the MLE can be approximated by
$$N\left(\theta_0, \frac{1}{nI(\hat{\theta}_{MLE})}\right).$$
Thus, the MLE is asymptotically unbiased and has asymptotic variance equal to the Rao-Cramer lower bound. In this sense, the MLE is as efficient as any other estimator for large samples. For large enough samples, the MLE is the optimal estimator.

Monte Carlo comparison of MSE for maximum likelihood vs. method of moments for the logistic distribution.
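The original notes presumably continued with code or output for this comparison, which is not reproduced here. Below is a minimal Python sketch of such a Monte Carlo study, assuming a logistic scale parameter of 1 and solving the score equation given earlier numerically with scipy's brentq root finder; the true value $\theta_0$, the sample size, and the number of replications are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(1)
theta0, n, reps = 0.0, 25, 5000   # assumed true value, sample size, replications

def score_shifted(theta, x):
    # Left-hand side of the score equation from the notes minus n/2:
    # sum_i exp{-(x_i - theta)} / (1 + exp{-(x_i - theta)}) - n/2.
    # This is increasing in theta, so a sign-changing bracket works for brentq.
    e = np.exp(-(x - theta))
    return np.sum(e / (1.0 + e)) - len(x) / 2.0

mom_sqerr, mle_sqerr = [], []
for _ in range(reps):
    x = rng.logistic(loc=theta0, scale=1.0, size=n)
    mom = x.mean()                                               # method of moments: sample mean
    mle = brentq(score_shifted, x.min() - 10, x.max() + 10, args=(x,))  # solve the score equation
    mom_sqerr.append((mom - theta0) ** 2)
    mle_sqerr.append((mle - theta0) ** 2)

print("MSE, method of moments:", np.mean(mom_sqerr))
print("MSE, maximum likelihood:", np.mean(mle_sqerr))
```

For the logistic location parameter with scale 1, $I(\theta) = 1/3$, so the asymptotic variance of the MLE is $3/n$, while $Var(\bar{X}) = \pi^2/(3n) \approx 3.29/n$; the simulation should therefore show a correspondingly smaller MSE for the MLE than for the method of moments estimator.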