Statistics 512 Notes 13: Properties of Maximum Likelihood Estimates

The likelihood function is
$$L(\theta) = L(\theta; X_1, \ldots, X_n) = f(X_1, \ldots, X_n; \theta),$$
and the maximum likelihood estimate maximizes it, or equivalently maximizes the log likelihood:
$$\hat{\theta}_{MLE} = \arg\max_{\theta} L(\theta; X_1, \ldots, X_n) = \arg\max_{\theta} l(\theta; X_1, \ldots, X_n).$$

Good properties of maximum likelihood estimates:
(1) Invariance
(2) Consistency
(3) Asymptotic Normality
(4) Efficiency

Invariance (Theorem 6.1.2): Let $X_1, \ldots, X_n$ be iid with pdf $f(x; \theta)$, $\theta \in \Omega$. For a specified function $g$, let $\eta = g(\theta)$ be a parameter of interest. Suppose $\hat{\theta}$ is the MLE of $\theta$. Then $g(\hat{\theta})$ is the MLE of $g(\theta)$.

Proof: For each $\eta$ in the range of $g$, define the set
$$g^{-1}(\eta) = \{\theta : g(\theta) = \eta\}.$$
The maximum of $L(\theta)$ occurs at $\hat{\theta}$, and the domain of $g$ is $\Omega$, which contains $\hat{\theta}$. Hence $\hat{\theta}$ lies in one of these sets and, in fact, can lie in only one of them. Hence, to maximize $L(\theta)$, choose $\hat{\eta}$ so that $g^{-1}(\hat{\eta})$ is the unique set containing $\hat{\theta}$. Then $\hat{\eta} = g(\hat{\theta})$.

Example: $X_1, \ldots, X_n$ iid Bernoulli($p$). The large-sample confidence interval for $p$ is
$$\hat{p} \pm 1.96 \sqrt{\widehat{\mathrm{Var}}(\hat{p})}.$$
The MLE of $p$ is
$$\hat{p}_{MLE} = \frac{\sum_{i=1}^{n} X_i}{n}.$$
The variance of the MLE is
$$\mathrm{Var}(\hat{p}_{MLE}) = \frac{p(1-p)}{n}.$$
By invariance, the maximum likelihood estimate of $\mathrm{Var}(\hat{p}_{MLE})$ is
$$\frac{\hat{p}(1-\hat{p})}{n}.$$

Consistency: Consistency means that the MLE converges in probability to the true value. To proceed, we need a definition. If $f$ and $g$ are pdfs, define the Kullback-Leibler distance between $f$ and $g$ to be
$$D(f, g) = \int f(x) \log \frac{f(x)}{g(x)} \, dx.$$
It can be shown that $D(f, g) \ge 0$ and $D(f, f) = 0$. For any $\theta, \theta^* \in \Omega$, write $D(\theta, \theta^*)$ to mean $D(f(x; \theta), f(x; \theta^*))$. We say that the model is identifiable if $\theta \ne \theta^*$ implies that $D(\theta, \theta^*) > 0$. This means that different values of the parameter correspond to different distributions. We will assume that the model is identifiable.

Let $\theta_0$ denote the true value of $\theta$. Let $l_n(\theta)$ denote the log likelihood of $\theta$ based on an iid sample $X_1, \ldots, X_n$. Maximizing $l_n(\theta)$ is equivalent to maximizing
$$M_n(\theta) = \frac{1}{n} \sum_i \log \frac{f(X_i; \theta)}{f(X_i; \theta_0)}.$$
By the law of large numbers, $M_n(\theta)$ converges to
$$E_{\theta_0}\left[\log \frac{f(X_i; \theta)}{f(X_i; \theta_0)}\right] = \int \log \frac{f(x; \theta)}{f(x; \theta_0)} f(x; \theta_0) \, dx = -\int f(x; \theta_0) \log \frac{f(x; \theta_0)}{f(x; \theta)} \, dx = -D(\theta_0, \theta).$$
Hence, $M_n(\theta) \approx -D(\theta_0, \theta)$, which is maximized at $\theta_0$ since $D(\theta_0, \theta_0) = 0$ and $D(\theta_0, \theta) > 0$ for $\theta \ne \theta_0$. Therefore, we expect that the maximizer will tend to $\theta_0$. To prove this formally, we need more than the pointwise convergence $M_n(\theta) \xrightarrow{P} -D(\theta_0, \theta)$: we need this convergence to be uniform over $\theta$. We also have to make sure that the function $D(\theta_0, \theta)$ is well behaved. Here are the formal details.

Theorem: Let $\theta_0$ denote the true value of $\theta$. Define
$$M_n(\theta) = \frac{1}{n} \sum_i \log \frac{f(X_i; \theta)}{f(X_i; \theta_0)}$$
and $M(\theta) = -D(\theta_0, \theta)$. Suppose that
$$\sup_{\theta \in \Omega} |M_n(\theta) - M(\theta)| \xrightarrow{P} 0$$
and that for every $\epsilon > 0$,
$$\sup_{\theta : |\theta - \theta_0| \ge \epsilon} M(\theta) < M(\theta_0).$$
Let $\hat{\theta}_n$ denote the MLE. Then $\hat{\theta}_n \xrightarrow{P} \theta_0$.

Proof: Since $\hat{\theta}_n$ maximizes $M_n(\theta)$, we have $M_n(\hat{\theta}_n) \ge M_n(\theta_0)$. Hence,
$$0 \le M(\theta_0) - M(\hat{\theta}_n) = \left[M_n(\theta_0) - M(\hat{\theta}_n)\right] + \left[M(\theta_0) - M_n(\theta_0)\right] \le \left[M_n(\hat{\theta}_n) - M(\hat{\theta}_n)\right] + \left[M(\theta_0) - M_n(\theta_0)\right] \le \sup_{\theta} |M_n(\theta) - M(\theta)| + M(\theta_0) - M_n(\theta_0) \xrightarrow{P} 0,$$
where the last step follows from the assumption $\sup_{\theta} |M_n(\theta) - M(\theta)| \xrightarrow{P} 0$.

Pick any $\epsilon > 0$. By the assumption $\sup_{\theta : |\theta - \theta_0| \ge \epsilon} M(\theta) < M(\theta_0)$, there exists $\delta > 0$ such that $|\theta - \theta_0| \ge \epsilon$ implies that $M(\theta) < M(\theta_0) - \delta$. Hence,
$$P(|\hat{\theta}_n - \theta_0| \ge \epsilon) \le P(M(\hat{\theta}_n) < M(\theta_0) - \delta) \to 0,$$
since we showed above that $M(\theta_0) - M(\hat{\theta}_n) \xrightarrow{P} 0$.
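To make the Bernoulli example concrete, here is a minimal numerical sketch (not part of the original notes): it computes the MLE of $p$, the invariance-based plug-in estimate of $\mathrm{Var}(\hat{p})$, and the approximate 95% confidence interval. It assumes numpy is available; the sample size, true $p$, and seed are arbitrary illustrative choices.

```python
# A minimal sketch of the Bernoulli example above (illustrative only).
# Assumptions: numpy is available; n, p_true, and the seed are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
n, p_true = 200, 0.3
x = rng.binomial(1, p_true, size=n)   # iid Bernoulli(p) sample

p_hat = x.mean()                      # MLE: sum(X_i) / n
var_hat = p_hat * (1 - p_hat) / n     # by invariance, MLE of Var(p_hat) = p(1-p)/n
half_width = 1.96 * np.sqrt(var_hat)

print(f"p_hat = {p_hat:.3f}")
print(f"estimated Var(p_hat) = {var_hat:.5f}")
print(f"approximate 95% CI: ({p_hat - half_width:.3f}, {p_hat + half_width:.3f})")
```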
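The consistency argument can also be illustrated by simulation. The sketch below is again an illustrative addition, assuming numpy; the choice of $p_0 = 0.3$, the grid of sample sizes, and the helper names M_n and M are all hypothetical. It shows the Bernoulli MLE concentrating around the true value as $n$ grows, and shows $M_n(p)$ settling near $M(p) = -D(p_0, p)$, which is largest at $p = p_0$.

```python
# A rough simulation sketch of the consistency argument (illustrative only).
# Assumptions: numpy is available; p0, the sample sizes, and the evaluation
# grid are arbitrary choices; M_n and M are hypothetical helper names.
import numpy as np

rng = np.random.default_rng(1)
p0 = 0.3

# (1) The MLE concentrates around the true value p0 as n grows.
for n in (10, 100, 1000, 10000):
    x = rng.binomial(1, p0, size=n)
    print(f"n = {n:6d}: p_hat = {x.mean():.4f}")

# (2) M_n(p) = (1/n) sum_i log[f(X_i; p) / f(X_i; p0)] approaches
#     M(p) = -D(p0, p), which is maximized at p = p0.
def M_n(x, p, p0):
    # average log-likelihood ratio over the sample, for Bernoulli data
    return np.mean(x * np.log(p / p0) + (1 - x) * np.log((1 - p) / (1 - p0)))

def M(p, p0):
    # -D(p0, p) for two Bernoulli distributions
    return -(p0 * np.log(p0 / p) + (1 - p0) * np.log((1 - p0) / (1 - p)))

x = rng.binomial(1, p0, size=100_000)
for p in (0.1, 0.3, 0.5, 0.7):
    print(f"p = {p}: M_n(p) = {M_n(x, p, p0):+.4f}, -D(p0, p) = {M(p, p0):+.4f}")
```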