MAXIMUM LIKELIHOOD ESTIMATES ARE ASYMPTOTICALLY NORMAL

It happens that maximum likelihood estimates are asymptotically normal. This of course makes this estimation style incredibly useful! But why does this happen? Let's show a partial proof for the case in which we have a sample X_1, X_2, ..., X_n from a probability law f(x; \theta) with one parameter \theta.

We'll have to use a Taylor series. This says that for any (differentiable) function h,

    h(y) \approx h(y_0) + h'(y_0)\,(y - y_0).

Our likelihood is then

    L = \prod_{i=1}^{n} f(x_i; \theta)

and the log-likelihood is

    \log L = \sum_{i=1}^{n} \log f(x_i; \theta).

Now obtain the derivative with respect to \theta:

    \frac{\partial}{\partial\theta} \log L = \sum_{i=1}^{n} \frac{\partial}{\partial\theta} \log f(x_i; \theta).

Letting \hat{\theta} be the maximum likelihood estimate, let's write this as a Taylor series about that \hat{\theta}:

    \frac{\partial}{\partial\theta} \log L \approx \sum_{i=1}^{n} \frac{\partial}{\partial\theta} \log f(x_i; \hat{\theta}) + (\theta - \hat{\theta}) \sum_{i=1}^{n} \frac{\partial^2}{\partial\theta^2} \log f(x_i; \hat{\theta}).

Now divide left and right sides by \sqrt{n}:

    \frac{1}{\sqrt{n}} \frac{\partial}{\partial\theta} \log L \approx \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{\partial}{\partial\theta} \log f(x_i; \hat{\theta}) + (\theta - \hat{\theta}) \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{\partial^2}{\partial\theta^2} \log f(x_i; \hat{\theta}).

Now let's examine this expression. The first summand is

    \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{\partial}{\partial\theta} \log f(x_i; \hat{\theta}),

which is zero... because this is the equation (aside from the factor 1/\sqrt{n}) which we solve to get \hat{\theta}!

Thus, we've reduced the relationship to this:

    \frac{1}{\sqrt{n}} \frac{\partial}{\partial\theta} \log L \approx (\theta - \hat{\theta}) \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{\partial^2}{\partial\theta^2} \log f(x_i; \hat{\theta}).

We can write out the left side, too:

    \frac{1}{\sqrt{n}} \frac{\partial}{\partial\theta} \log L = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{\partial}{\partial\theta} \log f(x_i; \theta).

Based on \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{\partial}{\partial\theta} \log f(x_i; \theta), we can assert the Central Limit Theorem! After all, it's the sum of n independent, identically distributed things. As each summand has mean zero, this limiting distribution is N(0, Var[\frac{\partial}{\partial\theta} \log f(x_i; \theta)]), or N(0, I(\theta)).

Thus, we decide that the limiting distribution of

    (\theta - \hat{\theta}) \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{\partial^2}{\partial\theta^2} \log f(x_i; \hat{\theta})

must also be N(0, I(\theta)). Let's rewrite this as

    \left[ \sqrt{n}\,(\hat{\theta} - \theta) \right] \left[ -\frac{1}{n} \sum_{i=1}^{n} \frac{\partial^2}{\partial\theta^2} \log f(x_i; \hat{\theta}) \right].

Watch the n's and the minus signs. The expression in the second set of brackets certainly converges to I(\theta); remember the calculating forms for I(\theta) and also the law of large numbers.
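As a concrete illustration of the two calculating forms for I(\theta) agreeing (this exponential example is my own addition, not part of the original notes): take f(x; \theta) = \theta e^{-\theta x} for x > 0.

```latex
% Log-density, score, and second derivative for the exponential law
\log f(x;\theta) = \log\theta - \theta x,
\qquad
\frac{\partial}{\partial\theta}\log f(x;\theta) = \frac{1}{\theta} - x,
\qquad
\frac{\partial^2}{\partial\theta^2}\log f(x;\theta) = -\frac{1}{\theta^2}.

% The score has mean zero, since E[X] = 1/\theta.
% Both calculating forms give the same Fisher information:
I(\theta) = \operatorname{Var}\!\left[\frac{\partial}{\partial\theta}\log f(X;\theta)\right]
          = \operatorname{Var}[X] = \frac{1}{\theta^2},
\qquad
I(\theta) = -E\!\left[\frac{\partial^2}{\partial\theta^2}\log f(X;\theta)\right]
          = \frac{1}{\theta^2}.
```

Note also that here the second derivative of the log-density is a constant, so its average trivially converges to -I(\theta), exactly as the law of large numbers promises in general.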
Thus our result comes down to

    \sqrt{n}\,(\hat{\theta} - \theta)\, I(\theta) \;\sim\; N(0, I(\theta)).

Certainly we can express this as

    \sqrt{n}\,(\hat{\theta} - \theta) \;\sim\; N\!\left(0, \frac{1}{I(\theta)}\right).

This is of course the statement of the asymptotic normality of the maximum likelihood estimate. Many approximations were made, and several mathematical nuances were left untouched. Nonetheless, this demonstration shows the essential features of the proof that maximum likelihood estimates are asymptotically normal, with \hat{\theta} having approximate variance 1 / (n\,I(\theta)).

gs2011
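To see the theorem in action numerically, here is a small Monte Carlo sketch (my own addition, not part of the original notes). For the exponential law with rate \theta, the MLE is \hat{\theta} = 1/\bar{x} and I(\theta) = 1/\theta^2, so \sqrt{n}(\hat{\theta} - \theta) should be approximately N(0, \theta^2); with \theta = 2 the limiting variance is 4.

```python
import math
import random

def exp_mle_simulation(theta=2.0, n=2000, reps=500, seed=0):
    """Monte Carlo check of asymptotic normality for the exponential MLE.

    For Exponential(rate theta), the MLE is theta_hat = 1 / x_bar and the
    Fisher information is I(theta) = 1/theta**2, so sqrt(n)*(theta_hat - theta)
    should be approximately N(0, theta**2) for large n.
    Returns the empirical mean and variance of sqrt(n)*(theta_hat - theta).
    """
    rng = random.Random(seed)
    zs = []
    for _ in range(reps):
        xs = [rng.expovariate(theta) for _ in range(n)]
        theta_hat = 1.0 / (sum(xs) / n)          # MLE of the rate
        zs.append(math.sqrt(n) * (theta_hat - theta))
    mean = sum(zs) / reps
    var = sum((z - mean) ** 2 for z in zs) / reps
    return mean, var

mean, var = exp_mle_simulation()
print(mean, var)  # mean should be near 0, variance near theta**2 = 4
```

The empirical variance will not be exactly 4 in any finite run, but it settles near \theta^2 as n grows, which is the content of the result above.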