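Lecture XX

Concentrated Likelihood Functions

The more general form of the normal likelihood function can be written as:

\[ L(X \mid \mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[ -\frac{(X_i - \mu)^2}{2\sigma^2} \right] \]

Dropping the additive constant $-\frac{n}{2}\ln(2\pi)$, which does not depend on the parameters, the log-likelihood is

\[ \ln L = -\frac{n}{2} \ln \sigma^2 - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (X_i - \mu)^2 \]

This expression can be solved for the optimal choice of $\sigma^2$ by differentiating with respect to $\sigma^2$:

\[ \frac{\partial \ln L}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2(\sigma^2)^2} \sum_{i=1}^{n} (X_i - \mu)^2 = 0 \]

\[ \Rightarrow \; -n\sigma^2 + \sum_{i=1}^{n} (X_i - \mu)^2 = 0 \]

\[ \Rightarrow \; \hat{\sigma}^2_{MLE} = \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2 \]

Substituting this result into the original logarithmic likelihood yields

\[ \ln L = -\frac{n}{2} \ln\left[ \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2 \right] - \frac{1}{2 \, \frac{1}{n} \sum_{j=1}^{n} (X_j - \mu)^2} \sum_{i=1}^{n} (X_i - \mu)^2 \]

\[ = -\frac{n}{2} \ln\left[ \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2 \right] - \frac{n}{2} \]

Intuitively, the maximum likelihood estimate of $\mu$ is the value that minimizes the mean squared error $\frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2$, because the concentrated log-likelihood is decreasing in that quantity. Thus, the least squares estimate of the mean of a normal distribution is the same as the maximum likelihood estimator under the assumption that the sample is independently and identically distributed.

The Normal Equations

If we extend the above discussion to linear regression (shown here with a single regressor), we can derive the normal equations.

\[ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \]

\[ \ln L = -\frac{n}{2} \ln\left[ \frac{1}{n} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 \right] - \frac{n}{2} \]

\[ = -\frac{n}{2} \ln\left[ \frac{1}{n} \sum_{i=1}^{n} \left( y_i^2 - 2\beta_0 y_i - 2\beta_1 x_i y_i + \beta_0^2 + 2\beta_0 \beta_1 x_i + \beta_1^2 x_i^2 \right) \right] - \frac{n}{2} \]

Taking the derivative with respect to $\beta_0$ yields

\[ -\frac{n}{2} \, \frac{\frac{1}{n} \sum_{i=1}^{n} \left( -2 y_i + 2\beta_0 + 2\beta_1 x_i \right)}{\frac{1}{n} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2} = 0 \]

\[ \Rightarrow \; -\frac{1}{n} \sum_{i=1}^{n} y_i + \beta_0 + \beta_1 \frac{1}{n} \sum_{i=1}^{n} x_i = 0 \]

\[ \Rightarrow \; \beta_0 = \frac{1}{n} \sum_{i=1}^{n} y_i - \beta_1 \frac{1}{n} \sum_{i=1}^{n} x_i \]

Taking the derivative with respect to $\beta_1$ yields

\[ -\frac{n}{2} \, \frac{\frac{1}{n} \sum_{i=1}^{n} \left( -2 x_i y_i + 2\beta_0 x_i + 2\beta_1 x_i^2 \right)}{\frac{1}{n} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2} = 0 \]

\[ \Rightarrow \; \frac{1}{n} \sum_{i=1}^{n} x_i y_i = \beta_0 \frac{1}{n} \sum_{i=1}^{n} x_i + \beta_1 \frac{1}{n} \sum_{i=1}^{n} x_i^2 \]

Substituting for $\beta_0$ yields

\[ -\frac{1}{n} \sum_{i=1}^{n} x_i y_i + \left( \frac{1}{n} \sum_{i=1}^{n} y_i \right) \left( \frac{1}{n} \sum_{i=1}^{n} x_i \right) + \beta_1 \left[ \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \left( \frac{1}{n} \sum_{i=1}^{n} x_i \right)^2 \right] = 0 \]

\[ \Rightarrow \; \beta_1 = \frac{\frac{1}{n} \sum_{i=1}^{n} x_i y_i - \left( \frac{1}{n} \sum_{i=1}^{n} x_i \right) \left( \frac{1}{n} \sum_{i=1}^{n} y_i \right)}{\frac{1}{n} \sum_{i=1}^{n} x_i^2 - \left( \frac{1}{n} \sum_{i=1}^{n} x_i \right)^2} \]

Properties of Maximum Likelihood Estimators

Theorem 7.4.1: Let $L(X_1, X_2, \ldots, X_n \mid \theta)$ be the likelihood function and let $\hat{\theta}(X_1, X_2, \ldots, X_n)$ be an unbiased estimator of $\theta$. Then, under general conditions, we have

\[ V(\hat{\theta}) \ge -\frac{1}{E\left[ \dfrac{\partial^2 \ln L}{\partial \theta^2} \right]} \]

The right-hand side is known as the Cramér-Rao lower bound (CRLB).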
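As a numerical check on the derivations above, the following is a minimal Python sketch, assuming NumPy is available. It evaluates the concentrated log-likelihood for the mean on a grid and computes the intercept and slope from the closed-form normal equations. The function names (`concentrated_loglik`, `normal_equations`) and the simulated data are hypothetical and chosen only for illustration; `np.polyfit` is used as an independent least squares reference.

```python
import numpy as np

def concentrated_loglik(mu, x):
    """Concentrated normal log-likelihood in mu.

    sigma^2 is replaced by its MLE for the given mu; the additive
    constant -(n/2) ln(2*pi) is omitted, as in the text.
    """
    n = len(x)
    sigma2_hat = np.mean((x - mu) ** 2)
    return -0.5 * n * np.log(sigma2_hat) - 0.5 * n

def normal_equations(x, y):
    """Intercept and slope from the two normal equations (single regressor)."""
    x_bar, y_bar = np.mean(x), np.mean(y)
    slope = (np.mean(x * y) - x_bar * y_bar) / (np.mean(x ** 2) - x_bar ** 2)
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Simulated data (assumption: the true intercept, slope, and noise scale
# are arbitrary illustrative values).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 1.5 + 2.0 * x + rng.normal(scale=0.5, size=200)

# 1) The grid maximizer of the concentrated log-likelihood should sit
#    next to the sample mean.
grid = np.linspace(-1.0, 1.0, 2001)
values = np.array([concentrated_loglik(m, x) for m in grid])
mu_grid_best = grid[np.argmax(values)]

# 2) The closed-form solution should agree with a least squares fit.
b0, b1 = normal_equations(x, y)
slope_ref, intercept_ref = np.polyfit(x, y, 1)  # highest degree first

print("grid maximizer:", mu_grid_best, "sample mean:", x.mean())
print("normal equations:", b0, b1)
print("np.polyfit:      ", intercept_ref, slope_ref)
```

The grid search is deliberately crude; it only illustrates that maximizing the concentrated likelihood and minimizing the mean squared error pick out the same value.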
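The consistency of maximum likelihood can be shown by applying Khinchine's Law of Large Numbers to

\[ Q_n(\theta) = \frac{1}{n} \ln L_n(\theta) = \frac{1}{n} \sum_{i=1}^{n} \ln f(X_i, \theta) \]

which converges as long as

\[ E\left[ \ln f(X_i, \theta) \right] < \infty \]

Asymptotic Normality

Theorem 7.4.3: Let the likelihood function be $L(X_1, X_2, \ldots, X_n \mid \theta)$. Then, under general conditions, the maximum likelihood estimator of $\theta$ is asymptotically distributed as

\[ \hat{\theta} \overset{A}{\sim} N\left( \theta, \; -\left[ E \frac{\partial^2 \ln L}{\partial \theta^2} \right]^{-1} \right) \]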
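A short worked case may help make the variance expression concrete. The sketch below assumes a normal sample with known variance $\sigma^2$, so that $\mu$ is the only unknown parameter; that assumption is made here for illustration and is not part of the theorem.

```latex
\begin{align*}
\ln L &= -\frac{n}{2}\ln(2\pi\sigma^2)
         - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(X_i-\mu)^2 \\
\frac{\partial \ln L}{\partial \mu}
      &= \frac{1}{\sigma^2}\sum_{i=1}^{n}(X_i-\mu) \\
\frac{\partial^2 \ln L}{\partial \mu^2}
      &= -\frac{n}{\sigma^2} \\
-\left[E\,\frac{\partial^2 \ln L}{\partial \mu^2}\right]^{-1}
      &= \frac{\sigma^2}{n}
\end{align*}
```

Setting the first derivative to zero gives $\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} X_i$, whose variance is exactly $\sigma^2/n$. In this case the sample mean attains the Cramér-Rao lower bound, and the asymptotic distribution $N(\mu, \sigma^2/n)$ holds exactly for every $n$.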