MAXIMUM LIKELIHOOD METHODS

Maximum Likelihood Estimation

• In this chapter, we develop statistical inference (estimation and testing) based on likelihood methods.
• Suppose that X_1, ..., X_n are i.i.d. with common pdf f(x; θ), θ ∈ Ω.
• Likelihood function:
      L(θ; x) = ∏_{i=1}^n f(x_i; θ),  θ ∈ Ω.
• Log likelihood:
      l(θ) = log L(θ) = Σ_{i=1}^n log f(x_i; θ).

Motivation Example

• Let X_1, ..., X_n denote a random sample from the distribution with pmf
      p(x) = θ^x (1 − θ)^{1−x},  x = 0, 1,  zero elsewhere,
  where 0 < θ < 1. The joint pmf is
      θ^{x_1}(1 − θ)^{1−x_1} ··· θ^{x_n}(1 − θ)^{1−x_n} = θ^{Σ_{i=1}^n x_i}(1 − θ)^{n − Σ_{i=1}^n x_i},
  so the likelihood function is
      L(θ) = θ^{Σ_{i=1}^n x_i}(1 − θ)^{n − Σ_{i=1}^n x_i}.
• We might ask what value of θ would maximize the probability L(θ) of obtaining this particular observed sample x_1, ..., x_n. This maximizing value would be a good estimate of θ because it gives this particular sample the largest possible probability.

• The log likelihood function is
      l(θ) = log L(θ) = (Σ_{i=1}^n x_i) log θ + (n − Σ_{i=1}^n x_i) log(1 − θ).
  Setting its derivative to zero,
      dl(θ)/dθ = (Σ_{i=1}^n x_i)/θ − (n − Σ_{i=1}^n x_i)/(1 − θ) = 0,
  and solving for θ gives
      θ̂ = (1/n) Σ_{i=1}^n X_i = X̄,
  which is called the maximum likelihood estimator of θ.

Theoretical Reasons

• Assumptions (Regularity Conditions):
  – R0: The pdfs are distinct; i.e., θ ≠ θ′ ⇒ f(x; θ) ≠ f(x; θ′).
  – R1: The pdfs have common support for all θ.
  – R2: The point θ_0 is an interior point of Ω.
• Theorem 6.1.1. Let θ_0 be the true parameter. Under assumptions R0 and R1,
      lim_{n→∞} P_{θ_0}[L(θ_0; X) > L(θ; X)] = 1,  for all θ ≠ θ_0.

Maximum Likelihood Estimator

• Maximum likelihood estimator (mle): we say that θ̂ = θ̂(X) is a maximum likelihood estimator if
      θ̂ = arg max_{θ∈Ω} L(θ; X).
• Solve the equation
      ∂l(θ)/∂θ = 0
  to obtain the mle. This equation is called the estimating equation (EE).
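As a quick numerical check of the Bernoulli derivation above, the following Python sketch (with a made-up 0/1 sample) evaluates l(θ) over a fine grid and confirms that the maximizer matches x̄:

```python
import math

# Hypothetical 0/1 sample (for illustration only).
x = [1, 0, 1, 1, 0, 1, 0, 1]
n, s = len(x), sum(x)

def loglik(theta):
    # l(theta) = (sum x_i) log(theta) + (n - sum x_i) log(1 - theta)
    return s * math.log(theta) + (n - s) * math.log(1 - theta)

# Scan a fine grid of theta values in (0, 1) and pick the maximizer.
grid = [k / 1000 for k in range(1, 1000)]
theta_hat = max(grid, key=loglik)

print(theta_hat)   # equals x-bar = 5/8 = 0.625 for this sample
print(s / n)
```

The grid search is only a check; the closed form θ̂ = X̄ is exact.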
Some Examples

• Exponential Distribution: Suppose the common pdf is the exponential(θ) pdf
      f(x; θ) = (1/θ) e^{−x/θ},  x > 0.
  The log likelihood function is
      l(θ) = −n log θ − (1/θ) Σ_{i=1}^n x_i.
  Setting the derivative to zero,
      ∂l(θ)/∂θ = −n/θ + (1/θ²) Σ_{i=1}^n x_i = 0,
  so θ̂ = X̄ is the mle of θ.

• Laplace Distribution: Let X_1, ..., X_n be i.i.d. with density
      f(x; θ) = (1/2) e^{−|x−θ|},  −∞ < x < ∞, −∞ < θ < ∞.
  This pdf is also called the double exponential distribution. The log likelihood is
      l(θ) = −n log 2 − Σ_{i=1}^n |x_i − θ|,
  with
      ∂l(θ)/∂θ = Σ_{i=1}^n sgn(x_i − θ),
  where sgn(t) = 1, 0, −1 according as t > 0, t = 0, t < 0. The sum is zero when θ has equally many observations on each side, so the mle is the median of {x_1, ..., x_n}.

• Logistic Distribution: Let X_1, ..., X_n be i.i.d. with pdf
      f(x; θ) = exp(−(x − θ)) / {1 + exp(−(x − θ))}²,  −∞ < x < ∞, −∞ < θ < ∞.
  The log likelihood is
      l(θ) = nθ − n x̄ − 2 Σ_{i=1}^n log(1 + exp{−(x_i − θ)}),
  with derivative
      l′(θ) = n − 2 Σ_{i=1}^n exp(−(x_i − θ)) / (1 + exp(−(x_i − θ))).
  The equation l′(θ) = 0 has no closed-form solution, but l′(θ) is strictly decreasing in θ, so it has a unique root, which can be found numerically.

• Uniform Distribution: Let X_1, ..., X_n be i.i.d. uniform(0, θ), i.e., f(x) = 1/θ, x ∈ (0, θ]. The likelihood is
      L(θ) = θ^{−n} I(max{x_i}, θ)  for all θ > 0,
  where I(a, b) is 1 or 0 according as a ≤ b or a > b. Since L(θ) is decreasing in θ for θ ≥ max{x_i}, the mle is θ̂ = max{X_i}.

• Let X_1, ..., X_n be from Bernoulli(θ) with 0 < θ < 1/3. Since L(θ) is increasing for θ < X̄ and decreasing for θ > X̄, the mle under the restriction is θ̂ = min{X̄, 1/3}.

Some Appealing Properties of the MLE

• Theorem 6.1.2. Let X_1, ..., X_n be i.i.d. with pdf f(x; θ), θ ∈ Ω. For a specified function g, let η = g(θ) be a parameter of interest. Suppose θ̂ is the mle of θ. Then g(θ̂) is the mle of η = g(θ).
• Theorem 6.1.3. Assume R0–R2. Then the likelihood equation
      ∂L(θ)/∂θ = 0,  or equivalently  ∂l(θ)/∂θ = 0,
  has a solution θ̂_n such that θ̂_n →_p θ_0.

An Exercise

• Let X_1, ..., X_n be a random sample from a N(θ, σ²) distribution.
  (a) Show that the mle of θ is X̄.
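The Laplace and logistic examples can be checked numerically. The sketch below (with a made-up sample) compares the Laplace mle with the sample median, and solves the logistic estimating equation l′(θ) = 0 by bisection, which is valid because l′ is strictly decreasing:

```python
import math
import statistics

# Hypothetical sample (for illustration only).
x = [0.8, -0.3, 1.9, 0.5, 2.2, -1.1, 0.7]

# Laplace: l(theta) = -n log 2 - sum |x_i - theta| is maximized at the median.
laplace_mle = statistics.median(x)

def score(theta):
    # Logistic l'(theta) = n - 2 * sum exp(-(x_i - theta)) / (1 + exp(-(x_i - theta)))
    return len(x) - 2 * sum(math.exp(-(xi - theta)) / (1 + math.exp(-(xi - theta)))
                            for xi in x)

# l'(theta) > 0 at theta = min(x) and < 0 at theta = max(x), so the unique
# root lies within the range of the data; bisect until convergence.
lo, hi = min(x), max(x)
for _ in range(60):
    mid = (lo + hi) / 2
    if score(mid) > 0:
        lo = mid
    else:
        hi = mid
logistic_mle = (lo + hi) / 2

print(laplace_mle, logistic_mle)
```

Newton's method would also work here (and converges faster), but bisection needs no second derivative and cannot overshoot.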
(b) If θ is restricted by 0 ≤ θ < ∞, show that the mle of θ is θ̂ = max{0, X̄}.
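A small sketch of part (b): with σ treated as known and a made-up sample whose mean is negative, a grid search over the restricted range θ ≥ 0 agrees with θ̂ = max{0, X̄}, since l(θ) increases for θ < x̄ and decreases for θ > x̄:

```python
import math

def restricted_mle(x):
    # Part (b): the mle of theta under the restriction 0 <= theta < infinity.
    return max(0.0, sum(x) / len(x))

def loglik(theta, x, sigma=1.0):
    # Normal log likelihood with sigma assumed known (sigma = 1 for illustration).
    n = len(x)
    return (-n / 2 * math.log(2 * math.pi * sigma**2)
            - sum((xi - theta)**2 for xi in x) / (2 * sigma**2))

# Hypothetical sample with x-bar < 0, so the restricted mle should be 0.
x = [-1.4, -0.2, -0.9, 0.3]
theta_hat = restricted_mle(x)

# Grid check over theta in [0, 3]: no feasible theta beats theta_hat.
grid = [k / 100 for k in range(0, 301)]
best = max(grid, key=lambda t: loglik(t, x))
print(theta_hat, best)
```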