Probability & Statistics
Professor Wei Zhu
July 30th

Cramér-Rao Lower Bound, Efficient Estimator, Best Estimator

Suppose θ̂1, θ̂2, θ̂3, … are unbiased estimators of θ. When there are many of them, it can be difficult to compare their variances Var(θ̂i) one by one. The following theorem provides a benchmark.

Theorem. Cramér-Rao Lower Bound
Let Y1, Y2, …, Yn be a random sample from a population with p.d.f. f(y; θ), and let θ̂ = h(Y1, Y2, …, Yn) be an unbiased estimator of θ. Given some regularity conditions (continuous differentiability, etc.), and provided that the domain of f(y; θ) does not depend on θ, we have

    Var(θ̂) ≥ 1 / ( n·E[ (∂ln f(Y; θ)/∂θ)² ] ) = 1 / ( −n·E[ ∂²ln f(Y; θ)/∂θ² ] )

Theorem. Properties of the MLE
Let Yi, i = 1, 2, …, n, be i.i.d. with p.d.f. f(y; θ), and let θ̂ be the MLE of θ. Then, as n → ∞,

    θ̂ → N( θ, 1 / ( n·E[ (∂ln f(Y; θ)/∂θ)² ] ) )   in distribution.

That is, the MLE is asymptotically unbiased, and its asymptotic variance equals the C-R lower bound.

Harald Cramér was born in Stockholm, Sweden on September 25, 1893, and died there on October 25, 1985. (wiki) Calyampudi Radhakrishna Rao, FRS, known as C. R. Rao (born 10 September 1920), is an Indian statistician. He is professor emeritus at Penn State University and Research Professor at the University at Buffalo. Rao was awarded the US National Medal of Science in 2002. (wiki)

Example 1. Let Y1, Y2, …, Yn be i.i.d. Bernoulli(p).
1. What is the MLE of p?
2. What are the mean and variance of the MLE of p?
3. What is the Cramér-Rao lower bound for an unbiased estimator of p?

Solution. P(Y = y) = f(y; p) = p^y (1 − p)^(1−y), y = 0, 1.

1. L = ∏ f(yi; p) = ∏ [ p^yi (1 − p)^(1−yi) ] = p^(∑yi) (1 − p)^(n−∑yi)

   ℓ = ln L = (∑yi) ln p + (n − ∑yi) ln(1 − p)

   Solving dℓ/dp = (∑yi)/p − (n − ∑yi)/(1 − p) = 0, we obtain the MLE:

       p̂ = (∑ Yi)/n

2. E(p̂) = p, Var(p̂) = p(1 − p)/n.

3. C-R lower bound for an unbiased estimator of p:
   ln f(y; p) = y ln p + (1 − y) ln(1 − p)

   ∂ln f(y; p)/∂p = y/p − (1 − y)/(1 − p)

   ∂²ln f(y; p)/∂p² = −y/p² − (1 − y)/(1 − p)²

   E[ −Y/p² − (1 − Y)/(1 − p)² ] = −1/p − 1/(1 − p) = −1/( p(1 − p) )

   Hence the C-R lower bound is

       Var(p̂) ≥ { −n·E[ ∂²ln f(Y; p)/∂p² ] }^(−1) = p(1 − p)/n

Thus the MLE of p is unbiased and its variance equals the C-R lower bound.

Definition. Efficient Estimator
If θ̂ is an unbiased estimator of θ and its variance equals the C-R lower bound, then θ̂ is an efficient estimator of θ.

Definition. Best Estimator
If θ̂ is an unbiased estimator of θ and Var(θ̂) ≤ Var(θ̃) for every unbiased estimator θ̃, then θ̂ is a best estimator of θ.

Efficient Estimator ⇒ Best Estimator: always true.
Best Estimator ⇒ Efficient Estimator: may not be true.

Example 2. If Y1, Y2, …, Yn is a random sample from f(y; θ) = 2y/θ², 0 < y < θ, then θ̂ = (3/2)·Ȳ is an unbiased estimator of θ. Compute 1. Var(θ̂) and 2. the C-R lower bound for f_Y(y; θ).

Solution.

1. Var(θ̂) = Var( (3/2)·Ȳ ) = (9/4)·Var( (1/n)·∑ Yi ) = (9/(4n²))·∑ Var(Yi)

   Var(Yi) = E(Yi²) − [E(Yi)]² = ∫₀^θ y²·(2y/θ²) dy − [ ∫₀^θ y·(2y/θ²) dy ]² = θ²/2 − (2θ/3)² = θ²/18

   Therefore Var(θ̂) = (9/(4n²))·(nθ²/18) = θ²/(8n).

2. C-R lower bound:

   ln f_Y(y; θ) = ln(2y/θ²) = ln 2y − 2 ln θ

   ∂ln f_Y(y; θ)/∂θ = −2/θ

   E[ (∂ln f_Y(Y; θ)/∂θ)² ] = E(4/θ²) = ∫₀^θ (4/θ²)·(2y/θ²) dy = 4/θ²

   1 / ( n·E[ (∂ln f_Y(Y; θ)/∂θ)² ] ) = θ²/(4n)

The value in 1 is less than the value in 2. But this is NOT a contradiction to the theorem: the domain, 0 < y < θ, depends on θ, so the C-R theorem does not hold for this problem.

Example 3. Let X1, …, Xn be i.i.d. N(μ, σ²), a random sample from the normal population where both μ and σ² are unknown. Please derive:
(1) The maximum likelihood estimators of μ and σ².
(2) The best estimator of μ, assuming that σ² is known.

Solution:

(1) MLEs of μ and σ².
The likelihood function is:

   L = ∏ f(xi; μ, σ²) = ∏ [ (1/(√(2π)·σ))·exp( −(xi − μ)²/(2σ²) ) ] = (1/(√(2π)·σ))^n · exp( −∑ (xi − μ)²/(2σ²) )

   ln L = −n·ln(√(2π)·σ) − ∑ (xi − μ)²/(2σ²)

Setting the partial derivatives to zero:

   ∂ln L/∂μ = (1/σ²)·∑ (xi − μ) = 0   ⟹   μ̂ = (∑ xi)/n = X̄

   ∂ln L/∂σ² = −n/(2σ²) + ∑ (xi − μ)²/(2σ⁴) = 0   ⟹   σ̂² = ∑ (xi − μ̂)²/n = ∑ (xi − x̄)²/n

(2) Use the Cramér-Rao lower bound. With σ² known,

   f_X(x; μ) = (1/(√(2π)·σ))·exp( −(x − μ)²/(2σ²) )

   ln f = −ln(√(2π)·σ) − (x − μ)²/(2σ²)

   d ln f/dμ = (x − μ)/σ²

   d²ln f/dμ² = −1/σ²

Hence the C-R lower bound on the variance is σ²/n.

Since the normal p.d.f. satisfies all the regularity conditions for the C-R lower bound theorem to hold, and since X̄ ~ N(μ, σ²/n), the variance of X̄ equals the C-R lower bound. Thus this unbiased estimator X̄ is an efficient estimator of μ, and therefore it is also a best estimator of μ.
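The three examples above can be checked numerically. The following Monte Carlo sketch (the sample sizes and parameter values are arbitrary choices, not from the notes) estimates the variance of each estimator by simulation and compares it with the corresponding C-R bound: the Bernoulli MLE and X̄ attain their bounds, while Example 2's estimator falls below the formal bound because the support depends on θ.

```python
import math
import random

# Monte Carlo sketch of the three examples; all numeric settings are
# arbitrary illustration choices.
random.seed(0)
reps = 20000

def mean_var(xs):
    """Sample mean and (population-style) variance of a list."""
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / len(xs)

# Example 1: Bernoulli(p). The MLE p_hat = Y_bar should be unbiased with
# variance p(1-p)/n, exactly the C-R lower bound (an efficient estimator).
p, n = 0.3, 50
p_hats = [sum(1 if random.random() < p else 0 for _ in range(n)) / n
          for _ in range(reps)]
p_mean, p_var = mean_var(p_hats)
print(f"Bernoulli: E ~ {p_mean:.4f} (p={p}), Var ~ {p_var:.5f}, "
      f"C-R bound = {p * (1 - p) / n:.5f}")

# Example 2: f(y; theta) = 2y/theta^2 on (0, theta). Inverse-CDF sampling:
# F(y) = y^2/theta^2, so Y = theta*sqrt(U) with U ~ Uniform(0,1).
# theta_hat = (3/2)*Y_bar has variance theta^2/(8n), BELOW the formal
# theta^2/(4n): no contradiction, since the support depends on theta.
theta, n2 = 2.0, 40
t_hats = [1.5 * sum(theta * math.sqrt(random.random()) for _ in range(n2)) / n2
          for _ in range(reps)]
_, t_var = mean_var(t_hats)
print(f"Example 2: Var ~ {t_var:.5f}, theta^2/(8n) = {theta**2/(8*n2):.5f}, "
      f"formal bound theta^2/(4n) = {theta**2/(4*n2):.5f}")

# Example 3: N(mu, sigma^2). Var(X_bar) should equal the C-R bound sigma^2/n.
mu, sigma, n3 = 5.0, 2.0, 30
xbars = [sum(random.gauss(mu, sigma) for _ in range(n3)) / n3
         for _ in range(reps)]
_, x_var = mean_var(xbars)
print(f"Normal: Var(X_bar) ~ {x_var:.4f}, C-R bound = {sigma**2/n3:.4f}")
```

With 20,000 replications the simulated variances land within a fraction of a percent of the theoretical values, making the efficient/non-regular distinction between the examples visible in the output.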