Sufficient Statistics
Dayu
11.11

Some Abbreviations
• i.i.d.: independent, identically distributed

Content
• Estimator, bias, mean squared error (MSE), and the minimum-variance unbiased estimator (MVUE)
When is the MVUE unique?
• Lehmann–Scheffé theorem
  – unbiased
  – complete
  – sufficient
• the Neyman–Fisher factorization criterion
How is an MVUE constructed?
• Rao–Blackwell theorem

Estimator
• The probability mass function (or density) of X is partially unknown, i.e. of the form f(x; θ), where θ is a parameter varying in the parameter space Θ.
• An estimator is a statistic t(X), computed from the observed data, that is used to estimate θ (or a function of θ).

Unbiased
• An estimator \hat{\theta} = t(X) is said to be unbiased for θ if it equals θ in expectation, i.e. E\{\hat{\theta}\} = θ.
• E.g. the sample mean \bar{x} is an unbiased estimator of the population mean μ = E(x_i):
  E(\bar{x}) = E\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(x_i) = \frac{1}{n}\cdot n\,\mu = \mu

Mean Squared Error (MSE)
• The MSE of an estimator T of an unobservable parameter θ is MSE(T) = E[(T - θ)^2].
• Since E(Y^2) = V(Y) + [E(Y)]^2,
  MSE(T) = var(T) + [bias(T)]^2, where bias(T) = E(T - θ) = E(T) - θ.
• For an unbiased estimator, MSE(T) = var(T), since bias(T) = 0.

Examples
Two estimators of σ^2:
• \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 : the result from maximum likelihood; biased, but smaller variance
• s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 : unbiased, but larger variance

Minimum-Variance Unbiased Estimator (MVUE)
• Among unbiased estimators, minimizing the MSE is the same as minimizing the variance.
• The MVUE is an unbiased estimator whose variance is minimized for all values of the parameters.
• Two theorems:
  – the Lehmann–Scheffé theorem shows that the MVUE is unique
  – the Rao–Blackwell theorem shows how to construct an MVUE

Lehmann–Scheffé Theorem
• Any unbiased estimator that is a function of a complete sufficient statistic is the unique best unbiased estimator of its expectation.
• Equivalently: if a complete sufficient statistic T exists, then the UMVU (uniformly minimum-variance unbiased) estimator of g(θ), if it exists, must be a function of T.

Completeness
• Suppose a random variable X has a probability distribution belonging to a known family of probability distributions parameterized by θ.
• A function g(X) is an unbiased estimator of zero if the expectation E(g(X)) remains zero regardless of the value of the parameter θ (by the definition of unbiasedness).
• X is a complete statistic precisely if it admits (up to a set of measure zero) no such unbiased estimator of zero except g = 0 itself.

Example of Completeness
• Suppose X_1, X_2 are i.i.d. random variables, normally distributed with expectation θ and variance 1.
• Not complete: X_1 - X_2 is an unbiased estimator of zero, so the pair (X_1, X_2) is not a complete statistic.
• Complete: on the other hand, the sum X_1 + X_2 can be shown to be a complete statistic; that is, there is no non-zero function g such that E(g(X_1 + X_2)) remains zero regardless of the value of θ.

Detailed Explanations
• X_1 + X_2 ~ N(2θ, 2). If E[g(X_1 + X_2)] = 0 for every θ, the injectivity of the Laplace transform of the normal family forces g = 0 almost everywhere, which is exactly the completeness property.

Sufficiency
• Consider an i.i.d. sample X_1, X_2, ..., X_n.
• Two people, A and B:
  – A observes the entire sample X_1, X_2, ..., X_n
  – B observes only one number T, T = T(X_1, X_2, ..., X_n)
• Intuitively, who has more information?
• Under what condition does B have as much information about θ as A?

Sufficiency
• Definition: a statistic T(X) is sufficient for θ precisely if the conditional probability distribution of the data X, given the statistic T(X), does not depend on θ.
• How to find one: the Neyman–Fisher factorization criterion. If the probability density function of X is f(x; θ), then T satisfies the factorization criterion if and only if functions g and h can be found such that
  f(x; θ) = g(T(x), θ) · h(x)
  – h(x): a function that does not depend on θ
  – g(T(x), θ): a function that depends on the data only through T(x)

Example 1
• T = x_1 + x_2 + ... + x_n is a sufficient statistic for p for the Bernoulli distribution B(p):
  f(x; p) = \prod_{i=1}^{n} p^{x_i}(1-p)^{1-x_i} = \underbrace{p^{T(x)}(1-p)^{n-T(x)}}_{g(T(x),\,p)} \cdot \underbrace{1}_{h(x)=1}
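Before moving on to the Poisson case, here is a small empirical check (not part of the original slides) of what sufficiency means for the Bernoulli example: simulating samples under two different values of p and conditioning on T = t should give the same conditional distribution of the sample in both cases. The choices n = 3, t = 1, the two p values, and the simulation count are arbitrary illustration assumptions.

```python
# Minimal simulation sketch: for i.i.d. Bernoulli(p) data, the conditional
# distribution of (X1, ..., Xn) given T = X1 + ... + Xn = t should not depend
# on p. All constants below are arbitrary illustration choices.
import numpy as np
from collections import Counter

def conditional_dist(p, n=3, t=1, n_sims=200_000, seed=0):
    """Empirical distribution of the sample arrangement, conditioned on sum == t."""
    rng = np.random.default_rng(seed)
    samples = rng.binomial(1, p, size=(n_sims, n))
    kept = samples[samples.sum(axis=1) == t]       # keep only samples with T = t
    counts = Counter(tuple(int(v) for v in row) for row in kept)
    return {arr: round(c / len(kept), 3) for arr, c in sorted(counts.items())}

# Each of the arrangements (0,0,1), (0,1,0), (1,0,0) should have frequency
# close to 1/3 under both values of p -- T carries all the information about p.
print(conditional_dist(p=0.2))
print(conditional_dist(p=0.7))
```

Changing p changes how often the event T = 1 occurs, but not the shape of the conditional distribution given T = 1; this is exactly the sense in which T is sufficient for p.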
Example 2
• Test T = x_1 + x_2 + ... + x_n for the Poisson distribution Π(λ):
  f(x; λ) = \prod_{i=1}^{n} \frac{e^{-\lambda}\lambda^{x_i}}{x_i!} = \underbrace{e^{-n\lambda}\lambda^{T(x)}}_{g(T(x),\,\lambda)} \cdot \underbrace{\frac{1}{\prod_{i=1}^{n} x_i!}}_{h(x)}
  – h(x) is independent of λ
• Hence T = x_1 + x_2 + ... + x_n is sufficient!

Notes on Sufficient Statistics
• The sufficient statistic is not unique: if T(x) is sufficient, so is any one-to-one function of it, e.g. T(x)/n or log T(x).

Rao–Blackwell Theorem
• Named after:
  – C. R. Rao (b. 1920), an Indian statistician and currently professor emeritus at Penn State University
  – David Blackwell (b. 1919), Professor Emeritus of Statistics at UC Berkeley
• It describes a technique that transforms an arbitrarily crude estimator into one that is at least as good under the mean-squared-error criterion (or a variety of similar criteria); conditioning on a complete sufficient statistic yields the MVUE.

Rao–Blackwell Theorem
• Definition: a Rao–Blackwell estimator δ_1(X) of an unobservable quantity θ is the conditional expected value E(δ(X) | T(X)) of some estimator δ(X), given a sufficient statistic T(X).
  – δ(X): the "original estimator"
  – δ_1(X): the "improved estimator"
• The mean squared error of the Rao–Blackwell estimator does not exceed that of the original estimator.

Conditional Expectation
• The conditioning above is on an event of the form B = {x ∈ X : T(x) = b}, a level set of the sufficient statistic. For a discrete distribution,
  E(f(X) | B) = \sum_{x \in B} P(x \mid x \in B)\, f(x), \qquad
  P(x \mid x \in B) = \begin{cases} 0, & x \notin B \\ P(x)/P(x \in B), & x \in B \end{cases}

Example I
• Phone calls arrive at a switchboard according to a Poisson process at an average rate of λ per minute.
• λ is not observable.
• Observation: the numbers of phone calls X_1, ..., X_n that arrived during n successive one-minute periods.
• It is desired to estimate the probability e^{-λ} that the next one-minute period passes with no phone calls.
• Original estimator: e.g. δ(X) = 1 if X_1 = 0 and 0 otherwise, which is unbiased for e^{-λ} but very crude; t = x_1 + x_2 + ... + x_n is sufficient, so δ can be improved by conditioning on it.

Example II
• Goal: estimate λ for X_1, ..., X_n ~ P(λ).
• Original estimator: X_1. We know t = X_1 + ... + X_n is sufficient.
• Improved estimator by the Rao–Blackwell theorem: E[X_1 | X_1 + ... + X_n = t]. This cannot be computed directly, but we know
  \sum_{i=1}^{n} E(X_i | X_1 + ... + X_n = t) = E(\sum_{i=1}^{n} X_i | X_1 + ... + X_n = t) = t.
• Since X_1, ..., X_n are i.i.d., by symmetry every term is the same, so each equals t/n. In fact, the improved estimator is \bar{x}, the sample mean.

Thank you!
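Backup: A Simulation Check of Example II

As a supplementary sketch (not part of the original slides), the Python snippet below compares the crude estimator X_1 with its Rao–Blackwell improvement \bar{x} for Poisson data; the values of lam, n, and n_sims are arbitrary choices for illustration.

```python
# Minimal simulation sketch: MSE of the crude estimator X1 versus the
# Rao-Blackwell improved estimator E[X1 | X1+...+Xn] = sample mean,
# for i.i.d. Poisson(lam) data. All constants are arbitrary illustration choices.
import numpy as np

rng = np.random.default_rng(0)
lam, n, n_sims = 3.0, 10, 100_000

samples = rng.poisson(lam, size=(n_sims, n))
crude = samples[:, 0]                # original estimator: just X1
improved = samples.mean(axis=1)      # Rao-Blackwell estimator: x-bar

mse_crude = np.mean((crude - lam) ** 2)        # should be close to Var(X1) = lam
mse_improved = np.mean((improved - lam) ** 2)  # should be close to Var(x-bar) = lam/n

print(f"MSE of X1   : {mse_crude:.3f}")      # roughly 3.0
print(f"MSE of x-bar: {mse_improved:.3f}")   # roughly 0.3
```

Both estimators are unbiased for λ, so the drop in MSE is entirely a variance reduction obtained by conditioning on the sufficient statistic.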