Statistics 512 Notes I
D. Small
Reading: Section 5.1

Basic idea of statistical inference:

Population → Sample of Data → Inference about the population using statistical tools

Statistical Experiment: Observe data $X$. The distribution of $X$ is $P$; $P(X \in E)$ = "probability that $X$ is in $E$."

Model: Family of possible $P$'s, $\mathcal{P} = \{P_\theta, \theta \in \Theta\}$. We call $\theta$ a parameter of the distribution and $\Theta$ the parameter space.

Examples:

1. Binomial model. Toss a coin $n$ independent times with $P(\text{"Success"}) = p$. $X$ = number of successes. $\theta = p$, $\Theta = [0,1]$.

2. Normal location model. Observe $X = (X_1, \ldots, X_n)$, $X_i$ independent and identically distributed (iid) from a normal distribution with unknown mean $\mu$ and known variance $\sigma^2$:
$$f(x; \mu) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{1}{2\sigma^2}(x - \mu)^2 \right\}$$
$\theta = \mu$, $\Theta = (-\infty, \infty)$.

3. Normal model with unknown mean and variance. Observe $X = (X_1, \ldots, X_n)$, $X_i$ iid from a normal distribution with unknown mean $\mu$ and unknown variance $\sigma^2$. $\theta = (\mu, \sigma^2)$, $\Theta = (-\infty, \infty) \times (0, \infty)$.

4. Nonparametric model. Observe $X = (X_1, \ldots, X_n)$, $X_i$ iid real valued. $\mathcal{P} = \{\text{all distributions on } \mathbb{R}\}$, $\theta$ = cdf of the distribution of $X_i$.

5. Survey sampling. There is a finite population of units $1, \ldots, N$ that have variables $Y_1, \ldots, Y_N$ associated with them. We observe $Y$ for $n$ of the units $u_1, \ldots, u_n$, i.e., we observe $X_1 = Y_{u_1}, \ldots, X_n = Y_{u_n}$. $\theta = \{Y_1, \ldots, Y_N\}$. We are usually interested in a particular function of $\theta$ such as the population mean $\frac{Y_1 + \cdots + Y_N}{N}$.

Two methods of choosing the units:
(A) Sampling with replacement: $u_1, \ldots, u_n$ are iid from the uniform distribution on $\{1, 2, \ldots, N\}$.
(B) Sampling without replacement (simple random sample): each unit appears in the sample at most once, and each of the $\binom{N}{n}$ possible samples has the same probability.
If $N$ is much greater than $n$, the two sampling methods are practically the same.

Statistical Inference: A statement about some aspect of $\theta$ based on a statistical experiment. Note: we might not be interested in the entire $\theta$ but only in some function of it; e.g., in Examples 3 and 4, we might only be interested in the mean of the distribution.

Types of inferences we will study:
1. Point estimation: give the best estimate of the function of $\theta$ we are interested in.
2. Interval estimation (confidence intervals): give an interval (set) in which the function of $\theta$ lies, along with a statement of how certain we are that it lies in the interval.
3. Hypothesis testing: choose between two hypotheses about $\theta$.

Point Estimation

The goal of point estimation is to provide the single "best guess" of some quantity of interest $g(\theta)$; $g(\theta)$ is a fixed unknown quantity. A point estimator is any function $h(X)$ of the data. Because it depends on the data, $h(X)$ is a random variable.

Examples of point estimators:

Binomial model: $X \sim \text{Binomial}(n, p)$, $n$ known. Point estimator for $p$: $h(X) = X/n$. Notation: we sometimes denote a point estimator for a parameter by putting a hat on it, i.e., $\hat{p} = X/n$. We also sometimes add a subscript $n$ to denote the sample size, $\hat{p}_n = X/n$.

Normal model with unknown mean $\mu$ and known or unknown variance $\sigma^2$: point estimator for $\mu$: $\hat{\mu} = \frac{X_1 + \cdots + X_n}{n} = \bar{X}$.

Sampling distribution: A point estimator $h(X)$ is a function of the sample, so $h(X)$ is a random variable. The distribution of $h(X)$ over repeated samples is called the sampling distribution of $h(X)$.

Example: Normal location model. Observe $X = (X_1, \ldots, X_n)$, $X_i$ iid normal with unknown mean $\mu$ and known variance $\sigma^2$. $\hat{\mu}_n = \frac{X_1 + \cdots + X_n}{n} = \bar{X}$. Sampling distribution: $\hat{\mu}_n \sim N(\mu, \sigma^2/n)$.
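(Aside, not part of the original notes.) The sampling distribution above can be checked by simulation: draw many repeated samples, compute $\hat{\mu}_n$ for each, and compare the empirical mean and standard deviation of the estimates to the theoretical values $\mu$ and $\sigma/\sqrt{n}$. A minimal NumPy sketch, assuming the arbitrary illustrative values $\mu = 5$, $\sigma = 2$, $n = 25$:

```python
import numpy as np

# Normal location model: X_1, ..., X_n iid N(mu, sigma^2) with sigma known.
# Theory: the sample mean has sampling distribution N(mu, sigma^2 / n).
rng = np.random.default_rng(0)
mu, sigma, n = 5.0, 2.0, 25      # arbitrary illustrative values
n_reps = 100_000                 # number of repeated samples

samples = rng.normal(mu, sigma, size=(n_reps, n))
mu_hat = samples.mean(axis=1)    # one estimate per repeated sample

print(mu_hat.mean())             # approximately mu = 5
print(mu_hat.std())              # approximately sigma / sqrt(n) = 0.4
```

The empirical mean and standard deviation of the 100,000 estimates should agree closely with $\mu$ and $\sigma/\sqrt{n}$.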
Properties of a point estimator:

1. Bias. The bias of an estimator of $g(\theta)$ is defined by
$$\text{bias}_\theta[h(X_1, \ldots, X_n)] = E_\theta[h(X_1, \ldots, X_n)] - g(\theta).$$
We say that $h(X_1, \ldots, X_n)$ is unbiased if $\text{bias}_\theta[h(X_1, \ldots, X_n)] = 0$ for all $\theta \in \Theta$. Here $E_\theta$ refers to the expectation with respect to the sampling distribution of the data, $f(x_1, \ldots, x_n; \theta)$. It does not mean we are averaging over a distribution for $\theta$. An unbiased estimator is suitably "centered."

2. Consistency. A reasonable requirement for an estimator is that it should converge to the true parameter value as we collect more and more information. A point estimator $h(X_1, \ldots, X_n)$ of a parameter $g(\theta)$ is consistent if $h(X_1, \ldots, X_n) \xrightarrow{P} g(\theta)$ for all $\theta \in \Theta$. Recall the definition of convergence in probability (Section 4.2): $h(X_1, \ldots, X_n) \xrightarrow{P} g(\theta)$ means that for all $\epsilon > 0$,
$$\lim_{n \to \infty} P[\,|h(X_1, \ldots, X_n) - g(\theta)| \geq \epsilon\,] = 0.$$

3. Mean Square Error. A good estimator should on average be accurate. A measure of the accuracy of an estimator is its average squared error:
$$\text{MSE}_\theta[h(X_1, \ldots, X_n)] = E_\theta[\{h(X_1, \ldots, X_n) - g(\theta)\}^2].$$

Example: Suppose that an iid sample $X_1, \ldots, X_n$ is drawn from the uniform distribution on $[0, \theta]$, where $\theta$ is an unknown parameter and the density of $X_i$ is
$$f_X(x; \theta) = \begin{cases} 1/\theta & 0 < x < \theta \\ 0 & \text{elsewhere.} \end{cases}$$
Consider the following estimator of $\theta$: $W = h(X_1, \ldots, X_n) = \max_i X_i$.

Sampling distribution of $W$: If $w \leq 0$, $P(W \leq w) = 0$. If $0 < w < \theta$,
$$P(W \leq w) = P(X_1 \leq w, \ldots, X_n \leq w) = [P(X_1 \leq w)]^n = \left(\frac{w}{\theta}\right)^n.$$
If $w \geq \theta$, $P(W \leq w) = 1$. Thus,
$$F_W(w) = \begin{cases} 0 & w < 0 \\ (w/\theta)^n & 0 \leq w \leq \theta \\ 1 & w > \theta \end{cases}$$
and
$$f_W(w) = \begin{cases} \dfrac{n w^{n-1}}{\theta^n} & 0 < w < \theta \\ 0 & \text{elsewhere.} \end{cases}$$

Bias:
$$E_\theta[W] = \int_0^\theta w f_W(w)\,dw = \int_0^\theta w \cdot \frac{n w^{n-1}}{\theta^n}\,dw = \frac{n}{\theta^n} \cdot \frac{\theta^{n+1}}{n+1} = \frac{n\theta}{n+1}.$$
$$\text{Bias} = E_\theta[W] - \theta = \frac{n\theta}{n+1} - \theta = -\frac{\theta}{n+1}.$$
There is a bias in $W$, but it might still be consistent.

Consistency: Let $W_n$ denote $W$ for a sample of size $n$. Since $W_n \leq \theta$ always, for any $0 < \epsilon < \theta$,
$$P(|W_n - \theta| \leq \epsilon) = P(W_n \geq \theta - \epsilon) = \int_{\theta - \epsilon}^{\theta} \frac{n w^{n-1}}{\theta^n}\,dw = \frac{\theta^n - (\theta - \epsilon)^n}{\theta^n} = 1 - \left(\frac{\theta - \epsilon}{\theta}\right)^n.$$
Note that for any $\epsilon > 0$, it is possible to find an $n$ making $[(\theta - \epsilon)/\theta]^n$ as small as desired. Thus $\lim_{n \to \infty} P(|W_n - \theta| \leq \epsilon) = 1$ and $W_n$ is consistent.
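(Aside, not part of the original notes.) The formulas in this example are easy to check by simulation. A minimal NumPy sketch, assuming the arbitrary illustrative true value $\theta = 3$ and a few sample sizes, that compares the simulated mean, bias, and MSE of $W$ with the theory derived above:

```python
import numpy as np

# Uniform(0, theta) model with W = max_i X_i.
# Theory above: E_theta[W] = n*theta/(n+1), bias = -theta/(n+1), and
# P(|W_n - theta| <= eps) = 1 - ((theta - eps)/theta)^n -> 1 (consistency).
rng = np.random.default_rng(0)
theta = 3.0                      # arbitrary illustrative true value
n_reps = 100_000                 # number of repeated samples

for n in (5, 50, 500):
    W = rng.uniform(0, theta, size=(n_reps, n)).max(axis=1)
    print(f"n={n:4d}  E[W] ~ {W.mean():.4f} "
          f"(theory {n * theta / (n + 1):.4f})  "
          f"bias ~ {W.mean() - theta:.4f}  "
          f"MSE ~ {np.mean((W - theta) ** 2):.5f}")
```

As $n$ grows, the simulated bias shrinks toward 0 at rate $\theta/(n+1)$ and the MSE shrinks toward 0, illustrating that the biased estimator $W_n$ is nonetheless consistent.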