Notes 1 - Wharton Statistics Department

Statistics 512 Notes I
D. Small

Reading: Section 5.1

Basic idea of statistical inference: A sample of data is drawn from a population, and statistical tools are used to make inferences about the population from the sample.

Statistical Experiment: Observe data $X$. The distribution of $X$ is $P$, where $P(X \in E)$ = "probability that $X$ is in $E$".

Model: Family of possible $P$'s, $P \in \{P_\theta, \theta \in \Theta\}$. We call $\theta$ a parameter of the distribution.

Examples:

1. Binomial model. Toss a coin $n$ independent times, with $P(\text{"Success"}) = p$. $X$ = number of successes. Here $\theta = p$ and $\Theta = [0, 1]$.

2. Normal location model. Observe $X = (X_1, \ldots, X_n)$, $X_i$ independent and identically distributed (iid) with a normal distribution with unknown mean $\mu$ and known variance $\sigma^2$:
$$f(x; \mu) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{1}{2\sigma^2} (x - \mu)^2 \right\}$$
Here $\theta = \mu$ and $\Theta = (-\infty, \infty)$.

3. Normal model with unknown mean and variance. Observe $X = (X_1, \ldots, X_n)$, $X_i$ iid with a normal distribution with unknown mean $\mu$ and unknown variance $\sigma^2$. Here $\theta = (\mu, \sigma^2)$ and $\Theta = (-\infty, \infty) \times (0, \infty)$.

4. Nonparametric model. Observe $X = (X_1, \ldots, X_n)$, $X_i$ iid real valued. Here $\Theta = \{\text{all distributions on } \mathbb{R}\}$ and $\theta$ = cdf of the distribution of $X_i$.

5. Survey sampling. There is a finite population of units $1, \ldots, N$ that have variables $Y_1, \ldots, Y_N$ associated with them. We observe $Y$ for $n$ of the units $u_1, \ldots, u_n$, i.e., we observe $X_1 = Y_{u_1}, \ldots, X_n = Y_{u_n}$. Here $\theta = \{Y_1, \ldots, Y_N\}$. We are usually interested in a particular function of $\theta$, such as the population mean $(Y_1 + \cdots + Y_N)/N$.

Two methods of choosing the units:
(A) Sampling with replacement: $u_1, \ldots, u_n$ are iid from the uniform distribution on $\{1, 2, \ldots, N\}$.
(B) Sampling without replacement (simple random sample): Each unit appears in the sample at most once. Each of the $\binom{N}{n}$ possible samples has the same probability.
If $N$ is much greater than $n$, the two sampling methods are practically the same.

Statistical Inference: A statement about some aspect of $\theta$ based on a statistical experiment.

Note: We might not be interested in the entire $\theta$ but only in some function of it, e.g., in Examples 3 and 4, we might only be interested in the mean of the distribution.

Types of inferences we will study:

1. Point estimation: Give the best estimate of the function of $\theta$ we are interested in.

2. Interval estimation (confidence intervals): Give an interval (set) in which the function of $\theta$ lies, along with a statement about how certain we are that the function of $\theta$ lies in the interval.

3. Hypothesis testing: Choose between two hypotheses about $\theta$.

Point Estimation

The goal of point estimation is to provide the single "best guess" of some quantity of interest $g(\theta)$. $g(\theta)$ is a fixed unknown quantity.

A point estimator is any function of the data $h(X)$. The point estimator depends on the data, so $h(X)$ is a random variable.

Examples of point estimators:

Binomial model: $X \sim \text{Binomial}(n, p)$, $n$ known. Point estimator for $p$: $h(X) = X/n$.

Notation: We sometimes denote a point estimator for a parameter by putting a hat on it, i.e., $\hat{p} = X/n$. We also sometimes add a subscript $n$ to denote the sample size, $\hat{p}_n = X/n$.

Normal model with unknown mean $\mu$ and known or unknown variance $\sigma^2$. Point estimator for $\mu$:
$$\hat{\mu}_n = \bar{X} = \frac{X_1 + \cdots + X_n}{n}$$

Sampling distribution: A point estimator $h(X)$ is a function of the sample, so $h(X)$ is a random variable. The distribution of a point estimator $h(X)$ over repeated samples is called the sampling distribution of $h(X)$.

Example: Normal location model. Observe $X = (X_1, \ldots, X_n)$, $X_i$ iid with a normal distribution with unknown mean $\mu$ and known variance $\sigma^2$. The estimator is
$$\hat{\mu}_n = \bar{X} = \frac{X_1 + \cdots + X_n}{n}$$
and its sampling distribution is $\hat{\mu}_n \sim N(\mu, \sigma^2/n)$.

Properties of a point estimator:

1. Bias. The bias of an estimator of $g(\theta)$ is defined by
$$\text{bias}_\theta[h(X_1, \ldots, X_n)] = E_\theta[h(X_1, \ldots, X_n)] - g(\theta)$$
We say that $h(X_1, \ldots, X_n)$ is unbiased if $\text{bias}_\theta[h(X_1, \ldots, X_n)] = 0$ for all $\theta \in \Theta$. Here $E_\theta$ refers to the expectation with respect to the sampling distribution of the data $f(x_1, \ldots, x_n; \theta)$. It does not mean we are averaging over a distribution for $\theta$. An unbiased estimator is suitably "centered."

2. Consistency. A reasonable requirement for an estimator is that it should converge to the true parameter value as we collect more and more information. A point estimator $h(X_1, \ldots, X_n)$ of a parameter $g(\theta)$ is consistent if $h(X_1, \ldots, X_n) \xrightarrow{P} g(\theta)$ for all $\theta \in \Theta$. Recall the definition of convergence in probability (Section 4.2): $h(X_1, \ldots, X_n) \xrightarrow{P} g(\theta)$ means that for all $\epsilon > 0$,
$$\lim_{n \to \infty} P[\,|h(X_1, \ldots, X_n) - g(\theta)| \ge \epsilon\,] = 0$$

3. Mean Square Error. A good estimator should on average be accurate. A measure of the accuracy of an estimator is the average squared error of the estimator:
$$\text{MSE}_\theta[h(X_1, \ldots, X_n)] = E_\theta[\{h(X_1, \ldots, X_n) - g(\theta)\}^2]$$

Example: Suppose that an iid sample $X_1, \ldots, X_n$ is drawn from the uniform distribution on $[0, \theta]$, where $\theta$ is an unknown parameter and the density of $X_i$ is
$$f_X(x; \theta) = \begin{cases} 1/\theta & 0 < x < \theta \\ 0 & \text{elsewhere} \end{cases}$$
Consider the following estimator of $\theta$: $W = h(X_1, \ldots, X_n) = \max_i X_i$.

Sampling distribution of $W$: If $w < 0$, $P(W \le w) = 0$. If $0 < w < \theta$,
$$P(W \le w) = P(X_1, \ldots, X_n \le w) = [P(X_1 \le w)]^n = \left(\frac{w}{\theta}\right)^n$$
If $w > \theta$, $P(W \le w) = 1$. Thus,
$$F_W(w) = \begin{cases} 0 & \text{if } w < 0 \\ (w/\theta)^n & \text{if } 0 \le w \le \theta \\ 1 & \text{if } w > \theta \end{cases}$$
and
$$f_W(w) = \begin{cases} \dfrac{n w^{n-1}}{\theta^n} & 0 < w < \theta \\ 0 & \text{elsewhere} \end{cases}$$

Bias:
$$E_\theta[W] = \int_0^\theta w f_W(w)\,dw = \int_0^\theta w \frac{n w^{n-1}}{\theta^n}\,dw = \left. \frac{n}{(n+1)\theta^n} w^{n+1} \right|_0^\theta = \frac{n}{n+1}\theta$$
$$\text{Bias} = E_\theta[W] - \theta = \frac{n}{n+1}\theta - \theta = -\frac{\theta}{n+1}$$
There is a bias in $W$, but it might still be consistent.

Consistency: Let $W_n$ denote $W$ for a sample of size $n$. Since $W_n \le \theta$, for any $0 < \epsilon < \theta$,
$$P(|W_n - \theta| \le \epsilon) = P(\theta - \epsilon \le W_n \le \theta) = \int_{\theta - \epsilon}^{\theta} \frac{n w^{n-1}}{\theta^n}\,dw = \left. \frac{w^n}{\theta^n} \right|_{\theta - \epsilon}^{\theta} = 1 - \left(\frac{\theta - \epsilon}{\theta}\right)^n$$
(For $\epsilon \ge \theta$ the probability is 1.) Note that for any such $\epsilon$, it is possible to find an $n$ making $[(\theta - \epsilon)/\theta]^n$ as small as desired. Thus, $\lim_{n \to \infty} P(|W_n - \theta| \le \epsilon) = 1$ and $W_n$ is consistent.
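As a rough numerical check of the uniform example above, the following sketch simulates the maximum of an iid Uniform[0, theta] sample and compares it with the closed-form bias and with the probability used in the consistency argument. It uses only Python's standard library. The function names (simulate_max, theoretical_bias, prob_within), the seed, and the values of theta, n, and the number of replications are illustrative assumptions, not part of the notes.

```python
import random


def simulate_max(theta, n, reps, seed=0):
    """Draw `reps` independent copies of W = max(X_1, ..., X_n),
    where the X_i are iid Uniform[0, theta]."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    return [max(rng.uniform(0.0, theta) for _ in range(n)) for _ in range(reps)]


def theoretical_bias(theta, n):
    """Bias of W from the notes: E[W] - theta = -theta / (n + 1)."""
    return -theta / (n + 1)


def prob_within(theta, n, eps):
    """P(|W_n - theta| <= eps) = 1 - ((theta - eps) / theta)^n for 0 < eps < theta.
    For eps >= theta the event is certain because 0 <= W_n <= theta."""
    if eps >= theta:
        return 1.0
    return 1.0 - ((theta - eps) / theta) ** n


if __name__ == "__main__":
    theta, reps = 2.0, 20000  # hypothetical values chosen for illustration
    for n in (5, 50, 500):
        draws = simulate_max(theta, n, reps)
        empirical_bias = sum(draws) / reps - theta
        print(n, empirical_bias, theoretical_bias(theta, n), prob_within(theta, n, 0.1))
```

Running the script for increasing n should show the simulated bias tracking -theta/(n+1) toward zero while the probability of landing within 0.1 of theta approaches 1, which is the behavior the bias and consistency calculations describe.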
