Hypothesis testing

Some general concepts:

Null hypothesis H0: a statement we "wish" to refute.
Alternative hypothesis H1: the whole or part of the complement of H0.

Common case: the statement is about an unknown parameter θ:
  H0: θ ∈ ω    H1: θ ∈ Ω − ω (= Ω \ ω)
where ω is a well-defined subset of the parameter space Ω.

Simple hypothesis: ω (or Ω − ω) contains only one point (one single value).
Composite hypothesis: the opposite of a simple hypothesis.

Critical region (rejection region): a subset C of the sample space for the random sample X = (X1, …, Xn) such that we reject H0 if X ∈ C (and accept, better phrased: do not reject, H0 otherwise). The complement of C, i.e. C̄, will be referred to as the acceptance region. C is usually defined in terms of a statistic T(X), called the test statistic.

Simple null and alternative hypotheses

Errors in hypothesis testing:
  Type I error: rejecting a true H0
  Type II error: accepting a false H0

Significance level α: the probability of a Type I error; also referred to as the size of the test, or the risk level.
Risk of Type II error β: the probability of a Type II error.
Power: the probability of rejecting a false H0, i.e. the probability of the complement of a Type II error, = 1 − β.

Writing it more "mathematically":
  α = Pr(X ∈ C | H0)
  β = Pr(X ∉ C | H1)
  Power = Pr(X ∈ C | H1) = 1 − Pr(X ∉ C | H1) = 1 − β

Classical approach: fix α and then find a test that makes β desirably small. A low value of α does not imply a low value of β; rather the contrary.

Most powerful test: a test that minimizes β for a fixed value of α is called a most powerful test (or best test) of size α.

Neyman–Pearson lemma: let x = (x1, …, xn) be a random sample from a distribution with p.d.f. f(x; θ). We wish to test H0: θ = θ0 (simple hypothesis) versus H1: θ = θ1 (simple hypothesis). Then the most powerful test of size α has a critical region of the form
  L(θ1; x) / L(θ0; x) ≥ A
where A is some non-negative constant.

Proof: see the course book. Note that both hypotheses are simple.

Example: x = (x1, …, xn) is a random sample from Exp(θ), i.e. with p.d.f.
  f(x; θ) = (1/θ)·e^(−x/θ),  x > 0
Test H0: θ = θ0 vs.
H1: θ = θ1, where θ1 > θ0, with a test of size α.

The likelihood is L(θ; x) = θ^(−n)·e^(−Σxi/θ). The critical region for a most powerful test is

  L(θ1; x) / L(θ0; x) = [θ1^(−n)·e^(−Σxi/θ1)] / [θ0^(−n)·e^(−Σxi/θ0)] = (θ0/θ1)^n · e^((1/θ0 − 1/θ1)·Σxi) ≥ A
  ⇔ (1/θ0 − 1/θ1)·Σxi ≥ ln A + n·ln θ1 − n·ln θ0
  ⇔ Σxi ≥ (ln A + n·ln θ1 − n·ln θ0) / (1/θ0 − 1/θ1) = B    (since θ1 > θ0 implies 1/θ0 − 1/θ1 > 0)

so the test statistic is T(x) = Σxi, and we reject when T(x) ≥ B. If instead θ1 < θ0, the inequality is reversed: reject when Σxi ≤ B. Logical, since E(X) = θ.

How to find B? If θ1 > θ0 then B must satisfy
  Pr(Σ Xi ≥ B; θ0) = α
Use the result that a sum of n independent Exp(θ)-distributed variables is Gamma(n, θ)-distributed, i.e. with T(X) = Σ Xi,
  f_T(t) = t^(n−1)·e^(−t/θ0) / (θ0^n·Γ(n)),  t > 0
and solve ∫_B^∞ f_T(t) dt = α for B (numerically).

If the sample x comes from a distribution belonging to the one-parameter exponential family,
  L(θ; x) = e^( A(θ)·Σ B(xi) + Σ C(xi) + n·D(θ) )
then
  L(θ1; x) / L(θ0; x) = e^( [A(θ1) − A(θ0)]·Σ B(xi) + n·[D(θ1) − D(θ0)] )
If A(θ1) − A(θ0) > 0, the critical region is of the form T(x) = Σ B(xi) ≥ B′.
If A(θ1) − A(θ0) < 0, the critical region is of the form T(x) = Σ B(xi) ≤ B′.

"Pure significance tests"

Assume we wish to test H0: θ = θ0 with a test of size α, and that the test statistic T(x) is observed to the value t.

Case 1: H1: θ > θ0. The P-value is defined as Pr(T(X) ≥ t | H0).
Case 2: H1: θ < θ0. The P-value is defined as Pr(T(X) ≤ t | H0).
If the P-value is less than α, H0 is rejected.
Case 3: H1: θ ≠ θ0. The P-value is defined as the probability that T(X) is at least as extreme as the observed value, where "extreme" now includes both directions from H0.

In general: suppose we just have a null hypothesis H0, which could specify
• the value of a parameter (like above)
• a particular distribution
• independence between two or more variables
• …
What is important is that H0 specifies something under which calculations are feasible. Given a test statistic T = t, the P-value is defined as Pr(T is as extreme as t | H0).

Uniformly most powerful tests (UMP)

Generalizations of some concepts to composite (null and) alternative hypotheses:
  H0: θ ∈ ω    H1: θ ∈ Ω − ω (= Ω \ ω)
Power function: β(θ) = Pr(X ∈ C; θ), i.e. a function of θ.
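The numerical solution for B sketched above can be carried out with elementary tools: for integer n, the Gamma(n, θ) (Erlang) survival function has the closed form Pr(T > t) = e^(−t/θ)·Σ_{k=0}^{n−1} (t/θ)^k / k!, so B can be found by bisection. A minimal sketch, not from the text; the function names and the choice n = 10, θ0 = 1, α = 0.05 are illustrative:

```python
import math

def erlang_sf(t, n, theta):
    """Survival function Pr(T > t) for T = sum of n i.i.d. Exp(theta) variables,
    i.e. T ~ Gamma(n, theta) with integer shape n (an Erlang distribution):
    Pr(T > t) = exp(-t/theta) * sum_{k=0}^{n-1} (t/theta)^k / k!"""
    x = t / theta
    return math.exp(-x) * sum(x**k / math.factorial(k) for k in range(n))

def critical_value(n, theta0, alpha, hi=1e6, tol=1e-10):
    """Solve Pr(sum X_i > B; theta0) = alpha for B by bisection;
    the survival function is strictly decreasing in B."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if erlang_sf(mid, n, theta0) > alpha:
            lo = mid  # survival probability still too large: B lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Size-0.05 most powerful test of H0: theta = 1 vs. H1: theta = theta1 > 1,
# with n = 10 observations: reject when sum(x_i) >= B.
B = critical_value(n=10, theta0=1.0, alpha=0.05)
```

With these illustrative numbers B ≈ 15.7, which agrees with χ²_{0.05, 20}/2, since 2T/θ0 is χ²_{2n}-distributed for exponential samples.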
Size: α = sup_{θ ∈ ω} β(θ).

A test of size α is said to be uniformly most powerful (UMP) if
  β(θ) ≥ β*(θ) for all θ ∈ Ω − ω
where β*(θ) is the power function of any other test of size α.

If H0 is simple but H1 is composite, and we have found a best test (Neyman–Pearson) for H0 vs. H1': θ = θ1, where θ1 ∈ Ω − ω, then if this best test takes the same form for all θ1 ∈ Ω − ω, the test is UMP.

Univariate cases:
  H0: θ = θ0 vs. H1: θ > θ0 (or H1: θ < θ0): usually a UMP test is found.
  H0: θ = θ0 vs. H1: θ ≠ θ0: usually no UMP test is found.

Unbiased test: a test is said to be unbiased if β(θ) ≥ α for all θ ∈ Ω − ω.
Similar test: a test is said to be similar if β(θ) = α for all θ ∈ ω.
Invariant test: assume that the hypotheses of a test are unchanged if a transformation of the sample data is applied. If the critical region is not changed by this transformation, the test is said to be invariant.
Consistent test: suppose the test depends on the sample size n, so that β(θ) = βn(θ). If lim_{n→∞} βn(θ) = 1, the test is said to be consistent.
Efficiency: consider two tests of the pair of simple hypotheses H0 and H1. If n1 and n2 are the minimum sample sizes for test 1 and test 2, respectively, to achieve size α and power β, then the relative efficiency of test 1 vs. test 2 is defined as n2 / n1.

(Maximum) Likelihood Ratio Tests

Consider again that we wish to test H0: θ ∈ ω vs. H1: θ ∈ Ω − ω. The maximum likelihood ratio test (MLRT) is defined as rejecting H0 if
  λ = max_{θ ∈ ω} L(θ; x) / max_{θ ∈ Ω} L(θ; x) ≤ A
• 0 ≤ λ ≤ 1
• For a simple H0, the MLRT gives a UMP test
• The MLRT is asymptotically most powerful unbiased
• The MLRT is asymptotically similar
• The MLRT is asymptotically efficient

If H0 is simple, i.e. H0: θ = θ0, the MLRT simplifies to
  λ = L(θ0; x) / L(θ̂_ML; x) ≤ A

Example: x = (x1, …, xn) is a random sample from Exp(θ). H0: θ = θ0, H1: θ ≠ θ0.
  L(θ; x) = Π (1/θ)·e^(−xi/θ) = θ^(−n)·e^(−Σxi/θ)
  θ̂_ML = x̄ (according to earlier examples)
  λ = [θ0^(−n)·e^(−Σxi/θ0)] / [x̄^(−n)·e^(−Σxi/x̄)] = (x̄/θ0)^n·e^(n − Σxi/θ0) = (x̄/θ0)^n·e^(−n(x̄/θ0 − 1))
  ln λ = n·ln x̄ − n·ln θ0 − n(x̄/θ0 − 1)
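As a quick numerical check of the ratio just derived, λ can be computed directly from a sample; note that λ = 1 exactly when x̄ = θ0, and 0 < λ < 1 otherwise. A minimal sketch, not from the text; the function name and sample values are illustrative:

```python
import math

def mlrt_lambda(xs, theta0):
    """Likelihood ratio for H0: theta = theta0 with a sample from Exp(theta):
    lambda = L(theta0; x) / L(theta_hat; x) = (xbar/theta0)^n * exp(-n*(xbar/theta0 - 1)),
    using the MLE theta_hat = xbar."""
    n = len(xs)
    xbar = sum(xs) / n
    r = xbar / theta0
    return r**n * math.exp(-n * (r - 1))

# lambda equals 1 when xbar = theta0 and is strictly below 1 otherwise
lam_null = mlrt_lambda([1.0, 2.0, 3.0], theta0=2.0)  # xbar = 2 = theta0
lam_alt  = mlrt_lambda([1.0, 2.0, 3.0], theta0=1.0)  # xbar = 2 != theta0
```

The function r^n·e^(−n(r − 1)) of r = x̄/θ0 attains its maximum 1 at r = 1, which is why small values of λ are the evidence against H0.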
Sampling distribution of λ: sometimes λ has a well-defined sampling distribution; e.g., the MLRT can be shown to reduce to the ordinary t-test when the sample comes from a normal distribution with unknown variance and H0: μ = μ0. Often, though, this is not the case.

Asymptotic result: under H0 it can be shown that −2 ln λ is asymptotically χ²-distributed with d degrees of freedom, where d is the difference in the number of estimated parameters (including "nuisance" parameters) between max_{θ ∈ Ω} L(θ; x) and max_{θ ∈ ω} L(θ; x).

Example, Exp(θ) cont.:
  ln λ = n·ln x̄ − n·ln θ0 − n(x̄/θ0 − 1)
  d = 1, as we estimate 0 parameters in the numerator of λ and 1 parameter (θ) in the denominator.
  −2 ln λ = −2n·ln x̄ + 2n·ln θ0 + 2n(x̄/θ0 − 1) = 2n(x̄/θ0 − 1 − ln(x̄/θ0))
which is asymptotically χ²₁-distributed when θ = θ0 (i.e. when H0 is true).

Score tests

Test of H0: θ = θ0 vs. H1: θ ≠ θ0, θ ∈ Ω \ {θ0}.
Test statistic:
  ψ = u^T(θ0)·I(θ0)^(−1)·u(θ0)
where u(θ) = (∂l/∂θ1, …, ∂l/∂θk)^T is the score vector and I(θ) the information matrix. Under H0, ψ is asymptotically χ²_k-distributed, and the test is asymptotically equivalent to the corresponding MLRT.

Wald tests

Test of H0: θ = θ0 vs. H1: θ ≠ θ0, θ ∈ Ω \ {θ0}.
Test statistic:
  (θ̂_ML − θ0)^T·I(θ̂_ML)·(θ̂_ML − θ0)
Under H0 this is asymptotically χ²_k-distributed, and the test is asymptotically equivalent to the corresponding MLRT.

Score and Wald tests are particularly used in generalized linear models.

Confidence sets and confidence intervals

Definition: let x be a random sample from a distribution with p.d.f. f(x; θ), where θ is an unknown parameter with parameter space Ω, i.e. θ ∈ Ω. If S_X is a subset of Ω, depending on X, such that
  Pr(S_X ∋ θ) ≥ 1 − α
then S_X is said to be a confidence set for θ with confidence coefficient (level) 1 − α. For a one-dimensional parameter we rather refer to this set as a confidence interval.

Pivotal quantities

A pivotal quantity is a function g of the unknown parameter and the observations in the sample, i.e. g = g(x; θ), whose distribution is known and independent of θ.
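For the running Exp(θ) example, the score and Wald statistics take closed forms: l(θ) = −n·ln θ − Σxi/θ gives u(θ) = n(x̄ − θ)/θ² and I(θ) = n/θ², so the score statistic is n(x̄ − θ0)²/θ0² and the Wald statistic is n(x̄ − θ0)²/x̄². A χ²₁ P-value can be obtained from the identity χ²₁ = Z². A minimal sketch, not from the text; function names and sample values are illustrative:

```python
import math
from statistics import NormalDist

def score_wald_exp(xs, theta0):
    """Score and Wald statistics for H0: theta = theta0, x a sample from Exp(theta).
    With l(theta) = -n*ln(theta) - sum(x)/theta and I(theta) = n/theta^2:
      score: u(theta0)^2 / I(theta0) = n*(xbar - theta0)^2 / theta0^2
      Wald:  (xbar - theta0)^2 * I(xbar) = n*(xbar - theta0)^2 / xbar^2
    where the MLE is theta_hat = xbar."""
    n = len(xs)
    xbar = sum(xs) / n
    score = n * (xbar - theta0) ** 2 / theta0 ** 2
    wald = n * (xbar - theta0) ** 2 / xbar ** 2
    return score, wald

def chi2_1_sf(w):
    """P(chi^2_1 > w), using chi^2_1 = Z^2: equals 2 * P(Z > sqrt(w))."""
    return 2.0 * (1.0 - NormalDist().cdf(math.sqrt(w)))

score, wald = score_wald_exp([0.2, 1.1, 0.7, 2.4], theta0=1.0)
p_score, p_wald = chi2_1_sf(score), chi2_1_sf(wald)
```

Both statistics, and −2 ln λ, agree to first order for large n, which is the asymptotic equivalence mentioned above.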
Examples: x a random sample from N(μ, σ²):
  (X̄ − μ)/(σ/√n) is N(0, 1)-distributed, and thus its distribution is independent of μ and σ².
  (X̄ − μ)/(S/√n) is t_{n−1}-distributed, and thus independent of μ and σ².
  (n − 1)S²/σ² is χ²_{n−1}-distributed, and thus independent of μ and σ².

To obtain a confidence set from a pivotal quantity we write a probability statement as
  Pr(g1 ≤ g(X; θ) ≤ g2) = 1 − α    (1)
For a one-dimensional θ and g monotonic, the probability statement can be rewritten as
  Pr(θ1(X) ≤ θ ≤ θ2(X)) = 1 − α
where the limits θ1(X) and θ2(X) are now random variables, and the resulting observed confidence interval becomes (θ1(x), θ2(x)). For a k-dimensional θ, the transformation of (1) into a confidence set is more complicated, but feasible. In particular, a point estimator of θ is often used to construct the pivotal quantity.

Example: x a random sample from N(μ, σ²), with μ and σ² unknown.

(X̄ − μ)/(S/√n) is t_{n−1}-distributed, so
  Pr(−t_{α/2, n−1} ≤ (X̄ − μ)/(S/√n) ≤ t_{α/2, n−1}) = 1 − α
  ⇔ Pr(X̄ − t_{α/2, n−1}·S/√n ≤ μ ≤ X̄ + t_{α/2, n−1}·S/√n) = 1 − α
Thus θ1(X) = X̄ − t_{α/2, n−1}·S/√n and θ2(X) = X̄ + t_{α/2, n−1}·S/√n, and the observed 1 − α confidence interval for μ is
  (θ1(x), θ2(x)) = (x̄ − t_{α/2, n−1}·s/√n, x̄ + t_{α/2, n−1}·s/√n)

(n − 1)S²/σ² is χ²_{n−1}-distributed, so
  Pr(χ²_{1−α/2, n−1} ≤ (n − 1)S²/σ² ≤ χ²_{α/2, n−1}) = 1 − α
  ⇔ Pr((n − 1)S²/χ²_{α/2, n−1} ≤ σ² ≤ (n − 1)S²/χ²_{1−α/2, n−1}) = 1 − α
and the observed 1 − α confidence interval for σ² is
  ((n − 1)s²/χ²_{α/2, n−1}, (n − 1)s²/χ²_{1−α/2, n−1})

Using the asymptotic normality of MLEs:

One-dimensional parameter θ:
  θ̂_ML ~ approx. N(θ, I(θ)^(−1))  ⇒  (θ̂_ML − θ)/√(I^(−1)) ~ approx. N(0, 1)
  Pr(−z_{α/2} ≤ (θ̂_ML − θ)/√(I^(−1)) ≤ z_{α/2}) ≈ 1 − α
An approximate 1 − α confidence interval for θ is therefore
  (θ̂_ML − z_{α/2}·√(I^(−1)), θ̂_ML + z_{α/2}·√(I^(−1)))
where I is evaluated at θ̂_ML.

k-dimensional parameter θ:
  θ̂ ~ approx. N(θ, I(θ)^(−1))  ⇒  (θ̂ − θ)^T·I(θ)·(θ̂ − θ) ~ approx. χ²_k
so an ellipsoidal confidence set for θ can be constructed.

Construction of confidence intervals from hypothesis tests: assume a test of H0: θ = θ0 vs. H1: θ ≠ θ0 with critical region C(θ0). Then a confidence set for θ with confidence coefficient 1 − α is
  S_X = {θ0 : X ∉ C(θ0)}
i.e. the set of all θ0 for which X falls in the acceptance region.
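The MLE-based approximate interval can be illustrated with the Exp(θ) example used throughout: θ̂_ML = x̄ and I(θ) = n/θ², so I(θ̂)^(−1) = x̄²/n and the interval is x̄ ± z_{α/2}·x̄/√n. A minimal sketch, not from the text; the function name and sample values are illustrative, and z_{α/2} is taken from the standard library's NormalDist:

```python
import math
from statistics import NormalDist

def mle_ci_exp(xs, conf=0.95):
    """Approximate confidence interval for theta, x a sample from Exp(theta),
    based on the asymptotic normality of the MLE:
    theta_hat = xbar, I(theta) = n/theta^2, interval xbar +/- z * xbar/sqrt(n)."""
    n = len(xs)
    xbar = sum(xs) / n
    z = NormalDist().inv_cdf(0.5 + conf / 2.0)  # z_{alpha/2}
    half = z * xbar / math.sqrt(n)
    return xbar - half, xbar + half

lo, hi = mle_ci_exp([2.0, 2.0, 2.0, 2.0], conf=0.95)
```

For small n, the exact pivotal interval based on 2·Σxi/θ being χ²_{2n}-distributed would be preferable; the normal-approximation interval is justified only asymptotically.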