Hypothesis testing

Some general concepts:

Null hypothesis H0: a statement we "wish" to refute.
Alternative hypothesis H1: the whole or part of the complement of H0.

Common case: the statement is about an unknown parameter θ:
  H0: θ ∈ ω    H1: θ ∈ Ω − ω (= Ω \ ω)
where ω is a well-defined subset of the parameter space Ω.

Simple hypothesis: ω (or Ω − ω) contains only one point (one single value).
Composite hypothesis: the opposite of a simple hypothesis.

Critical region (rejection region): a subset C of the sample space for the random sample X = (X1, …, Xn) such that we reject H0 if X ∈ C (and accept, better phrased: do not reject, H0 otherwise). The complement of C, i.e. C̄, will be referred to as the acceptance region. C is usually defined in terms of a statistic T(X), called the test statistic.

Simple null and alternative hypotheses

Errors in hypothesis testing:
  Type I error: rejecting a true H0
  Type II error: accepting a false H0

Significance level α: the probability of a Type I error; also referred to as the size of the test, or the risk level.
Risk of Type II error β: the probability of a Type II error.
Power: the probability of rejecting a false H0, i.e. the probability of the complement of a Type II error, = 1 − β.

Writing it more "mathematically":
  α = Pr(X ∈ C | H0)
  β = Pr(X ∉ C | H1)
  Power = Pr(X ∈ C | H1) = 1 − Pr(X ∉ C | H1) = 1 − β

Classical approach: fix α and then find a test that makes β desirably small. A low value of α does not imply a low value of β; rather the contrary.

Most powerful test: a test that minimizes β for a fixed value of α is called a most powerful test (or best test) of size α.

Neyman–Pearson lemma: let x = (x1, …, xn) be a random sample from a distribution with p.d.f. f(x; θ). We wish to test H0: θ = θ0 (simple hypothesis) versus H1: θ = θ1 (simple hypothesis). Then the most powerful test of size α has a critical region of the form
  L(θ1; x) / L(θ0; x) ≥ A
where A is some non-negative constant.

Proof: see the course book. Note that both hypotheses are simple.

Example: x = (x1, …, xn) is a random sample from Exp(θ), i.e. with p.d.f.
  f(x; θ) = (1/θ)·e^(−x/θ),  x > 0
Test H0: θ = θ0 vs.
H1: θ = θ1, where θ1 > θ0, with a test of size α.

The likelihood is L(θ; x) = θ^(−n)·e^(−Σxi/θ). The critical region for a most powerful test is

  L(θ1; x) / L(θ0; x) = [θ1^(−n)·e^(−Σxi/θ1)] / [θ0^(−n)·e^(−Σxi/θ0)] = (θ0/θ1)^n · e^((1/θ0 − 1/θ1)·Σxi) ≥ A
  ⇔ (1/θ0 − 1/θ1)·Σxi ≥ ln A + n·ln θ1 − n·ln θ0
  ⇔ Σxi ≥ (ln A + n·ln θ1 − n·ln θ0) / (1/θ0 − 1/θ1) = B    (since θ1 > θ0 implies 1/θ0 − 1/θ1 > 0)

so the test statistic is T(x) = Σxi, and we reject when T(x) ≥ B. If instead θ1 < θ0, the inequality is reversed: reject when Σxi ≤ B. Logical, since E(X) = θ.

How to find B? If θ1 > θ0 then B must satisfy
  Pr(Σ Xi ≥ B; θ0) = α
Use the result that a sum of n independent Exp(θ)-distributed variables is Gamma(n, θ)-distributed, i.e. with T(X) = Σ Xi,
  f_T(t) = t^(n−1)·e^(−t/θ0) / (θ0^n·Γ(n)),  t > 0
and solve ∫_B^∞ f_T(t) dt = α for B (numerically).

If the sample x comes from a distribution belonging to the one-parameter exponential family,
  L(θ; x) = e^( A(θ)·Σ B(xi) + Σ C(xi) + n·D(θ) )
then
  L(θ1; x) / L(θ0; x) = e^( [A(θ1) − A(θ0)]·Σ B(xi) + n·[D(θ1) − D(θ0)] )
If A(θ1) − A(θ0) > 0, the critical region is of the form T(x) = Σ B(xi) ≥ B′.
If A(θ1) − A(θ0) < 0, the critical region is of the form T(x) = Σ B(xi) ≤ B′.

"Pure significance tests"

Assume we wish to test H0: θ = θ0 with a test of size α, and that the test statistic T(x) is observed to the value t.

Case 1: H1: θ > θ0. The P-value is defined as Pr(T(X) ≥ t | H0).
Case 2: H1: θ < θ0. The P-value is defined as Pr(T(X) ≤ t | H0).
If the P-value is less than α, H0 is rejected.
Case 3: H1: θ ≠ θ0. The P-value is defined as the probability that T(X) is at least as extreme as the observed value, where "extreme" now includes both directions from H0.

In general: suppose we just have a null hypothesis H0, which could specify
• the value of a parameter (like above)
• a particular distribution
• independence between two or more variables
• …
What is important is that H0 specifies something under which calculations are feasible. Given a test statistic T = t, the P-value is defined as Pr(T is as extreme as t | H0).

Uniformly most powerful tests (UMP)

Generalizations of some concepts to composite (null and) alternative hypotheses:
  H0: θ ∈ ω    H1: θ ∈ Ω − ω (= Ω \ ω)
Power function: β(θ) = Pr(X ∈ C; θ), i.e. a function of θ.
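The numerical solution for B sketched above can be carried out with elementary tools: for integer n, the Gamma(n, θ) (Erlang) survival function has the closed form Pr(T > t) = e^(−t/θ)·Σ_{k=0}^{n−1} (t/θ)^k / k!, so B can be found by bisection. A minimal sketch, not from the text; the function names and the choice n = 10, θ0 = 1, α = 0.05 are illustrative:

```python
import math

def erlang_sf(t, n, theta):
    """Survival function Pr(T > t) for T = sum of n i.i.d. Exp(theta) variables,
    i.e. T ~ Gamma(n, theta) with integer shape n (an Erlang distribution):
    Pr(T > t) = exp(-t/theta) * sum_{k=0}^{n-1} (t/theta)^k / k!"""
    x = t / theta
    return math.exp(-x) * sum(x**k / math.factorial(k) for k in range(n))

def critical_value(n, theta0, alpha, hi=1e6, tol=1e-10):
    """Solve Pr(sum X_i > B; theta0) = alpha for B by bisection;
    the survival function is strictly decreasing in B."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if erlang_sf(mid, n, theta0) > alpha:
            lo = mid  # survival probability still too large: B lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Size-0.05 most powerful test of H0: theta = 1 vs. H1: theta = theta1 > 1,
# with n = 10 observations: reject when sum(x_i) >= B.
B = critical_value(n=10, theta0=1.0, alpha=0.05)
```

With these illustrative numbers B ≈ 15.7, which agrees with χ²_{0.05, 20}/2, since 2T/θ0 is χ²_{2n}-distributed for exponential samples.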
Size: α = sup_{θ ∈ ω} β(θ).

A test of size α is said to be uniformly most powerful (UMP) if
  β(θ) ≥ β*(θ) for all θ ∈ Ω − ω
where β*(θ) is the power function of any other test of size α.

If H0 is simple but H1 is composite, and we have found a best test (Neyman–Pearson) for H0 vs. H1': θ = θ1, where θ1 ∈ Ω − ω, then if this best test takes the same form for all θ1 ∈ Ω − ω, the test is UMP.

Univariate cases:
  H0: θ = θ0 vs. H1: θ > θ0 (or H1: θ < θ0): usually a UMP test is found.
  H0: θ = θ0 vs. H1: θ ≠ θ0: usually no UMP test is found.

Unbiased test: a test is said to be unbiased if β(θ) ≥ α for all θ ∈ Ω − ω.
Similar test: a test is said to be similar if β(θ) = α for all θ ∈ ω.
Invariant test: assume that the hypotheses of a test are unchanged if a transformation of the sample data is applied. If the critical region is not changed by this transformation, the test is said to be invariant.
Consistent test: suppose the test depends on the sample size n, so that β(θ) = βn(θ). If lim_{n→∞} βn(θ) = 1, the test is said to be consistent.
Efficiency: consider two tests of the pair of simple hypotheses H0 and H1. If n1 and n2 are the minimum sample sizes for test 1 and test 2, respectively, to achieve size α and power β, then the relative efficiency of test 1 vs. test 2 is defined as n2 / n1.

(Maximum) Likelihood Ratio Tests

Consider again that we wish to test H0: θ ∈ ω vs. H1: θ ∈ Ω − ω. The maximum likelihood ratio test (MLRT) is defined as rejecting H0 if
  λ = max_{θ ∈ ω} L(θ; x) / max_{θ ∈ Ω} L(θ; x) ≤ A
• 0 ≤ λ ≤ 1
• For a simple H0, the MLRT gives a UMP test
• The MLRT is asymptotically most powerful unbiased
• The MLRT is asymptotically similar
• The MLRT is asymptotically efficient

If H0 is simple, i.e. H0: θ = θ0, the MLRT simplifies to
  λ = L(θ0; x) / L(θ̂_ML; x) ≤ A

Example: x = (x1, …, xn) is a random sample from Exp(θ). H0: θ = θ0, H1: θ ≠ θ0.
  L(θ; x) = Π (1/θ)·e^(−xi/θ) = θ^(−n)·e^(−Σxi/θ)
  θ̂_ML = x̄ (according to earlier examples)
  λ = [θ0^(−n)·e^(−Σxi/θ0)] / [x̄^(−n)·e^(−Σxi/x̄)] = (x̄/θ0)^n·e^(n − Σxi/θ0) = (x̄/θ0)^n·e^(−n(x̄/θ0 − 1))
  ln λ = n·ln x̄ − n·ln θ0 − n(x̄/θ0 − 1)
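As a quick numerical check of the ratio just derived, λ can be computed directly from a sample; note that λ = 1 exactly when x̄ = θ0, and 0 < λ < 1 otherwise. A minimal sketch, not from the text; the function name and sample values are illustrative:

```python
import math

def mlrt_lambda(xs, theta0):
    """Likelihood ratio for H0: theta = theta0 with a sample from Exp(theta):
    lambda = L(theta0; x) / L(theta_hat; x) = (xbar/theta0)^n * exp(-n*(xbar/theta0 - 1)),
    using the MLE theta_hat = xbar."""
    n = len(xs)
    xbar = sum(xs) / n
    r = xbar / theta0
    return r**n * math.exp(-n * (r - 1))

# lambda equals 1 when xbar = theta0 and is strictly below 1 otherwise
lam_null = mlrt_lambda([1.0, 2.0, 3.0], theta0=2.0)  # xbar = 2 = theta0
lam_alt  = mlrt_lambda([1.0, 2.0, 3.0], theta0=1.0)  # xbar = 2 != theta0
```

The function r^n·e^(−n(r − 1)) of r = x̄/θ0 attains its maximum 1 at r = 1, which is why small values of λ are the evidence against H0.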
Sampling distribution of λ: sometimes λ has a well-defined sampling distribution; e.g., the MLRT can be shown to reduce to the ordinary t-test when the sample comes from a normal distribution with unknown variance and H0: μ = μ0. Often, though, this is not the case.

Asymptotic result: under H0 it can be shown that −2 ln λ is asymptotically χ²-distributed with d degrees of freedom, where d is the difference in the number of estimated parameters (including "nuisance" parameters) between max_{θ ∈ Ω} L(θ; x) and max_{θ ∈ ω} L(θ; x).

Example, Exp(θ) cont.:
  ln λ = n·ln x̄ − n·ln θ0 − n(x̄/θ0 − 1)
  d = 1, as we estimate 0 parameters in the numerator of λ and 1 parameter (θ) in the denominator.
  −2 ln λ = −2n·ln x̄ + 2n·ln θ0 + 2n(x̄/θ0 − 1) = 2n(x̄/θ0 − 1 − ln(x̄/θ0))
which is asymptotically χ²₁-distributed when θ = θ0 (i.e. when H0 is true).

Score tests

Test of H0: θ = θ0 vs. H1: θ ≠ θ0, θ ∈ Ω \ {θ0}.
Test statistic:
  ψ = u^T(θ0)·I(θ0)^(−1)·u(θ0)
where u(θ) = (∂l/∂θ1, …, ∂l/∂θk)^T is the score vector and I(θ) the information matrix. Under H0, ψ is asymptotically χ²_k-distributed, and the test is asymptotically equivalent to the corresponding MLRT.

Wald tests

Test of H0: θ = θ0 vs. H1: θ ≠ θ0, θ ∈ Ω \ {θ0}.
Test statistic:
  (θ̂_ML − θ0)^T·I(θ̂_ML)·(θ̂_ML − θ0)
Under H0 this is asymptotically χ²_k-distributed, and the test is asymptotically equivalent to the corresponding MLRT.

Score and Wald tests are particularly used in generalized linear models.

Confidence sets and confidence intervals

Definition: let x be a random sample from a distribution with p.d.f. f(x; θ), where θ is an unknown parameter with parameter space Ω, i.e. θ ∈ Ω. If S_X is a subset of Ω, depending on X, such that
  Pr(S_X ∋ θ) ≥ 1 − α
then S_X is said to be a confidence set for θ with confidence coefficient (level) 1 − α. For a one-dimensional parameter we rather refer to this set as a confidence interval.

Pivotal quantities

A pivotal quantity is a function g of the unknown parameter and the observations in the sample, i.e. g = g(x; θ), whose distribution is known and independent of θ.
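For the running Exp(θ) example, the score and Wald statistics take closed forms: l(θ) = −n·ln θ − Σxi/θ gives u(θ) = n(x̄ − θ)/θ² and I(θ) = n/θ², so the score statistic is n(x̄ − θ0)²/θ0² and the Wald statistic is n(x̄ − θ0)²/x̄². A χ²₁ P-value can be obtained from the identity χ²₁ = Z². A minimal sketch, not from the text; function names and sample values are illustrative:

```python
import math
from statistics import NormalDist

def score_wald_exp(xs, theta0):
    """Score and Wald statistics for H0: theta = theta0, x a sample from Exp(theta).
    With l(theta) = -n*ln(theta) - sum(x)/theta and I(theta) = n/theta^2:
      score: u(theta0)^2 / I(theta0) = n*(xbar - theta0)^2 / theta0^2
      Wald:  (xbar - theta0)^2 * I(xbar) = n*(xbar - theta0)^2 / xbar^2
    where the MLE is theta_hat = xbar."""
    n = len(xs)
    xbar = sum(xs) / n
    score = n * (xbar - theta0) ** 2 / theta0 ** 2
    wald = n * (xbar - theta0) ** 2 / xbar ** 2
    return score, wald

def chi2_1_sf(w):
    """P(chi^2_1 > w), using chi^2_1 = Z^2: equals 2 * P(Z > sqrt(w))."""
    return 2.0 * (1.0 - NormalDist().cdf(math.sqrt(w)))

score, wald = score_wald_exp([0.2, 1.1, 0.7, 2.4], theta0=1.0)
p_score, p_wald = chi2_1_sf(score), chi2_1_sf(wald)
```

Both statistics, and −2 ln λ, agree to first order for large n, which is the asymptotic equivalence mentioned above.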
Examples: x a random sample from N(μ, σ²):
  (X̄ − μ)/(σ/√n) is N(0, 1)-distributed, and thus its distribution is independent of μ and σ².
  (X̄ − μ)/(S/√n) is t_{n−1}-distributed, and thus independent of μ and σ².
  (n − 1)S²/σ² is χ²_{n−1}-distributed, and thus independent of μ and σ².

To obtain a confidence set from a pivotal quantity we write a probability statement as
  Pr(g1 ≤ g(X; θ) ≤ g2) = 1 − α    (1)
For a one-dimensional θ and g monotonic, the probability statement can be rewritten as
  Pr(θ1(X) ≤ θ ≤ θ2(X)) = 1 − α
where the limits θ1(X) and θ2(X) are now random variables, and the resulting observed confidence interval becomes (θ1(x), θ2(x)). For a k-dimensional θ, the transformation of (1) into a confidence set is more complicated, but feasible. In particular, a point estimator of θ is often used to construct the pivotal quantity.

Example: x a random sample from N(μ, σ²), with μ and σ² unknown.

(X̄ − μ)/(S/√n) is t_{n−1}-distributed, so
  Pr(−t_{α/2, n−1} ≤ (X̄ − μ)/(S/√n) ≤ t_{α/2, n−1}) = 1 − α
  ⇔ Pr(X̄ − t_{α/2, n−1}·S/√n ≤ μ ≤ X̄ + t_{α/2, n−1}·S/√n) = 1 − α
Thus θ1(X) = X̄ − t_{α/2, n−1}·S/√n and θ2(X) = X̄ + t_{α/2, n−1}·S/√n, and the observed 1 − α confidence interval for μ is
  (θ1(x), θ2(x)) = (x̄ − t_{α/2, n−1}·s/√n, x̄ + t_{α/2, n−1}·s/√n)

(n − 1)S²/σ² is χ²_{n−1}-distributed, so
  Pr(χ²_{1−α/2, n−1} ≤ (n − 1)S²/σ² ≤ χ²_{α/2, n−1}) = 1 − α
  ⇔ Pr((n − 1)S²/χ²_{α/2, n−1} ≤ σ² ≤ (n − 1)S²/χ²_{1−α/2, n−1}) = 1 − α
and the observed 1 − α confidence interval for σ² is
  ((n − 1)s²/χ²_{α/2, n−1}, (n − 1)s²/χ²_{1−α/2, n−1})

Using the asymptotic normality of MLEs:

One-dimensional parameter θ:
  θ̂_ML ~ approx. N(θ, I(θ)^(−1))  ⇒  (θ̂_ML − θ)/√(I^(−1)) ~ approx. N(0, 1)
  Pr(−z_{α/2} ≤ (θ̂_ML − θ)/√(I^(−1)) ≤ z_{α/2}) ≈ 1 − α
An approximate 1 − α confidence interval for θ is therefore
  (θ̂_ML − z_{α/2}·√(I^(−1)), θ̂_ML + z_{α/2}·√(I^(−1)))
where I is evaluated at θ̂_ML.

k-dimensional parameter θ:
  θ̂ ~ approx. N(θ, I(θ)^(−1))  ⇒  (θ̂ − θ)^T·I(θ)·(θ̂ − θ) ~ approx. χ²_k
so an ellipsoidal confidence set for θ can be constructed.

Construction of confidence intervals from hypothesis tests: assume a test of H0: θ = θ0 vs. H1: θ ≠ θ0 with critical region C(θ0). Then a confidence set for θ with confidence coefficient 1 − α is
  S_X = {θ0 : X ∉ C(θ0)}
i.e. the set of all θ0 for which X falls in the acceptance region.
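The MLE-based approximate interval can be illustrated with the Exp(θ) example used throughout: θ̂_ML = x̄ and I(θ) = n/θ², so I(θ̂)^(−1) = x̄²/n and the interval is x̄ ± z_{α/2}·x̄/√n. A minimal sketch, not from the text; the function name and sample values are illustrative, and z_{α/2} is taken from the standard library's NormalDist:

```python
import math
from statistics import NormalDist

def mle_ci_exp(xs, conf=0.95):
    """Approximate confidence interval for theta, x a sample from Exp(theta),
    based on the asymptotic normality of the MLE:
    theta_hat = xbar, I(theta) = n/theta^2, interval xbar +/- z * xbar/sqrt(n)."""
    n = len(xs)
    xbar = sum(xs) / n
    z = NormalDist().inv_cdf(0.5 + conf / 2.0)  # z_{alpha/2}
    half = z * xbar / math.sqrt(n)
    return xbar - half, xbar + half

lo, hi = mle_ci_exp([2.0, 2.0, 2.0, 2.0], conf=0.95)
```

For small n, the exact pivotal interval based on 2·Σxi/θ being χ²_{2n}-distributed would be preferable; the normal-approximation interval is justified only asymptotically.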