STATISTICAL INFERENCE
PART VI
HYPOTHESIS TESTING

TESTS OF HYPOTHESIS
• A hypothesis is a statement about a population parameter.
• The goal of a hypothesis test is to decide which of two complementary hypotheses is true, based on a sample from a population.

TESTS OF HYPOTHESIS
• STATISTICAL TEST: The statistical procedure used to draw an appropriate conclusion from sample data about a population parameter.
• HYPOTHESIS: Any statement concerning an unknown population parameter.
• Aim of a statistical test: to test a hypothesis concerning the values of one or more population parameters.

NULL AND ALTERNATIVE HYPOTHESIS
• NULL HYPOTHESIS = H0
– E.g., a treatment has no effect, or there is no change compared with the previous situation.
• ALTERNATIVE HYPOTHESIS = HA
– E.g., a treatment has a significant effect, or there is development compared with the previous situation.

TESTS OF HYPOTHESIS
• Sample Space, $A$: Set of all possible values of the sample $x_1, x_2, \dots, x_n$, so that $(x_1, x_2, \dots, x_n) \in A$.
• Parameter Space, $\Omega$: Set of all possible values of the parameters.
$\Omega = \Omega_0 \cup \Omega_1$, where $\Omega_0$ is the parameter space of the null hypothesis and $\Omega_1$ is the parameter space of the alternative hypothesis.
H0: $\theta \in \Omega_0$
H1: $\theta \in \Omega_1$

TESTS OF HYPOTHESIS
• Critical Region, $C$, is the subset of $A$ that leads to rejection of H0.
Reject H0 if $(x_1, x_2, \dots, x_n) \in C$
Do not reject H0 if $(x_1, x_2, \dots, x_n) \in C'$
• A test defines the critical region.
• A test is a rule which leads to a decision to reject or fail to reject H0 on the basis of the sample information.

TEST STATISTIC AND REJECTION REGION
• TEST STATISTIC: The sample statistic on which we base our decision to reject or not reject the null hypothesis.
• REJECTION REGION: Range of values such that, if the test statistic falls in that range, we decide to reject the null hypothesis; otherwise, we do not reject the null hypothesis.

TESTS OF HYPOTHESIS
• If a hypothesis completely specifies the distribution, it is called a simple hypothesis. Otherwise, it is a composite hypothesis.
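• Example: $\theta = (\theta_1, \theta_2)$
H0: $\theta_1 = 3 \Rightarrow f(x; 3, \theta_2)$
H1: $\theta_1 = 5 \Rightarrow f(x; 5, \theta_2)$
With $\theta_2$ unknown, these are composite hypotheses. If $\theta_2$ is known, they are simple hypotheses.

TESTS OF HYPOTHESIS

                     H0 is True                        H0 is False
Reject H0            Type I error                      Correct decision
                     P(Type I error) = $\alpha$        $1-\beta$
Do not reject H0     Correct decision                  Type II error
                     $1-\alpha$                        P(Type II error) = $\beta$

Tests are based on the following principle: fix $\alpha$, minimize $\beta$.
$\pi(\theta)$ = power function of the test, defined for all $\theta \in \Omega$:
$$\pi(\theta) = P(\text{Reject } H_0 \mid \theta) = P\big((x_1, x_2, \dots, x_n) \in C \mid \theta\big)$$

TESTS OF HYPOTHESIS
Type I error = rejecting H0 when H0 is true.
$$\pi(\theta_0) = P(\text{Reject } H_0 \mid H_0 \text{ is true}) = P(\text{Type I error})$$
$$\alpha = \max_{\theta \in \Omega_0} \pi(\theta) = \text{max. prob. of Type I error}$$
$$\pi(\theta_1) = P(\text{Reject } H_0 \mid H_1 \text{ is true}) = 1 - P(\text{Not reject } H_0 \mid H_1 \text{ is true}) = 1 - \beta(\theta_1)$$
$$\beta = \max_{\theta \in \Omega_1} \beta(\theta) = \text{max. prob. of Type II error}$$

PROCEDURE OF STATISTICAL TEST
1. Determining H0 and HA.
2. Choosing the best test statistic.
3. Deciding the rejection region (Decision Rule).
4. Conclusion.

HOW TO DERIVE AN APPROPRIATE TEST
Definition: A test which minimizes the Type II error probability ($\beta$) for a fixed Type I error probability ($\alpha$) is called a most powerful test or best test of size $\alpha$.

MOST POWERFUL TEST (MPT)
H0: $\theta = \theta_0$ (simple hypothesis)
H1: $\theta = \theta_1$ (simple hypothesis)
Reject H0 if $(x_1, x_2, \dots, x_n) \in C$.
The Neyman-Pearson Lemma: the most powerful test of size $\alpha$ has critical region
$$C = \left\{ (x_1, x_2, \dots, x_n) : \lambda = \frac{L(\theta_0)}{L(\theta_1)} \le k \right\}$$
that is, reject H0 if $\lambda \le k$, where $k$ is chosen so that
$$P(\lambda \le k \mid \theta = \theta_0) = \alpha .$$
The corresponding Type II error probability is $\beta = P(\lambda > k \mid \theta = \theta_1)$.
Proof: Available in textbooks (e.g., Bain & Engelhardt, 1992, p. 408).

EXAMPLES
• $X \sim N(\mu, \sigma^2)$ where $\sigma^2$ is known.
H0: $\mu = \mu_0$
H1: $\mu = \mu_1$, where $\mu_0 > \mu_1$.
Find the most powerful test of size $\alpha$.

Solution
$$L(\mu \mid X) = (2\pi\sigma^2)^{-n/2} \exp\left\{ -\frac{1}{2\sigma^2} \sum_{i=1}^{n} (X_i - \mu)^2 \right\}$$
$$\lambda = \frac{L(\mu_0 \mid X)}{L(\mu_1 \mid X)} = \exp\left\{ \frac{1}{2\sigma^2} \sum_{i=1}^{n} \left[ (X_i - \mu_1)^2 - (X_i - \mu_0)^2 \right] \right\} = \exp\left\{ \frac{1}{2\sigma^2} \sum_{i=1}^{n} \left[ 2X_i(\mu_0 - \mu_1) + (\mu_1^2 - \mu_0^2) \right] \right\} \le k$$
Taking logarithms,
$$2n\bar{X}(\mu_0 - \mu_1) + n(\mu_1^2 - \mu_0^2) \le 2\sigma^2 \ln k ,$$
and, since $\mu_0 - \mu_1 > 0$,
$$\bar{X} \le \frac{2\sigma^2 \ln k + n(\mu_0^2 - \mu_1^2)}{2n(\mu_0 - \mu_1)} = c .$$

Solution, cont.
• What is $c$? It is the constant that satisfies
$$\alpha = P(\bar{X} \le c \mid \mu = \mu_0) = P\left( \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \le \frac{c - \mu_0}{\sigma/\sqrt{n}} \right) = P\left( Z \le \frac{c - \mu_0}{\sigma/\sqrt{n}} \right),$$
since $\bar{X} \sim N(\mu_0, \sigma^2/n)$ under H0.
For a pre-specified $\alpha$, the most powerful test says: reject H0 if
$$\bar{X} \le c = \mu_0 - z_\alpha \frac{\sigma}{\sqrt{n}}, \quad \text{i.e.,} \quad \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \le -z_\alpha ,$$
where $z_\alpha$ is the upper-$\alpha$ point of the standard normal distribution.

EXAMPLES
• Example 2: See Bain & Engelhardt, 1992, p. 410.
Find the MPT of H0: $p = p_0$ vs H1: $p = p_1 > p_0$.
• Example 3: See Bain & Engelhardt, 1992, p. 411.
Find the MPT of H0: $X \sim \text{Unif}(0,1)$ vs H1: $X \sim \text{Exp}(1)$.

UNIFORMLY MOST POWERFUL (UMP) TEST
• If a test is most powerful against every possible value in a composite alternative, then it is a UMP test.
• One way of finding a UMPT is to find the MPT by the Neyman-Pearson Lemma for a particular alternative value, and then show that the test does not depend on the specific alternative value.
• Example: $X \sim N(\mu, \sigma^2)$ with $\sigma^2$ known; we reject H0 if
$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \le -z_\alpha .$$
Note that this does not depend on the particular value of $\mu_1$, but only on the fact that $\mu_0 > \mu_1$. So this is a UMPT of H0: $\mu = \mu_0$ vs H1: $\mu < \mu_0$.

UNIFORMLY MOST POWERFUL (UMP) TEST
• To find a UMPT, we can also use the Monotone Likelihood Ratio (MLR).
• If $L = L(\theta_0)/L(\theta_1)$ depends on $(x_1, x_2, \dots, x_n)$ only through the statistic $y = u(x_1, x_2, \dots, x_n)$ and $L$ is an increasing function of $y$ for every given $\theta_0 > \theta_1$, then we have a monotone likelihood ratio (MLR) in the statistic $y$.
• If $L$ is a decreasing function of $y$ for every given $\theta_0 > \theta_1$, then we have a monotone likelihood ratio (MLR) in the statistic $-y$.

UNIFORMLY MOST POWERFUL (UMP) TEST
• Theorem: If a joint pdf $f(x_1, x_2, \dots, x_n; \theta)$ has MLR in the statistic $Y$, then a UMP test of size $\alpha$
• for H0: $\theta \le \theta_0$ vs H1: $\theta > \theta_0$ is to reject H0 if $Y \ge c$, where $P(Y \ge c \mid \theta_0) = \alpha$.
• for H0: $\theta \ge \theta_0$ vs H1: $\theta < \theta_0$ is to reject H0 if $Y \le c$, where $P(Y \le c \mid \theta_0) = \alpha$.

EXAMPLE
• $X \sim \text{Exp}(\theta)$
H0: $\theta \le \theta_0$
H1: $\theta > \theta_0$
Find the UMPT of size $\alpha$.

EXAMPLE
• $X_i \sim \text{Poi}(\lambda)$, $i = 1, 2, \dots, n$
Determine whether $(X_1, \dots, X_n)$ has the MLR property.
Find a UMP level $\alpha$ test for testing H0: $\lambda = \lambda_0$ versus H1: $\lambda < \lambda_0$.

GENERALIZED LIKELIHOOD RATIO TEST (GLRT)
• The GLRT is the generalization of the MPT and provides a desirable test in many applications, but it is not necessarily a UMP test.

GENERALIZED LIKELIHOOD RATIO TEST (GLRT)
H0: $\theta \in \Omega_0$
H1: $\theta \in \Omega_1$
For a random sample (r.s.),
$$L(\theta) = f(x_1, x_2, \dots, x_n; \theta) = f(x_1; \theta)\, f(x_2; \theta) \cdots f(x_n; \theta)$$
Let
$$L(\hat{\theta}) = \max_{\theta \in \Omega} L(\theta), \quad \hat{\theta} = \text{MLE of } \theta$$
and
$$L(\hat{\theta}_0) = \max_{\theta \in \Omega_0} L(\theta), \quad \hat{\theta}_0 = \text{MLE of } \theta \text{ under } H_0 .$$

GENERALIZED LIKELIHOOD RATIO TEST (GLRT)
The Generalized Likelihood Ratio:
$$\lambda = \frac{L(\hat{\theta}_0)}{L(\hat{\theta})}$$
GLRT: Reject H0 if $\lambda \le \lambda_0$.

EXAMPLE
• $X \sim N(\mu, \sigma^2)$
H0: $\mu = \mu_0$
H1: $\mu \ne \mu_0$
Derive the GLRT of size $\alpha$.

EXAMPLE
• Let $X_1, \dots, X_n$ be independent r.v.s, each with the shifted exponential p.d.f.
$$f(x \mid \theta, \lambda) = \frac{1}{\lambda} \exp\{ -(x - \theta)/\lambda \}\, I_{[\theta, \infty)}(x)$$
where $\lambda$ is known. Find the LRT to test H0: $\theta = \theta_0$ versus H1: $\theta > \theta_0$.

ASYMPTOTIC DISTRIBUTION OF $-2\ln\lambda$
• GLRT: Reject H0 if $\lambda \le \lambda_0$.
• Equivalently, reject H0 if $-2\ln\lambda \ge -2\ln\lambda_0 = c$.
Under H0,
$$-2\ln\lambda \;\overset{\text{asympt.}}{\sim}\; \chi^2_k$$
where $k$ is the number of parameters to be tested.
Reject H0 if $-2\ln\lambda > \chi^2_{\alpha, k}$, the upper-$\alpha$ point of the $\chi^2_k$ distribution.

TWO SAMPLE TESTS
$X \sim \text{Bin}(n_1, p_1)$, r.s.
$Y \sim \text{Bin}(n_2, p_2)$, r.s.
H0: $p_1 = p_2 = p_0$
H1: $p_1 \ne p_2$
Derive the GLRT of size $\alpha$, where $X$ and $Y$ are independent and $p_0$, $p_1$ and $p_2$ are unknown.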
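
The two-sample binomial problem above can be checked numerically. The following is a minimal sketch, not the derivation the slide asks for: it assumes the large-sample result that $-2\ln\lambda$ is approximately $\chi^2_1$ under H0 (one restriction, $p_1 = p_2$), so the test has size $\alpha$ only approximately. The MLEs used are $\hat{p}_1 = x/n_1$, $\hat{p}_2 = y/n_2$ and, under H0, $\hat{p}_0 = (x+y)/(n_1+n_2)$. The function names are hypothetical and only the Python standard library is used.

```python
import math


def _xlogy(x: float, y: float) -> float:
    """Return x * log(y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def binom_loglik(x: int, n: int, p: float) -> float:
    """Binomial log-likelihood without the binomial coefficient,
    which cancels in the likelihood ratio."""
    return _xlogy(x, p) + _xlogy(n - x, 1.0 - p)


def two_sample_binomial_glrt(x: int, n1: int, y: int, n2: int, alpha: float = 0.05):
    """Approximate GLRT of H0: p1 = p2 against H1: p1 != p2.

    Returns (-2 ln lambda, asymptotic p-value, reject H0?).
    Assumption: -2 ln lambda is treated as chi-square with 1 d.f. under H0.
    """
    p1_hat = x / n1                 # unrestricted MLE of p1
    p2_hat = y / n2                 # unrestricted MLE of p2
    p0_hat = (x + y) / (n1 + n2)    # MLE of the common p under H0

    loglik_null = binom_loglik(x, n1, p0_hat) + binom_loglik(y, n2, p0_hat)
    loglik_full = binom_loglik(x, n1, p1_hat) + binom_loglik(y, n2, p2_hat)

    # -2 ln lambda; clamp tiny negative values caused by rounding.
    stat = max(0.0, -2.0 * (loglik_null - loglik_full))

    # For 1 d.f.: P(chi2_1 > t) = P(|Z| > sqrt(t)) = erfc(sqrt(t / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))

    # p_value < alpha is equivalent to stat > chi2_{alpha, 1}.
    return stat, p_value, p_value < alpha
```

For instance, `two_sample_binomial_glrt(5, 50, 25, 50)` gives a statistic of roughly 20, far above $\chi^2_{0.05,1} \approx 3.84$, so H0 is rejected at the 5% level, whereas equal sample proportions give a statistic of 0 and no rejection. For small samples the $\chi^2$ approximation may be poor, and an exact critical value would be needed for a test of exact size $\alpha$.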