12.2 Hypothesis testing

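12.2 Hypothesis Test about the Difference between Two Population Means

I. Large Sample Case ($n_1 \ge 30$, $n_2 \ge 30$):

Motivating Example:

Objective: we want to test whether the mean scores in two training centers are different.

$\mu_1$: the mean score in the first training center.
$\mu_2$: the mean score in the second training center.

We want to test

$$H_0: \mu_1 = \mu_2 \quad \text{vs.} \quad H_a: \mu_1 \ne \mu_2 \qquad (H_0: \mu_1 - \mu_2 = 0 \quad \text{vs.} \quad H_a: \mu_1 - \mu_2 \ne 0)$$

with $\alpha = 0.05$. In addition,

$$n_1 = 30,\quad n_2 = 40,\quad \bar{x}_1 = 82.5,\quad \bar{x}_2 = 78,\quad \sigma_1^2 = 8^2,\quad \sigma_2^2 = 10^2.$$

A sensible statistical procedure would be

reject $H_0$: $|\bar{x}_1 - \bar{x}_2| > c$
not reject $H_0$: $|\bar{x}_1 - \bar{x}_2| \le c$

where $c$ is some constant.

Next question: how do we determine the value of $c$?
Answer: control $\alpha$ (the probability of making a type I error) to determine the value of $c$.

When $H_0$ is true ($\mu_1 = \mu_2$),

$$\bar{X}_1 - \bar{X}_2 \sim N\left(0,\ \sigma^2_{\bar{X}_1 - \bar{X}_2}\right), \qquad \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.$$

Then,

$$\alpha = \text{the probability of wrongly rejecting } H_0 = P(H_0 \text{ is true but is rejected}) = P\left(|\bar{X}_1 - \bar{X}_2| > c;\ \mu_1 = \mu_2\right)$$

$$= P\left(\left|\frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}}\right| > \frac{c}{\sigma_{\bar{X}_1 - \bar{X}_2}}\right) = P\left(|Z| > \frac{c}{\sigma_{\bar{X}_1 - \bar{X}_2}}\right)$$

$$\Rightarrow \frac{c}{\sigma_{\bar{X}_1 - \bar{X}_2}} = z_{\alpha/2} \quad \Rightarrow \quad c = \sigma_{\bar{X}_1 - \bar{X}_2}\, z_{\alpha/2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\; z_{\alpha/2}.$$

Thus,

reject $H_0$: $|\bar{x}_1 - \bar{x}_2| > \sigma_{\bar{X}_1 - \bar{X}_2}\, z_{\alpha/2}$
not reject $H_0$: $|\bar{x}_1 - \bar{x}_2| \le \sigma_{\bar{X}_1 - \bar{X}_2}\, z_{\alpha/2}$

is a sensible statistical procedure. Furthermore, denote

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}.$$

Thus, by dividing both sides by $\sigma_{\bar{X}_1 - \bar{X}_2}$, the above procedure can be simplified to

reject $H_0$: $|z| > z_{\alpha/2}$
not reject $H_0$: $|z| \le z_{\alpha/2}$

In addition,

p-value = the probability of making a type I error when $H_0$ is rejected at the observed $\bar{x}_1 - \bar{x}_2$, computed under $\mu_1 = \mu_2$

$$= P\left(|\bar{X}_1 - \bar{X}_2| > |\bar{x}_1 - \bar{x}_2|;\ \mu_1 = \mu_2\right) = P\left(\left|\frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}}\right| > \left|\frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}}\right|;\ \mu_1 = \mu_2\right) = P(|Z| > |z|).$$

Therefore, in this example,

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}} = \frac{82.5 - 78}{\sqrt{\dfrac{8^2}{30} + \dfrac{10^2}{40}}} = 2.09 \quad \Rightarrow \quad |z| = 2.09 > 1.96 = z_{0.025} = z_{\alpha/2}.$$

Thus, we reject $H_0$. Also,

$$\text{p-value} = P(|Z| > |z|) = P(|Z| > 2.09) = 0.0366 < 0.05 = \alpha,$$

so we also reject $H_0$ based on the p-value.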
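The arithmetic of the training-center example can be checked with a short script. This is only a sketch using the Python standard library: the function name `two_sample_z` and its parameter names (including `delta0` for the hypothesized difference) are illustrative choices, not notation from the notes.

```python
from math import sqrt
from statistics import NormalDist  # standard normal cdf and quantiles (Python 3.8+)


def two_sample_z(x1, x2, var1, var2, n1, n2, delta0=0.0):
    """Return (z, two-sided p-value) for H0: mu1 - mu2 = delta0.

    Hypothetical helper for illustration; var1 and var2 are the known
    population variances (or s1**2, s2**2 when they are unknown).
    """
    se = sqrt(var1 / n1 + var2 / n2)             # sigma of X1bar - X2bar
    z = (x1 - x2 - delta0) / se                  # standardized difference
    p = 2 * (1 - NormalDist().cdf(abs(z)))       # P(|Z| > |z|)
    return z, p


# Training-center example: n1 = 30, n2 = 40, sigma1 = 8, sigma2 = 10.
z, p = two_sample_z(82.5, 78, 8**2, 10**2, 30, 40)
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)      # z_{alpha/2} for alpha = 0.05

print(f"z = {z:.2f}, z_crit = {z_crit:.2f}, p-value = {p:.4f}")
print("reject H0" if abs(z) > z_crit else "do not reject H0")
```

Both decision rules (comparing $|z|$ with $z_{\alpha/2}$ and comparing the p-value with $\alpha$) necessarily agree, since they are two forms of the same test.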
General Case: for $n_1 \ge 30$, $n_2 \ge 30$ and significance level $\alpha$.

When $\sigma_1, \sigma_2$ are known, let

$$z = \frac{\bar{x}_1 - \bar{x}_2 - \delta_0}{\sigma_{\bar{X}_1 - \bar{X}_2}} = \frac{\bar{x}_1 - \bar{x}_2 - \delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}.$$

(I): $H_0: \mu_1 - \mu_2 \ge \delta_0$ vs. $H_a: \mu_1 - \mu_2 < \delta_0$

Then,
reject $H_0$: $z < -z_\alpha$
not reject $H_0$: $z \ge -z_\alpha$
In addition, p-value $= P(Z < z)$.

(II): $H_0: \mu_1 - \mu_2 \le \delta_0$ vs. $H_a: \mu_1 - \mu_2 > \delta_0$

Then,
reject $H_0$: $z > z_\alpha$
not reject $H_0$: $z \le z_\alpha$
In addition, p-value $= P(Z > z)$.

(III): $H_0: \mu_1 - \mu_2 = \delta_0$ vs. $H_a: \mu_1 - \mu_2 \ne \delta_0$

Then,
reject $H_0$: $|z| > z_{\alpha/2}$
not reject $H_0$: $|z| \le z_{\alpha/2}$
In addition, p-value $= P(|Z| > |z|)$.

When $\sigma_1, \sigma_2$ are unknown, let

$$z = \frac{\bar{x}_1 - \bar{x}_2 - \delta_0}{s_{\bar{X}_1 - \bar{X}_2}} = \frac{\bar{x}_1 - \bar{x}_2 - \delta_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}.$$

(I): $H_0: \mu_1 - \mu_2 \ge \delta_0$ vs. $H_a: \mu_1 - \mu_2 < \delta_0$

Then,
reject $H_0$: $z < -z_\alpha$
not reject $H_0$: $z \ge -z_\alpha$
In addition, p-value $= P(Z < z)$.

(II): $H_0: \mu_1 - \mu_2 \le \delta_0$ vs. $H_a: \mu_1 - \mu_2 > \delta_0$

Then,
reject $H_0$: $z > z_\alpha$
not reject $H_0$: $z \le z_\alpha$
In addition, p-value $= P(Z > z)$.

(III): $H_0: \mu_1 - \mu_2 = \delta_0$ vs. $H_a: \mu_1 - \mu_2 \ne \delta_0$

Then,
reject $H_0$: $|z| > z_{\alpha/2}$
not reject $H_0$: $|z| \le z_{\alpha/2}$
In addition, p-value $= P(|Z| > |z|)$.

Example:

Consider the following results for two samples randomly taken from two populations.

                      Sample 1   Sample 2
Sample size              64         49
Mean                   1150        921
Standard deviation       90         65

Let $\mu_1$ and $\mu_2$ be the population means.

(a) For $\alpha = 0.05$, test $H_0: \mu_1 - \mu_2 \le 200$ using the classical hypothesis test.
(b) For $\alpha = 0.01$, use the p-value to test $H_0: \mu_1 - \mu_2 \le 200$.
(c) For $\alpha = 0.05$, use the confidence interval method to test the hypothesis $H_0: \mu_1 - \mu_2 = 200$.

[solution:]

(a) $\bar{x}_1 = 1150$, $\bar{x}_2 = 921$, $s_1 = 90$, $s_2 = 65$, $n_1 = 64$, $n_2 = 49$, $\delta_0 = 200$, $\alpha = 0.05$. Then,

$$z = \frac{\bar{x}_1 - \bar{x}_2 - \delta_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{1150 - 921 - 200}{\sqrt{\dfrac{90^2}{64} + \dfrac{65^2}{49}}} = 1.99 > z_\alpha = z_{0.05} = 1.645.$$

Therefore, we reject $H_0$.

(b) p-value $= P(Z > z) = P(Z > 1.99) = 0.0233 > \alpha = 0.01$.

Therefore, we do not reject $H_0$.
(c) A 95% confidence interval for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (1150 - 921) \pm z_{0.025} \sqrt{\frac{90^2}{64} + \frac{65^2}{49}} = 229 \pm 1.96 \times 14.587 = [200.41,\ 257.59].$$

Since $200 \notin [200.41,\ 257.59]$, we reject $H_0$.

II. Small Sample Case ($n_1 < 30$, $n_2 < 30$):

Similar to 11.1, two assumptions are made:

1. Both populations have a normal distribution.
2. The variances of the populations are equal ($\sigma_1^2 = \sigma_2^2 = \sigma^2$).

Motivating Example:

Objective: we want to test whether the mean project-completion time using the new software package is shorter than the mean time using the current technology.

$\mu_1$: the mean project-completion time using the current technology.
$\mu_2$: the mean project-completion time using the new software package.

We want to test

$$H_0: \mu_1 \le \mu_2 \quad \text{vs.} \quad H_a: \mu_1 > \mu_2 \qquad (H_0: \mu_1 - \mu_2 \le 0 \quad \text{vs.} \quad H_a: \mu_1 - \mu_2 > 0)$$

with $\alpha = 0.05$. In addition,

$$n_1 = 12,\quad n_2 = 12,\quad \bar{x}_1 = 325,\quad \bar{x}_2 = 288,\quad s_1 = 40,\quad s_2 = 44.$$

Thus,

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{11 \times 40^2 + 11 \times 44^2}{12 + 12 - 2} = 1768.$$

A sensible statistical procedure would be

reject $H_0$: $\bar{x}_1 - \bar{x}_2 > c$
not reject $H_0$: $\bar{x}_1 - \bar{x}_2 \le c$

where $c$ is some constant. The above statistical procedure is equivalent to the following statistical test:

reject $H_0$: $t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_{\bar{X}_1 - \bar{X}_2}} > \dfrac{c}{s_{\bar{X}_1 - \bar{X}_2}} = c^*$
not reject $H_0$: $t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_{\bar{X}_1 - \bar{X}_2}} \le \dfrac{c}{s_{\bar{X}_1 - \bar{X}_2}} = c^*$

When $H_0$ is true,

$$\frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \sim T(n_1 + n_2 - 2),$$

where $S_p^2$ is the sample statistic with possible values $s_p^2$. Then, taking $\mu_1 = \mu_2$,

$$\alpha = 0.05 = \text{the probability of wrongly rejecting } H_0 = P(H_0 \text{ is true but is rejected})$$

$$= P\left(\frac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} > c^*\right) = P\left(T(n_1 + n_2 - 2) > c^*\right)$$

$$\Rightarrow c^* = t_{n_1 + n_2 - 2,\ \alpha} = t_{n_1 + n_2 - 2,\ 0.05}.$$
Thus,

reject $H_0$: $t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_{\bar{X}_1 - \bar{X}_2}} = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} > t_{n_1 + n_2 - 2,\ \alpha} = t_{n_1 + n_2 - 2,\ 0.05}$
not reject $H_0$: $t \le t_{n_1 + n_2 - 2,\ \alpha} = t_{n_1 + n_2 - 2,\ 0.05}$

is a sensible statistical procedure. In addition,

p-value = the probability of making a type I error when $H_0$ is rejected at the observed $\bar{x}_1 - \bar{x}_2$, computed under $\mu_1 = \mu_2$

$$= P\left(\frac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} > \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}};\ \mu_1 = \mu_2\right) = P\left(T(n_1 + n_2 - 2) > t\right).$$

Therefore, in this example,

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{325 - 288}{\sqrt{1768 \left(\dfrac{1}{12} + \dfrac{1}{12}\right)}} = 2.16 > 1.717 = t_{22,\ 0.05} = t_{n_1 + n_2 - 2,\ \alpha}.$$

Thus, we reject $H_0$. Also,

$$\text{p-value} = P\left(T(n_1 + n_2 - 2) > t\right) = P\left(T(22) > 2.16\right) = 0.0209 < 0.05 = \alpha,$$

so we also reject $H_0$ based on the p-value.

General Case: for $n_1 < 30$, $n_2 < 30$ and significance level $\alpha$, let

$$t = \frac{\bar{x}_1 - \bar{x}_2 - \delta_0}{s_{\bar{X}_1 - \bar{X}_2}} = \frac{\bar{x}_1 - \bar{x}_2 - \delta_0}{\sqrt{s_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}.$$

(I): $H_0: \mu_1 - \mu_2 \ge \delta_0$ vs. $H_a: \mu_1 - \mu_2 < \delta_0$

Then,
reject $H_0$: $t < -t_{n_1 + n_2 - 2,\ \alpha}$
not reject $H_0$: $t \ge -t_{n_1 + n_2 - 2,\ \alpha}$
In addition, p-value $= P\left(T(n_1 + n_2 - 2) < t\right)$.

(II): $H_0: \mu_1 - \mu_2 \le \delta_0$ vs. $H_a: \mu_1 - \mu_2 > \delta_0$

Then,
reject $H_0$: $t > t_{n_1 + n_2 - 2,\ \alpha}$
not reject $H_0$: $t \le t_{n_1 + n_2 - 2,\ \alpha}$
In addition, p-value $= P\left(T(n_1 + n_2 - 2) > t\right)$.

(III): $H_0: \mu_1 - \mu_2 = \delta_0$ vs. $H_a: \mu_1 - \mu_2 \ne \delta_0$

Then,
reject $H_0$: $|t| > t_{n_1 + n_2 - 2,\ \alpha/2}$
not reject $H_0$: $|t| \le t_{n_1 + n_2 - 2,\ \alpha/2}$
In addition, p-value $= P\left(|T(n_1 + n_2 - 2)| > |t|\right)$.

Example:

Consider the following results for two samples randomly taken from two normal populations with equal variance.

                      Sample 1   Sample 2
Sample size              10         12
Mean                     48         44
Standard deviation        9          8

(a) Test $H_0: \mu_1 - \mu_2 = -3$ vs. $H_a: \mu_1 - \mu_2 \ne -3$ at $\alpha = 0.1$ using the classical hypothesis test.
(b) Test $H_0: \mu_1 - \mu_2 = -4$ vs. $H_a: \mu_1 - \mu_2 \ne -4$ at $\alpha = 0.05$ using the p-value.
(c) Test $H_0: \mu_1 - \mu_2 = -3$ vs. $H_a: \mu_1 - \mu_2 \ne -3$ at $\alpha = 0.05$ using the confidence interval method.
(d) At 95% confidence, how many observations would have to be taken to provide an interval of length 6, given equal sample sizes in the two populations?

[solution:]

(a) $n_1 = 10$, $n_2 = 12$, $\bar{x}_1 = 48$, $\bar{x}_2 = 44$, $s_1 = 9$, $s_2 = 8$, $\delta_0 = -3$. Then,

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{9 \times 9^2 + 11 \times 8^2}{10 + 12 - 2} = 71.65.$$

Thus,

$$|t| = \left|\frac{\bar{x}_1 - \bar{x}_2 - \delta_0}{\sqrt{s_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}\right| = \left|\frac{48 - 44 + 3}{\sqrt{71.65 \left(\dfrac{1}{10} + \dfrac{1}{12}\right)}}\right| = 1.93 > t_{n_1 + n_2 - 2,\ \alpha/2} = t_{20,\ 0.05} = 1.7247.$$

Therefore, we reject $H_0$.

(b) $\delta_0 = -4$.

$$\text{p-value} = P\left(|T(n_1 + n_2 - 2)| > |t|\right) = P\left(|T(n_1 + n_2 - 2)| > \left|\frac{\bar{x}_1 - \bar{x}_2 - \delta_0}{\sqrt{s_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}\right|\right)$$

$$= P\left(|T(20)| > \left|\frac{48 - 44 + 4}{\sqrt{71.65 \left(\dfrac{1}{10} + \dfrac{1}{12}\right)}}\right|\right) = P\left(|T(20)| > 2.207\right) < P\left(|T(20)| > 2.086\right) = 0.05.$$

Therefore, we reject $H_0$.

(c) A 95% confidence interval for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm t_{n_1 + n_2 - 2,\ \alpha/2} \sqrt{s_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = (48 - 44) \pm t_{20,\ 0.025} \sqrt{71.65 \left(\frac{1}{10} + \frac{1}{12}\right)} = 4 \pm 2.086 \times 3.6243 = [-3.56,\ 11.56].$$

Since $-3 \in [-3.56,\ 11.56]$, we do not reject $H_0$.

(d) Since the sample sizes are large and equal ($n_1 = n_2 = n$) in the two populations, the $(1 - \alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{s_1^2}{n} + \frac{s_2^2}{n}}.$$

The length of the confidence interval is

$$2 z_{\alpha/2} \sqrt{\frac{s_1^2}{n} + \frac{s_2^2}{n}}.$$

Therefore,

$$2 z_{0.025} \sqrt{\frac{s_1^2}{n} + \frac{s_2^2}{n}} = 2 \times 1.96 \sqrt{\frac{9^2}{n} + \frac{8^2}{n}} = 6 \quad \Rightarrow \quad 1.96 \sqrt{\frac{9^2 + 8^2}{n}} = 3 \quad \Rightarrow \quad n = \frac{1.96^2 \left(9^2 + 8^2\right)}{3^2} = 61.89.$$

In general,

$$n = \frac{z_{\alpha/2}^2 \left(s_1^2 + s_2^2\right)}{E^2}, \qquad E: \text{the margin of error.}$$

Therefore, $n = 62$, and a total of 124 observations need to be taken.

Online Exercise:
Exercise 12.2.1
Exercise 12.2.2
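The small-sample motivating example and the sample-size calculation in part (d) can be reproduced numerically. The following is a minimal sketch using only the Python standard library; the helper names `pooled_t` and `n_per_sample` are hypothetical, and because the standard library has no t distribution, the critical value 1.717 is simply copied from the notes rather than computed.

```python
from math import ceil, sqrt
from statistics import NormalDist


def pooled_t(x1, x2, s1, s2, n1, n2, delta0=0.0):
    """Return (pooled variance, t statistic) under the equal-variance assumption.

    Hypothetical helper; delta0 is the hypothesized value of mu1 - mu2.
    """
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    se = sqrt(sp2 * (1 / n1 + 1 / n2))           # estimated sd of X1bar - X2bar
    return sp2, (x1 - x2 - delta0) / se


def n_per_sample(s1, s2, margin, conf=0.95):
    """Per-sample size n (with n1 = n2 = n) so the z interval has half-width `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    return ceil(z**2 * (s1**2 + s2**2) / margin**2)


# Software example: H0: mu1 - mu2 <= 0 vs. Ha: mu1 - mu2 > 0, alpha = 0.05.
sp2, t = pooled_t(325, 288, 40, 44, 12, 12)
T_CRIT_22_005 = 1.717                             # t_{22, 0.05}, tabulated value from the notes
print(f"s_p^2 = {sp2:.0f}, t = {t:.2f}, reject H0: {t > T_CRIT_22_005}")

# Part (d): interval length 6 means margin of error E = 3.
n = n_per_sample(9, 8, margin=3)
print(f"n per sample = {n}, total = {2 * n}")
```

Rounding the required sample size up with `ceil` matches the step from 61.89 to 62 in the solution, since a smaller $n$ would give an interval longer than 6.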
