13.3 Hypothesis Test: Multinomial Population

Motivating example:

Objective: we want to determine whether a new product from company C has changed the market shares.

$p_A$: market share for company A
$p_B$: market share for company B
$p_C$: market share for company C

We want to test

\[ H_0: p_A = 0.3,\ p_B = 0.5,\ p_C = 0.2 \quad \text{vs.} \quad H_a: \text{the population proportions are not } p_A = 0.3,\ p_B = 0.5,\ p_C = 0.2 \]

with $\alpha = 0.05$.

We also have the following information:

total sample size: $n = 200$
observed number (company A): $f_1 = 48$
observed number (company B): $f_2 = 98$
observed number (company C): $f_3 = 54$

\[ f_1 + f_2 + f_3 = 48 + 98 + 54 = 200 = n \]

In addition, if $H_0$ is true, the expected numbers of each company's products are

expected number (company A): $e_1 = n p_A = 200 \times 0.3 = 60$
expected number (company B): $e_2 = n p_B = 200 \times 0.5 = 100$
expected number (company C): $e_3 = n p_C = 200 \times 0.2 = 40$

Intuitively, if the differences between $f_i$ and $e_i$, $i = 1, 2, 3$, are small, the observed numbers are close to the numbers expected under $H_0$, which suggests that $H_0$ is true. On the other hand, if the differences between $f_i$ and $e_i$, $i = 1, 2, 3$, are large, $H_0$ might not be true: the expected numbers under $H_0$ would then differ significantly from the "true" expected numbers, which produces the difference between the observed numbers and the expected numbers.

The following statistic can be used to reflect the difference between the observed numbers and the expected numbers:

\[ \chi^2 = \sum_{i=1}^{3} \frac{(f_i - e_i)^2}{e_i} = \frac{(f_1 - e_1)^2}{e_1} + \frac{(f_2 - e_2)^2}{e_2} + \frac{(f_3 - e_3)^2}{e_3} = \frac{(48 - 60)^2}{60} + \frac{(98 - 100)^2}{100} + \frac{(54 - 40)^2}{40} = 7.34 \]

General Case:

Suppose there are $k$ populations (categories). We want to test

\[ H_0: p_1 = a_1,\ p_2 = a_2,\ \ldots,\ p_k = a_k \quad \text{vs.} \quad H_a: \text{the population proportions are not } p_1 = a_1,\ p_2 = a_2,\ \ldots,\ p_k = a_k \]

where

\[ \sum_{i=1}^{k} a_i = a_1 + a_2 + \cdots + a_k = 1. \]

We also have the following information:

total sample size: $n_T$
observed numbers: $f_i$, $i = 1, 2, \ldots, k$.
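In addition, if $H_0$ is true, the expected numbers are

expected numbers: $e_i = n_T a_i$, $i = 1, 2, \ldots, k$

The test statistic is

\[ \chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i} = \frac{(f_1 - e_1)^2}{e_1} + \frac{(f_2 - e_2)^2}{e_2} + \cdots + \frac{(f_k - e_k)^2}{e_k} \]

Next question: how large must $\chi^2$ be to reject $H_0$?

Chi-Square Distribution:

$\chi^2_n$: the random variable distributed as chi-square distribution with $n$ degrees of freedom.

Example:

\[ P(\chi^2_{15} > x) = 0.1 \Rightarrow x = 22.3072 \]
\[ P(\chi^2_{9} > x) = 0.9 \Rightarrow x = 4.168 \]
\[ P(\chi^2_{3} > x) = 0.05 \Rightarrow x = 7.815 \]

Chi-Square Test:

Let

\[ \chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i} = \frac{(f_1 - e_1)^2}{e_1} + \frac{(f_2 - e_2)^2}{e_2} + \cdots + \frac{(f_k - e_k)^2}{e_k} \]

The chi-square test with level of significance $\alpha$ for

\[ H_0: p_1 = a_1,\ p_2 = a_2,\ \ldots,\ p_k = a_k \quad \text{vs.} \quad H_a: \text{the population proportions are not } p_1 = a_1,\ p_2 = a_2,\ \ldots,\ p_k = a_k \]

is to

reject $H_0$: $\chi^2 > \chi^2_{k-1,\alpha}$
not reject $H_0$: $\chi^2 \le \chi^2_{k-1,\alpha}$,

where $\chi^2_{k-1,\alpha}$ can be obtained from $P(\chi^2_{k-1} > \chi^2_{k-1,\alpha}) = \alpha$. In addition,

\[ p\text{-value} = P(\chi^2_{k-1} > \chi^2). \]

Note: If $H_0$ is true, the random variable with sample value $\chi^2$ is distributed (approximately, for large samples) as $\chi^2_{k-1}$.

$\chi^2$: the sample statistic
$\chi^2_{k-1}$: the random variable distributed as chi-square distribution with $k-1$ degrees of freedom and sample value $\chi^2$
$\chi^2_{k-1,\alpha}$: the critical value satisfying $P(\chi^2_{k-1} > \chi^2_{k-1,\alpha}) = \alpha$

Motivating Example (continued):

Since $k = 3$ and

\[ \chi^2 = 7.34 > 5.99 = \chi^2_{2,0.05} = \chi^2_{k-1,\alpha}, \]

we reject $H_0$.

Example:

The following data are the frequencies of the outcomes of throwing a die 120 times:

Point        1    2    3    4    5    6
Frequency   13   24   18   22   19   24

Please test if the die is fair (i.e., $H_0: p_1 = p_2 = \cdots = p_6 = 1/6$) with $\alpha = 0.05$.

[solution:]

\[ e_i = 120 \times \frac{1}{6} = 20, \quad i = 1, 2, \ldots, 6. \]

Then,

\[ \chi^2 = \sum_{i=1}^{6} \frac{(f_i - e_i)^2}{e_i} = \frac{(13 - 20)^2}{20} + \frac{(24 - 20)^2}{20} + \frac{(18 - 20)^2}{20} + \frac{(22 - 20)^2}{20} + \frac{(19 - 20)^2}{20} + \frac{(24 - 20)^2}{20} = 4.5 \]

Since

\[ \chi^2 = 4.5 < 11.0705 = \chi^2_{5,0.05} = \chi^2_{k-1,\alpha}, \]

we do not reject $H_0$.

Example:

The following table gives the number of students for each number of wrong answers.

Number of wrong answers     0    1    2    3
Number of students         21   31   12    0

Suppose $X$ is the random variable representing the number of wrong answers. Please test whether $X$ is distributed as Binomial(3, 0.25) with $\alpha = 0.05$. (Note: the distribution function for Binomial(3, 0.25) is

\[ f(x) = \binom{3}{x} 0.25^{x}\, 0.75^{3-x}, \quad x = 0, 1, 2, 3.) \]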
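[solution:]

If $H_0$ is true, the distribution for the number of wrong answers is

\[ p_1 = P(X = 0) = \binom{3}{0} 0.25^{0}\, 0.75^{3} = \frac{27}{64}, \]
\[ p_2 = P(X = 1) = \binom{3}{1} 0.25^{1}\, 0.75^{2} = \frac{27}{64}, \]
\[ p_3 = P(X = 2) = \binom{3}{2} 0.25^{2}\, 0.75^{1} = \frac{9}{64}, \]
\[ p_4 = P(X = 3) = \binom{3}{3} 0.25^{3}\, 0.75^{0} = \frac{1}{64}. \]

Since the sample size is $n = 21 + 31 + 12 + 0 = 64$, the expected numbers under $H_0$ are

\[ e_1 = n p_1 = 64 \times \frac{27}{64} = 27, \quad e_2 = n p_2 = 64 \times \frac{27}{64} = 27, \]
\[ e_3 = n p_3 = 64 \times \frac{9}{64} = 9, \quad e_4 = n p_4 = 64 \times \frac{1}{64} = 1. \]

Therefore,

\[ \chi^2 = \sum_{i=1}^{4} \frac{(f_i - e_i)^2}{e_i} = \frac{(21 - 27)^2}{27} + \frac{(31 - 27)^2}{27} + \frac{(12 - 9)^2}{9} + \frac{(0 - 1)^2}{1} \approx 3.93 \]

Since

\[ \chi^2 \approx 3.93 < 7.81 = \chi^2_{3,0.05} = \chi^2_{k-1,\alpha}, \]

we do not reject $H_0$.

Online Exercise:
Exercise 13.3.1
Exercise 13.3.2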
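The three worked examples in this section (market shares, the die, and the Binomial(3, 0.25) fit) can be checked numerically. The following is a minimal sketch using only the Python standard library, not part of the original notes. The function names `chi_square_statistic`, `chi_square_sf`, and `goodness_of_fit` are hypothetical. The upper-tail probability is computed with a standard recurrence for the incomplete gamma function, which is an assumption added here in place of a chi-square table.

```python
import math


def chi_square_statistic(observed, probs):
    """Return sum((f_i - e_i)^2 / e_i), where e_i = n_T * a_i."""
    n_total = sum(observed)
    expected = [n_total * a for a in probs]
    return sum((f - e) ** 2 / e for f, e in zip(observed, expected))


def chi_square_sf(stat, df):
    """Upper-tail probability P(chi2_df > stat) for integer df >= 1.

    Assumption (not from the notes): uses the recurrence
    Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, started from
    Q(1, x) = exp(-x) for even df or Q(1/2, x) = erfc(sqrt(x)) for odd df.
    """
    x = stat / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))
    while a < df / 2.0:
        q += x ** a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return q


def goodness_of_fit(observed, probs, alpha=0.05):
    """Return (statistic, degrees of freedom, p-value, reject H0?)."""
    stat = chi_square_statistic(observed, probs)
    df = len(observed) - 1
    p_value = chi_square_sf(stat, df)
    # Rejecting when p-value < alpha is equivalent to stat > critical value.
    return stat, df, p_value, p_value < alpha


if __name__ == "__main__":
    # Motivating example: market shares of companies A, B, C.
    print(goodness_of_fit([48, 98, 54], [0.3, 0.5, 0.2]))
    # Die example: six equally likely outcomes.
    print(goodness_of_fit([13, 24, 18, 22, 19, 24], [1 / 6] * 6))
    # Binomial(3, 0.25) example: probabilities 27/64, 27/64, 9/64, 1/64.
    binom = [math.comb(3, x) * 0.25 ** x * 0.75 ** (3 - x) for x in range(4)]
    print(goodness_of_fit([21, 31, 12, 0], binom))
```

The returned statistics should agree with the hand computations above (7.34, 4.5, and about 3.93), and the reject/not-reject decisions should match the comparisons against the tabulated critical values.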