AMS 572 Lecture Notes #11 October 9, 2013 Ch. 9. Categorical Data Analysis Inference on Two Population Proportions: 𝐩𝟏 , 𝐩𝟐 Independent samples, both are large e.g. Suppose we wish to compare the proportions of smokers among male and female students in SBU. Two large independent samples: Population 1: p1 Population 2: p2 Sample 1: n1 , x1 , n1 x1 Sample 2: n2 , x2 , n2 x2 ( x1 5, n1 x1 5 , x2 5, n2 x2 5 ) ① Point estimator: p ̂1 ② By CLT, − p̂2 = pˆ1 ~ N ( p1 , X1 n1 − X2 n2 p1 (1 p1 ) ) n1 pˆ 2 ~ N ( p2 , p2 (1 p2 ) ) n2 Two samples are independent p ̂1 − p̂2 ~̇N (p1 − p2 , p1 (1−p1 ) n1 + p2 (1−p2 ) n2 ) ③ P.Q. for p1 p2 : Z= p̂1 − p̂2 − (p1 − p2 ) p (1 − p1 ) p2 (1 − p2 ) √ 1 + n1 n2 ~̇ N(0,1) Not a P.Q. 1 Z∗ = p̂1 − p̂2 − (p1 − p2 ) p̂ (1 − p̂1 ) p̂2 (1 − p̂2 ) √ 1 + n1 n2 ~̇ N(0,1) Yes, this is a P.Q. ④ 100(1-)% large samples CI for p1 p2 : 1 − α = P (−Zα ≤ Z ∗ ≤ Zα ) = P(p̂1 − p̂2 − Zα/2 ∗ S ≤ p1 − p2 ≤ p̂1 − p̂2 + Zα/2 ∗ S) 2 2 ̂ 1 (1−p ̂1 ) p Here, S = √ n1 + ̂2 (1−p ̂2 ) p n2 ⑤ Test H 0 : p1 p2 General case: H a : p1 p2 Test statistic: Z0 = p̂1 − p̂2 − ∆ H0 N(0,1) p̂1 (1 − p̂1 ) p̂2 (1 − p̂2 ) ~̇ √ + n1 n2 p − value = P(Z0 ≥ z0 |H0 : p1 − p2 = ∆) At the significance level , we reject H 0 if Z0 Z or equivalently, if p-value < α. When =0, one often uses the following test statistic Z0 = p̂1 − p̂2 − 0 1 1 √p̂(1 − p̂) ( + ) n1 n2 H0 N(0,1) ~̇ Here we use the pooled proportion under the null hypothesis in the denominator: p̂ = n1 p̂1 + n2 p̂2 X1 + X2 = n1 + n2 n1 + n2 2 Example 1. A random sample of Democrats and a random sample of Republicans were polled on an issue. Of 200 Republicans, 90 would vote yes on the issue; of 100 democrats, 58 would vote yes. Let p1 and p2 denote respectively the proportions of all Democrats or all Republicans who would vote yes on this issue. (a) Construct a 95% confidence interval for (p1 - p2) (b) Can we say that more Democrats than Republicans favor the issue at the 1% level of significance? Please report the p-value. (c) Please write up the entire SAS program necessary to answer question raised in (b). Please include the data step. Solution: 58 0.58, n1 100, x1 58, n1 x1 42. (a) Democrats: pˆ1 100 90 0.45, n2 200, x2 90, n2 x2 110. Republicans: pˆ 2 200 The 100(1-α)% confidence interval for (p1 - p2) is pˆ 1 pˆ 2 Z 2 pˆ 1 1 pˆ 1 pˆ 2 1 pˆ 2 , pˆ 1 pˆ 2 Z n1 n2 2 pˆ 1 1 pˆ 1 pˆ 2 1 pˆ 2 n1 n2 After plugging in Z0.025 = 1.96 etc., we found the 95% CI to be [0.01, 0.25] x1 x2 58 90 . n1 n2 100 200 p̂1 − p̂2 − 0 0.58 − 0.45 Z0 = = ≈ 2.12 1 1 1 1 √p̂(1 − p̂) ( + ) √0.49(1 − 0.49) ( n1 n2 100 + 200) (b) Hypotheses are H 0 : p1 p2 v.s H a : p1 p2 . pˆ p − value = P(Z0 ≥ z0 |H0 : p1 − p2 = ∆) = 0.017 > 0.01. We cannot reject H 0 at 0.01 . Therefore, we can not say more Democrats favor the issue than the Republicans at the 1% significance level. (c) SAS code: Data Poll; Input Party $ outcome $ count; Datalines; Republican yes 90 Republican no 110 3 Democrats yes 58 Democrats no 42 ; Run; Proc freq data=poll; Tables party*outcome/chisq; Weight count; Run; Output: The SAS System The FREQ Procedure Table of Party by outcome Party outcome Frequency‚ Percent ‚ Row Pct ‚ Col Pct ‚no ‚yes ‚ Total ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Democrat ‚ 42 ‚ 58 ‚ 100 ‚ 14.00 ‚ 19.33 ‚ 33.33 ‚ 42.00 ‚ 58.00 ‚ ‚ 27.63 ‚ 39.19 ‚ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Republic ‚ 110 ‚ 90 ‚ 200 ‚ 36.67 ‚ 30.00 ‚ 66.67 ‚ 55.00 ‚ 45.00 ‚ ‚ 72.37 ‚ 60.81 ‚ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Total 152 148 300 50.67 49.33 100.00 4 Statistics for Table of Party by outcome Statistic DF Value Prob ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Chi-Square 1 4.5075 0.0337 Likelihood Ratio Chi-Square 1 4.5210 0.0335 Continuity Adj. Chi-Square 1 4.0024 0.0454 Mantel-Haenszel Chi-Square 1 4.4924 0.0340 Phi Coefficient -0.1226 Contingency Coefficient Cramer's V 0.1217 -0.1226 Fisher's Exact Test ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Cell (1,1) Frequency (F) 42 Left-sided Pr <= F 0.0226 Right-sided Pr >= F 0.9877 Table Probability (P) 0.0103 Two-sided Pr <= P 0.0377 Sample Size = 300 5 Inference on Several Proportions—the Chi-square Test (Large Sample) Def. Multinomial Experiment. We have a total of n trials (sample size=n) ① For each trial, it will result in 1 of k possible outcomes. ② The probability of getting outcome i is pi , and k p i 1 i =1 ③ These trials are independent. Example 2. Previous experience indicates that the probability of obtaining 1 healthy calf from a mating is 0.83. Similarly, the probabilities of obtaining 0 and 2 healthy calves are 0.15 and 0.02 respectively. If the farmer breeds 3 dams from the herd, find the probability of getting exact 3 health calves. Def. Multinomial Distribution Let X i be the number of trials resulted in i-th category out of a total of n trials and pi be the probability of getting i-th category outcome, then P( X 1 x1 , X 2 x2 ,..., X k xk ) n! p1x1 ... pk xk x1 ! x2 !...xk ! Solution: P(exact 3 health calves)= P( X1 1, X 2 1, X 3 1) + P( X 1 0, X 2 3, X 3 0) =0.015+0.572=0.59 *Relations to the Binomial Distribution (k=2) Category 1 2 Probability p1 =p p2 =1-p # trials X1 =x X 2 =n-x 6 P ( X 1 x, X 2 n x ) n! p1x1 p2 x2 ( nx ) p x (1 p)n x x !(n x)! Chi-square goodness of fit test Example 3. Gregor Mendel (1822-1884) was an Austrian monk whose genetic theory is one of the greatest scientific discovery of all time. In his famous experiment with garden peas, he proposed a genetic model that would explain inheritance. In particular, he studied how the shape (smooth or wrinkled) and color (yellow or green) of pea seeds are transmitted through generations. His model shows that the second generation of peas from a certain ancestry should have the following distribution. wrinkledgreen Theoretical probabilities p1 1 16 wrinkledyellow p2 3 16 smoothgreen p3 3 16 smoothyellow p4 9 16 n=556 General test: Test whether the theoretical probability is correct H 0 : p1 p10 , p2 p2 0 ,..., pk pk 0 H a : H 0 is not true ( xi ei )2 H0 2 W0 ~ k 1 ei i 1 k T.S where xi is the observed number of observations in category i ei is the expected count of the i-th category , ei n pi 0 At the significance level α, reject H 0 iff W0 k21,upper , 7 Solution: wrinkledgreen Theoretical probabilities p1 Observed count out of 556 X1 =31 Expected counts 1 16 e1 556 wrinkledyellow p2 1 16 3 16 smoothgreen p3 3 16 smoothyellow p4 9 16 X 2 =102 X 3 =108 X 4 =315 e2 =104.25 e3 =104.25 e4 =312.75 =34.75 1 3 3 9 H 0 : p1 , p2 , p3 , p4 16 16 16 16 H a : H 0 is not true k T.S W0 i 1 ( xi ei )2 H0 2 ~ k 1 ei 2 =7.815 0.604 < 3,0.05,upper At significance level 0.05, we cannot reject H 0 SAS Code: DATA GENE; INPUT @1 COLOR $13. @15 NUMBER 3.; DATALINES; YELLOWSMOOTH 315 YELLOWWRINKLE 102 GREENSMOOTH 108 GREENWRINKLE 31 ; * HYPOTHESIZING A 9:3:3:1 RATIO; PROC FREQ DATA=GENE ORDER=DATA; WEIGHT NUMBER; TITLE3 'GOODNESS OF FIT ANALYSIS'; TABLES COLOR / CHISQ NOCUM TESTP=(0.5625 0.1875 0.1875 0.0625); RUN; The SAS System GOODNESS OF FIT ANALYSIS The FREQ Procedure 8 Test COLOR Frequency Percent Percent ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ YELLOWSMOOTH 315 56.65 56.25 YELLOWWRINKLE 102 18.35 18.75 GREENSMOOTH 108 19.42 18.75 GREENWRINKLE 31 5.58 6.25 Chi-Square Test for Specified Proportions ~~~~~~~~~~~~~~~~~~~~~~~~~~ Chi-Square 0.6043 DF 3 Pr > ChiSq 0.8954 Sample Size = 556 Example 4. A classic tale involves four car-pooling students who missed a test and gave as an excuse of a flat tire. On the make-up test, the professor asked the students to identify the particular tire that went flat. If they really did not have a flat tire, would they be able to identify the same tire? To mimic this situation, 40 other students were asked to identify the tire they would select. The data are: Tire Left front Right front Left rear Right rear 11 15 8 6 Frequency At α=0.05, please test whether each tire has the same chance to be selected? Solution: 1 H 0 : p1 p2 p3 p4 4 H a : H 0 is not true n=40, ei =n pi =10 k W0 i 1 ( xi ei )2 2 4.6 3,0.05, upper 7.81 ei Fail to reject H 0 . 9 The chi-square goodness of fit test is an extension of the Z-test for one population proportion. Data: sample size n, x: successes with probability p n-x: failures with probability 1-p TS. Z 0 pˆ p0 H 0 ~ N (0,1) p0 (1 p0 ) n At α, reject H 0 iff | Z0 | Z /2 Success p1 p0 Expected Observed W0 Failure e1 np0 p2 1 p0 e2 n(1 p0 ) X2 n x X 1 =x ( x np0 ) 2 [n x n(1 p0 )]2 ( x np0 ) 2 ( p0 1 p0 )]2 = np0 n(1 p0 ) np0 (1 p0 ) x ( p0 )2 ( x np0 ) 2 n Z02 = np0 (1 p0 ) p0 (1 p0 ) / n iid k Recall: If Z1, Z 2, ....Z n ~ N (0,1) , then W Z i 2 ~ k 2 . 1 When k=1, W Z 2 ~ 12 Let Z~N(0,1), then W Z 2 ~ 12 P(| Z | Z /2 ) P( Z 2 Z2 /2 ) = P(W 1,2 ,upper ) The two tests are identical. 10 Exact Tests for Inference on Two Population Proportions Sir Ronald Aylmer Fisher FRS (17 February 1890 – 29 July 1962) was an English statistician, evolutionary biologist, geneticist, and eugenicist. Fisher is known as "a genius who almost single-handedly created the foundations for modern statistical science", http://en.wikipedia.org/wiki/Ronald_Fisher 11 1. Fisher’s exact test: A little bit of history (Fisher’s Tea Drinker): R.A. Fisher described the following experiment. Muriel Bristol, his colleague, claimed that when drinking tea, she could distinguish whether milk or tea was added to the cup first (she preferred milk first). To test her claim, Fisher designed an experiment with 8 cups of tea – 4 with milk added first and 4 with tea added first. Muriel was told that there were 4 cups of each type, and was asked to predict which four had the milk added first. The order of presenting the cups to her was randomized. It turned out that Muriel correctly identified 3 from each type. Now we are testing the null hypothesis that she did so by pure guessing, versus the alternative hypothesis that she could do better than pure guessing. The p-value of the test is derived as follows: p value P( X 3 | X Y 8) P( X 3 | X Y 8) P( X 4 | X Y 8) (34 )(14 ) ( 44 )(04 ) 8 8 0.229 0.014 0.243 (4 ) (4 ) Since the p-value is large, we could not reject the null hypothesis. It means that it is possible that she chose 3 correctly by pure guessing. Inference on 2 population proportions, 2 independent samples Example 5. The result of a randomized clinical trial for comparing Prednisone and Prednisone+VCR drugs, is summarized below. Test if the success and failure probabilities are the same for the two drugs. Drug Success Failure Row total Pred 14 7 n1 =21 PVCR 38 4 n2 =42 m=52 n-m=11 n=63 12 General setting: “S” “F” Total Sample1 x n1 -x n1 Sample2 y n2 -y n2 m=x+y n-m n H 0 : p1 p2 H a : p1 p2 p value pU P( X x | X Y m) H 0 : p1 p2 H a : p1 p2 p value pL P ( X x | X Y m) H 0 : p1 p2 H a : p1 p2 p value 2 min( PU , PL ) (nk1 )(mn2k ) min( m,n1 ) (kn1 )(mn2k ) (nm ) (nm ) kx k x x ( nk1 )(nm2 k ) (nk1 )(nm2 k ) ( nm ) ( nm ) kx k max(0,m n2 ) *** Please note that in this case, under the null hypothesis of equal proportions, the conditional distribution of observing x and m-x successes from the two samples respectively given that we have a total of m successes follows the hypergeometric distribution where the sampling is assumed to be with replacement. In the context of the example of comparing the percentages of female versus male smokers I gave you today, this means that given we have a total of m smokers from a total sample of size n where n1 are male and n2 are female, given that we assume the proportion of male and female smokers are the same, the distribution that we observe x smokers being male, and m-x smokers being female will follow a hypergeometric distribution, and thus the p-values are calculated accordingly relative to different alternative hypotheses. Solution: H 0 : p1 p2 H a : p1 p2 42 42 14 (21 (21 k )(52 k ) k )(52 k ) p value 0.016 <0.05 63 63 (52 ) (52 ) k max(0,52 42) k 10 14 Reject H 0 13 SAS code: Data trial; input drug $ outcome$ count; datalines; pred S 14 pred F 7 PVCR S 38 PVCR F 4 ; run; proc freq data=trial; tables drug*outcome/chisq; weight count; run; 2. McNemar’s test Inference on 2 population proportions- paired samples Example 6. A preference poll of a panel of 75 voters was conducted before and after a TV debate during the campaign for the 1980 presidential election between Jimmy Carter and Ronald Reagan. Test whether there was a significant shift from Carter as a result of the TV debate. Preference Preference after before Carter Reagan Carter 28 13 Reagan 7 27 General setting: Condition1 response Condition2 response Yes No Yes A=a, PA B=b, PB No C=c, PC D=d, PD 14 PA + PB + PC + PD =1, A+B+C+D=n, (A, B, C, D)~Multinomial P1 PA PB , P2 PA PC H 0 : p1 p2 H 0 : pB pC P(B=k| B+C=m) ~ Bin(m,p= pB ) pB pC 1 2 Under H 0 : p , P(B=k| B+C=m) ~ Bin(m,p=1/2) 1 H 0 : p 2 H 0 : p1 p2 ① H a : p1 p2 H : p 1 a 2 1 m p value pU P ( B b | B C m) ( )m (mk ) 2 k b 1 H0 : p H : p p 2 ② 0 1 2 H a : p1 p2 H : p 1 a 2 1 b p value pL P( B b | B C m) ( )m (mk ) 2 k 0 1 H0 : p H : p p 2 ③ 0 1 2 H a : p1 p2 H : p 1 a 2 p value 2 min( PL , PU ) Solution: 1 H0 : p H 0 : p1 p2 2 H a : p1 p2 H : p 1 a 2 15 1 20 p value ( ) 20 ( 20 k ) 0.1316 2 k 13 SAS code: Data election; input before $ after $ count; datalines; Carter Carter 28 Carter Reagan 13 Reagan Reagan 27 Reagan Carter 7 ; run; proc freq data=election; exact agree; tables before*after/agree; weight count; run; The SAS System The FREQ Procedure Table of before by after before after Frequency‚ Percent ‚ Row Pct ‚ Col Pct ‚Carter ‚Reagan ‚ Total ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Carter ‚ 28 ‚ 13 ‚ 41 ‚ 37.33 ‚ 17.33 ‚ 54.67 ‚ 68.29 ‚ 31.71 ‚ ‚ 80.00 ‚ 32.50 ‚ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Reagan ‚ 7 ‚ 27 ‚ 34 ‚ 9.33 ‚ 36.00 ‚ 45.33 ‚ 20.59 ‚ 79.41 ‚ ‚ 20.00 ‚ 67.50 ‚ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Total 35 40 75 46.67 53.33 100.00 Statistics for Table of before by after McNemar's Test ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Statistic (S) 1.8000 DF 1 Asymptotic Pr > S 0.1797 Exact Pr >= S 0.2632 (= 2*0.1316) Simple Kappa Coefficient ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Kappa (K) 0.4700 ASE 0.1003 95% Lower Conf Limit 0.2734 95% Upper Conf Limit 0.6666 16 Test of H0: Kappa = 0 ASE under H0 Z One-sided Pr > Z Two-sided Pr > |Z| Exact Test One-sided Pr >= K Two-sided Pr >= |K| 0.1140 4.1225 <.0001 <.0001 3.614E-05 5.847E-05 Sample Size = 75 Homework #5: The homework problems are one population proportion: 9.1, 9.2, 9.4, 9.6, 9.9; two or more population proportions: 9.11, 9.12, 9.19, 9.20, 9.23. 17