RMTD 404 Lecture 4: Chi-Squares

In Chapter 1, you learned to differentiate between quantitative (aka measurement or numerical) and qualitative (aka frequency or categorical) variables. Most of this course will focus on statistical procedures that can be applied to quantitative variables. This chapter, however, focuses on qualitative variables. This chapter describes four concepts relating to the use of the term chi-square (χ²):

1. A sampling distribution named the chi-square distribution.
2. A statistical test for comparing marginal proportions of a categorical variable to theory-based proportions (goodness-of-fit chi-square).
3. A statistical test for evaluating independence of two categorical variables (Pearson's chi-square).
4. A statistical test for comparing the relative fit of two models to the same data (likelihood ratio chi-square).

Chi-Squares
• For comparison, this is the normal probability density function, which depicts the relationship between an observed score (x) and the height of the y axis as a function of the population mean and standard deviation:

f(x) = [1 / (σ√(2π))] exp{−.5[(x − μ) / σ]²}

Chi-Squares
• The chi-square distribution takes the following form (k is the degrees of freedom):

f(χ²) = (χ²)^(k/2 − 1) e^(−χ²/2) / [2^(k/2) Γ(k/2)]

• Fortunately, you don't really need to use these equations, because tables exist that contain computed values of the areas under these curves, and computer programs produce these areas automatically.

Chi-Squares
• An interesting feature of the chi-square distribution is how its shape changes as the parameter k increases. In fact, as k approaches infinity, the chi-square distribution becomes normal in shape.

[Figure: chi-square density curves for k = 5, 6, 7, 8]

Chi-Square Goodness of Fit
• One application of the chi-square distribution comes into play when we want to compare observed relative frequencies (percentages) to theory-based relative frequencies.
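Before moving on, the chi-square density above can be evaluated directly. Here is a minimal sketch in Python (the function name and the numeric check are my own illustration, not part of the course materials):

```python
import math

def chi_square_pdf(x, k):
    """Density of the chi-square distribution with k degrees of freedom:
    f(x) = x^(k/2 - 1) * e^(-x/2) / (2^(k/2) * Gamma(k/2)), for x > 0."""
    if x <= 0:
        return 0.0
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

# Sanity check: the density should integrate to 1.
# A crude midpoint Riemann sum over (0, 100) for k = 5 comes very close.
area = sum(chi_square_pdf(0.005 + i * 0.01, 5) * 0.01 for i in range(10_000))
print(round(area, 3))
```

The same function, plotted over a grid of x values for k = 5 through 8, reproduces the shape change described above: the mode sits near k − 2 and the curve grows more symmetric as k increases.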
• Recall that in a hypothesis testing framework, we compare observed statistics to null parameters and determine the likelihood of obtaining the observed statistic due to sampling error under the assumption that the null parameter is correct. We use the chi-square goodness-of-fit test in contexts in which we have a single categorical variable and want to determine whether observed classifications are consistent with a theory. In this case, our observed statistics are the proportions associated with each classification, and our null parameters are the expected proportions for each classification.

Chi-Square Goodness of Fit
• For example, we might be interested in determining whether a purposive sample that we have drawn is comparable to the US population with respect to ethnicity. That is, we want to determine whether the observed proportions of Asians, African Americans, Hispanics, and Caucasians in our sample reflect the proportions of these groups in the general population.
• In this case, the frequencies and proportions are observed, as in the following table. The theory-based null parameters (shown in the bottom row of the table) are obtained from the US Census.

                               Asian   African American   Hispanic   Caucasian
Observed Frequencies             30           50             30         200
Observed Proportions (ng / N)   .10          .16            .10         .65
Census Proportions              .04          .12            .10         .74

Chi-Square Goodness of Fit
• In the case of the chi-square goodness-of-fit test, the chi-square statistic is defined as:

χ² = Σ (Oi − Ei)² / Ei, summing over the i = 1, …, k cells

where O is the observed frequency and k equals the number of classifications in the table (i.e., the number of cells).
• The expected value (designated E) is defined as the null proportion (i.e., theory-based) for that classification (ρi) times the sample size (N):

Ei = ρiN

• Also note the meaning of the expected frequencies (E). These values constitute what we would expect the observed frequencies (O) to be if, indeed, our theory were true.
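The goodness-of-fit computation for the ethnicity example can be sketched in a few lines of Python (this is my own illustration of the formulas above, not code from the text):

```python
# Observed counts and Census-based null proportions from the table above.
observed = {"Asian": 30, "African American": 50, "Hispanic": 30, "Caucasian": 200}
census = {"Asian": 0.04, "African American": 0.12, "Hispanic": 0.10, "Caucasian": 0.74}

N = sum(observed.values())                        # total sample size: 310
expected = {g: p * N for g, p in census.items()}  # E_i = rho_i * N

# Goodness-of-fit statistic: sum of (O - E)^2 / E over the k classifications.
chi_square = sum((observed[g] - expected[g]) ** 2 / expected[g] for g in observed)
df = len(observed) - 1                            # k - 1 = 3
print(round(chi_square, 2), df)
```

With these counts the statistic comes out near 33.2 on 3 degrees of freedom, far beyond the .05 critical value of 7.82 used below.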
That is, the expected number of cases in each group should be consistent with ρ (our theory-based proportions).

Chi-Square Goodness of Fit
• Hence, the chi-square statistic tells us how far, on average, the observed cell frequencies are from the theory-based expectations.
• The table below shows the computations for the example. The sum of the last row, the chi-square statistic, equals 33.18.

                       Asian   African American   Hispanic   Caucasian
Observed Frequencies     30          50              30         200
Expected Frequencies    12.4        37.2             31        229.4
Observed − Expected     17.6        12.8            -1.0       -29.4
(O − E)²              309.76      163.84            1.00      864.36
(O − E)² / E           24.98        4.40            0.03        3.77

Chi-Square Goodness of Fit
• We can compare our obtained chi-square value of 33.18, which has 3 degrees of freedom (degrees of freedom equals k − 1, the number of columns that are free to vary in the table), to the tabled values of the chi-square statistic: the range of values that occur due solely to random sampling. According to this table (Appendix on page 671), the critical value of the chi-square distribution with 3 degrees of freedom for α = .05 equals 7.82.
• Hence, the observed differences between our sample and our expected values are extremely unlikely if the null hypothesis is true: that the vector of observed probabilities equals the vector of theory-based probabilities.

Chi-Square Goodness of Fit
• For the sake of being thorough, let's summarize how we would utilize the chi-square goodness-of-fit test.
1. Determine which test statistic is required for your problem and data. The chi-square goodness-of-fit statistic is relevant when you want to compare the observed frequencies or proportions for a single categorical variable to the frequencies predicted by a theory.
2. State your research hypothesis: that the observed frequencies were not generated by the population described by your theory.
3.
State the alternative hypothesis: that the observed proportions are not equal to the theory-based proportions (i.e., ρ_observed ≠ ρ_theory; this is a non-directional test).
4. State the null hypothesis: that the observed proportions are equal to the theory-based proportions (i.e., ρ_observed = ρ_theory; here, ρ is the population parameter estimated by p, which is not the p-value but the observed proportion in each group).

Chi-Square Goodness of Fit
• Summary of how we would utilize the chi-square goodness-of-fit test … continued.
5. Compute your observed chi-square value.
6. Determine the critical value for your test based on your degrees of freedom and desired α level, OR determine the p-value for the observed chi-square value based on its degrees of freedom.
7. Compare the observed chi-square value to your critical value, OR compare the p-value for the observed chi-square statistic to your chosen α, and make a decision to reject or retain your null hypothesis.
8. Make a substantive interpretation of your test results.

Chi-Square Test of Association
• A second important application involving the chi-square distribution allows us to evaluate whether two categorical variables are related to one another (aka associated). If they are not related, we say that they are independent of one another (i.e., knowledge about one of the variables does not tell us anything about the other variable).
• One way of depicting the relationship between two variables involves creating a contingency table (aka crosstab) showing the frequencies for each pairing of levels of the two variables. Consider the table below, a contingency table comparing the SES quartiles of two ethnicity groups. Note that the cell frequencies within a row or column constitute conditional totals, and the conditional totals in a row or column sum to the marginal totals (i.e., row and column totals).
                   Q1   Q2   Q3   Q4   Total
African American   11    7    4    2     24
Caucasian          28   58   54   49    189
Total              39   65   58   51    213

Chi-Square Test of Association
• The distinction between conditional and marginal distributions is an important one because it highlights the manner in which the Pearson chi-square test of association is linked to the hypothesis testing framework. Specifically, when we believe that there is no relationship between ethnicity and SES, we can predict cell frequencies based on the marginal frequencies.
• That is, when there is no relationship between the two variables, the conditional distributions of ethnicity across the SES quartiles should be similar enough to one another that we can conclude that any observed differences are due to sampling error.
• Specifically, when there is no association, all of the conditional distributions should merely be random deviations from the marginal distribution. Sample 1 gives us p11 and p21. Sample 2 gives us p12 and p22. Sample 3 gives us p13 and p23. Sample 4 gives us p14 and p24. All of these are assumed to differ from the marginal distribution, p1+ and p2+, only because of sampling error.

                   Q1    Q2    Q3    Q4    Total
African American   p11   p12   p13   p14   p1+
Caucasian          p21   p22   p23   p24   p2+

Chi-Square Test of Association
• To perform the Pearson chi-square test of association, we do the following:
1. Determine which statistic is required for your problem and data. The chi-square test of association is relevant when you want to compare observed frequencies of two categorical variables to those implied by the table's margins when no relationship exists.
2. State the research hypothesis: There is a relationship between ethnicity and SES in the population.
3. State the alternative hypothesis: Ethnicity and SES are associated in the population.
4. State the null hypothesis: Ethnicity and SES are independent (note that our test determines whether the observed ps are too different to have been generated from the same ρ due to random sampling variation).
5.
Compute the chi-square statistic.

Chi-Square Test of Association
6. Determine the critical value for your test based on your degrees of freedom [(R − 1)(C − 1)] and desired α level, OR determine the p-value for the observed test statistic.
7. Compare the observed chi-square value to your critical value, or compare the p-value for the observed chi-square to your chosen α, and make a decision to reject or retain your null hypothesis.
8. Make a substantive interpretation of your test results.

Chi-Square Test of Association
• Again, recall the formula for the chi-square statistic. Note that in this case, we sum the values across both rows and columns (i = 1, …, R rows and j = 1, …, C columns):

χ² = Σi Σj (Oij − Eij)² / Eij

• In the case of the test of association, we compute our expected values based on the marginal totals (rather than state them based on substantive theory):

Eij = RiCj / N

where Ri is the row total for the cell and Cj is the column total for the cell.
• However, just because our expected values come from the observed margins does not mean that we are not imposing a substantive theory on their generation. In fact, we are imposing a substantive theory. Computing expected frequencies based on the marginal totals implies that the cell frequencies depend only on the marginal distributions of ethnicity and SES. That is, the proportion of members in each ethnicity group will be the same at each level of SES. Hence, our computation of the expected value imposes a theory-based assumption that there is no relationship between ethnicity and SES.

Chi-Square Test of Association
• For our example, we need to determine the critical value for our hypothesis test. To do this, we need to state an α level; we'll use the traditional .05 level.
• We also need to know the degrees of freedom for the test; recall that the shape of the chi-square distribution changes depending on the value of k.
• Degrees of freedom is a concept that will reappear several times in this course.
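The formulas Eij = RiCj / N and χ² = Σ (Oij − Eij)² / Eij can be sketched directly from the contingency table. The code below is my own illustration for the ethnicity-by-SES example, not code from the text:

```python
# Observed ethnicity-by-SES table (rows: African American, Caucasian; cols: Q1..Q4).
observed = [
    [11, 7, 4, 2],
    [28, 58, 54, 49],
]

row_totals = [sum(row) for row in observed]        # [24, 189]
col_totals = [sum(col) for col in zip(*observed)]  # [39, 65, 58, 51]
N = sum(row_totals)                                # 213

# Expected counts under independence: E_ij = (row total * column total) / N.
expected = [[r * c / N for c in col_totals] for r in row_totals]

# Pearson chi-square: sum of (O - E)^2 / E over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)
print([round(e, 2) for e in expected[0]])  # expected counts, first row
print(round(chi_square, 2))
```

The first row of expected counts matches the 4.39, 7.32, 6.54, 5.75 worked out below, and the statistic comes out near 15.07 on (2 − 1)(4 − 1) = 3 degrees of freedom.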
We use the term degrees of freedom to relay the fact that only a portion of the values in our data set are free to vary once we impose the null assumptions. In our case, only some of the cell frequencies are free to vary once we impose the marginal totals on the table. As shown below, in our case only three of the cells are free to vary. In general, the degrees of freedom for a chi-square test of association are defined by

df = (R − 1)(C − 1)

                   Q1      Q2      Q3      Q4      Total
African American   11      7       4       FIXED    24
Caucasian          FIXED   FIXED   FIXED   FIXED   189
Total              39      65      58      51      213

Chi-Square Test of Association
• That means that the degrees of freedom for our example equals 3 [or (2 − 1)(4 − 1)]. From the chi-square table, we see that the critical value for our test equals 7.82.

Chi-Square Test of Association
• Here are the expected frequencies for each cell.

                   Q1      Q2      Q3      Q4      Total
African American   4.39    7.32    6.54    5.75     24
Caucasian          34.61   57.68   51.46   45.25   189
Total              39      65      58      51      213

• And here are the differences between the observed and expected frequencies (aka residuals). Notice that, relative to the null assumption of no association, African Americans are overrepresented in the first SES quartile and Caucasians are overrepresented in quartiles 2 through 4.

                   Q1      Q2      Q3      Q4
African American   6.61   -0.32   -2.54   -3.75
Caucasian         -6.61    0.32    2.54    3.75

• Here are the squared differences between observed and expected frequencies.

                   Q1      Q2     Q3     Q4
African American   43.63   0.10   6.43   14.04
Caucasian          43.63   0.10   6.43   14.04

• And here are the squared differences divided by the expected frequencies. Each of these is equivalent to a chi-square with one degree of freedom. Based on this, we see that the largest deviations are due to the frequency of African Americans in the extreme SES quartiles. The sum of these values is our chi-square statistic (15.07).
                   Q1     Q2     Q3     Q4
African American   9.93   0.01   0.98   2.44
Caucasian          1.26   0.00   0.12   0.31

Chi-Square Test of Association
• Because our observed chi-square of 15.07 is greater than our critical value (7.82), we reject the null hypothesis and conclude that there is a relationship between ethnicity and SES.
• Note that the p-value for the observed statistic (which is reported by most statistical software; the critical value typically is not) equals .002, less than α = .05, indicating that the observed pattern of cell frequencies is highly unlikely under the null assumption of independence.
• A substantive interpretation of this test might read something like this: A chi-square test of association indicated that ethnicity and SES are related, χ²(3) = 15.07, p = .002. Examination of the residuals indicates that African Americans appear in the low SES category too frequently and in the high SES category too infrequently to assume that the observed frequencies are random departures from a model of independence.

Example – Goodness of Fit
• 6.1 The chairperson of a psychology department suspects that some of her faculty members are more popular with students than others. There are three sections of introductory psychology, taught at 10:00am, 11:00am, and 12:00pm by Professors Anderson, Klatsky, and Kamm. The numbers of students who enroll in each are:

             Professor Anderson   Professor Klatsky   Professor Kamm
Enrollment           32                  25                 10

• State the null hypothesis, run the appropriate chi-square test, and interpret the results.

Example – Goodness of Fit
• The null hypothesis in this study is that students enroll at random in the population:

H0: πAnderson = πKlatsky = πKamm

           Professor Anderson   Professor Klatsky   Professor Kamm   TOTAL
Observed           32                  25                 10           67
Expected          22.33               22.33              22.33         67

χ² = Σ (O − E)² / E = (32 − 22.33)² / 22.33 + (25 − 22.33)² / 22.33 + (10 − 22.33)² / 22.33 = 11.31

• The critical value χ²(2) = 5.99 at the .05 level. So? Because 11.31 exceeds 5.99, we reject the null hypothesis: enrollment is not random across the three sections.

Example – Goodness of Fit
• 6.2.
The data in 6.1 will not really answer the question the chairperson wants answered. What is the problem, and how could the experiment be improved?

Example – Test of Association
• 6.8 We know that smoking has all sorts of ill effects on people; among other things, there is evidence that it affects fertility. Weinberg and Gladen (1986) examined the effects of smoking on the ease with which women become pregnant. The researchers asked 586 women who had planned pregnancies how many menstrual cycles it had taken for them to become pregnant after discontinuing contraception. Weinberg and Gladen also sorted the women into smokers and nonsmokers. The data follow.

             1 Cycle   2 Cycles   3+ Cycles   Total
Smokers         29        16          55        100
Nonsmokers     198       107         181        486
Total          227       123         236        586

• Is there an association between smoking and pregnancy?

Example – Test of Association
• The expected values are in parentheses. For example, E(smokers, 1 cycle) = (227 × 100) / 586 = 38.74.

             1 Cycle        2 Cycles       3+ Cycles       Total
Smokers       29 (38.74)     16 (20.99)     55 (40.27)      100
Nonsmokers   198 (188.26)   107 (102.01)   181 (195.73)     486
Total        227            123            236              586

χ² = (29 − 38.74)² / 38.74 + (16 − 20.99)² / 20.99 + … + (181 − 195.73)² / 195.73 = 10.87

• The critical value χ²(2) = 5.99 at the .05 level. So? Because 10.87 exceeds 5.99, we reject the null hypothesis of independence: smoking and the number of cycles to pregnancy are associated.
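The same machinery used for the ethnicity-by-SES table applies to Exercise 6.8. As a sketch (my own code, recomputing directly from the observed counts above):

```python
# Smoking-by-cycles-to-pregnancy table (rows: smokers, nonsmokers;
# cols: 1 cycle, 2 cycles, 3+ cycles).
observed = [
    [29, 16, 55],
    [198, 107, 181],
]

row_totals = [sum(row) for row in observed]        # [100, 486]
col_totals = [sum(col) for col in zip(*observed)]  # [227, 123, 236]
N = sum(row_totals)                                # 586

# E_ij = (row total * column total) / N, e.g. E(smokers, 1 cycle) = 227*100/586.
expected = [[r * c / N for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)
df = (len(observed) - 1) * (len(col_totals) - 1)   # (2 - 1)(3 - 1) = 2
print(round(chi_square, 2), df)
```

The statistic comes out near 10.9 on 2 degrees of freedom, beyond the .05 critical value of 5.99, so the null hypothesis of independence is rejected.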