Chapter 26: Chi-Square Testing

Chi-Square Testing
• About the Chi-Square Distribution: The chi-square distributions are a family of distributions that take only positive values and are skewed to the right. Chi-square distributions vary with the degrees of freedom, where df = n − 1 and n is the number of categories of your variable.
• Each chi-square density curve has the following properties:
  1) The total area under each chi-square curve is 1.
  2) For df > 2, it begins at zero on the horizontal axis, increases to a peak, and then approaches the horizontal axis asymptotically from above.
  3) Each curve is skewed to the right. As the number of degrees of freedom increases, the curve becomes more and more symmetrical and looks more like a normal curve (the CLT still says this will happen).
• Chi-square tests are:
  * Used with two-way tables to test for association or independence with multiple proportions.
  * Used to describe relationships within one or between two categorical variables.
• Large χ² values (which correspond to low p-values) are evidence against Ho.
• Conditions:
  • Random: the data come from a random sample or a randomized experiment.
  • Independent: individual observations are independent; when sampling without replacement, the sample is less than 10% of the population.
  • Large enough sample: all expected counts are at least 5 (this is essentially the np rule applied to each category).
• There are three chi-square tests:
  * Test for Goodness of Fit (GOF)
  * Test for Homogeneity
  * Test for Independence

Chi-Square (χ²) Test for Goodness of Fit (GOF)
• Rather than testing individual proportions in an entire distribution, this test can be applied to see if the observed sample distribution is different from the hypothesized population distribution.
(This is like doing many one-proportion z-tests all at the same time.)

Chi-Square GOF Test
• Ho: The actual population proportions are equal to the hypothesized proportions.
• Ha: At least one of the actual population proportions is different from the hypothesized proportions.
• The χ² test statistic is: χ² = Σ (O − E)² / E, with (n − 1) degrees of freedom**, where O is the observed count and E is the expected count.
• ** Remember, n is the number of categories this time, not the sample size.
• Calculator Steps for GOF on TI-83 Plus:
  1) Clear L1, L2, L3.
  2) Enter the observed counts in L1.
  3) Calculate the expected counts and enter them in L2.
  4) Define L3 to be (L1 − L2)²/L2.
  5) The command sum(L3) returns the test statistic χ².
  6) Use the χ²cdf command from the distributions menu to ask for the area between your χ² value and a very large number, and specify the degrees of freedom. This is your p-value.
• Calculator Steps for GOF on TI-84 Plus:
  1) Clear L1, L2, L3.
  2) Enter the observed counts in L1.
  3) Calculate the expected counts and enter them in L2.
  4) STAT: TESTS: D: χ²GOF-Test

Chi-Square Tests for Homogeneity
• Test for Homogeneity: Attempts to determine whether two populations are similar (homogeneous) with respect to the categories of one variable.
• Ho: The population proportions with respect to the variable are the same.
• Ha: The population proportions with respect to the variable are different.

Chi-Square Test for Independence
• Test for Independence: One population is sampled and two characteristics are observed. Is there an association (dependence) between the two characteristics?
• Ho: States that there is no association between the two variables. (The variables are independent.)
• Ha: States that there is an association between the two variables.
(The variables are not independent.)

Chi-Square Tests for Homogeneity & Independence
• The only difference between the chi-square tests for homogeneity and independence is in how the data are collected:
  • Homogeneity – Two populations categorized on one categorical variable.
  • Independence – One population categorized on two categorical variables.
• The test mechanics and everything else are the same for homogeneity and independence.
• The alternative hypothesis is no longer one- or two-sided; it is many-sided. To test Ho, we compare the observed counts in a two-way table with the expected counts.
• Expected cell count = (row total × column total) / table total
• χ² = Σ (observed − expected)² / expected
  • The observed counts are your sample values.
  • The expected counts are calculated assuming the null hypothesis is true.
• In a table with r rows and c columns, df = (r − 1)(c − 1).
• Calculator instructions:
  1) Enter the observed counts in matrix A: 2nd, MATRIX, EDIT, choose the matrix, enter its size and cells.
  2) STAT, TESTS, C: χ²-Test; enter the name of the observed matrix, enter the name of the matrix where you would like the expected counts to be stored, and choose Calculate to compute.
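
The two-way table computation above can also be sketched outside the calculator. The following is a minimal Python illustration, not part of the original slides: the 2 × 3 table of counts is made-up data, and the helper names (expected_counts, chi_square_statistic, chi_square_p_value) are hypothetical. The p-value helper stands in for the calculator's χ²cdf command and assumes a whole-number df, using a standard closed-form series for the upper-tail area.

```python
import math


def expected_counts(observed):
    """Expected count per cell: row total * column total / table total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    table_total = sum(row_totals)
    return [[r * c / table_total for c in col_totals] for r in row_totals]


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over every cell."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def chi_square_p_value(stat, df):
    """Area to the right of stat under the chi-square curve (integer df >= 1)."""
    half = stat / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: two-sided normal tail plus a finite correction series.
    root = math.sqrt(stat)
    tail = math.erfc(root / math.sqrt(2.0))
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= stat / (2 * r - 1)
        total += term
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return tail + 2.0 * density * total


# Hypothetical observed counts: 2 rows (groups) by 3 columns (categories).
observed = [[10, 20, 30],
            [20, 20, 20]]

expected = expected_counts(observed)
statistic = chi_square_statistic(observed, expected)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # (r - 1)(c - 1)
p_value = chi_square_p_value(statistic, df)

# Large-sample condition: every expected count should be at least 5.
assert all(e >= 5 for row in expected for e in row)

print("expected counts:", expected)
print("chi-square:", round(statistic, 4), "df:", df, "p-value:", round(p_value, 4))
```

For this made-up table every expected count is at least 5, the statistic works out to 16/3 (about 5.33) with df = 2, and the p-value is about 0.069, so at the 5% level there would not be enough evidence to reject Ho. The same statistic and p-value helpers would apply to a goodness-of-fit test if the expected counts were instead computed from the hypothesized proportions and df were set to the number of categories minus 1.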