Chi-Square
Adv. Experimental Methods & Statistics
PSYC 4310 / COGS 6310
Michael J. Kalsher
Department of Cognitive Science
PSYC 4310/6310 Advanced Experimental Methods and Statistics © 2013, Michael Kalsher

Outline
• Review of Important Definitions and Concepts
• Chi-Square Tests
• Goodness of Fit
• Test of Independence
• Sample Problems

What are Non-parametric Statistics?
Methods of analyzing data that examine the relative position or rank of the data rather than the actual values.
Non-parametric statistics do not:
• assume that the data come from a normal distribution.
• create any parameter estimates (e.g., means; standard deviations) to assess whether one set of numbers is statistically different from another set of numbers.

Chi-Square Goodness of Fit
This test is used to evaluate questions concerning the probabilities associated with each value of a variable by comparing an observed frequency distribution to an expected frequency distribution.
It’s most often used when a person has the observed frequencies for several mutually exclusive categories and wants to decide whether they have occurred equally often.

Chi-Square Goodness of Fit: Test Assumptions
1. Random and independent sampling.
2. Sample size must be sufficiently large.
3. Values of the variable are mutually exclusive and exhaustive. Every subject must fall in only one category.
Note: If these assumptions are not met, the critical values in the chi-square table are not necessarily correct.

Chi-Square Goodness of Fit: Computing by hand
Note: df = (k - 1), where k is the number of categories.

Critical Values for the Chi-Square Test

Chi-Square Test of Independence
This test is used to examine whether two or more variables are related, based on information about probabilities. It assesses whether observed frequencies of events differ from those that would be expected by chance.
One common use is to determine whether there is an association between two categorical variables.
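Returning briefly to the goodness-of-fit test: the hand computation can be sketched in a few lines of Python. This is only an illustration, assuming the usual statistic χ² = Σ (O - E)² / E with df = k - 1 and equal expected frequencies. The observed counts are hypothetical and do not come from the course materials.

```python
# Chi-square goodness of fit, computed step by step.
# ASSUMPTION: these observed counts are hypothetical illustration data.
observed = [30, 20, 10]          # counts for k = 3 mutually exclusive categories

k = len(observed)
n = sum(observed)

# Null hypothesis: every category occurs equally often.
expected = [n / k] * k

# Sum of (O - E)^2 / E over all categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = k - 1

# Standard chi-square table value for alpha = .05 and df = 2 only;
# a different k needs a different table entry.
critical_value = 5.991

print(f"chi-square({df}) = {chi_square:.3f}")
print("reject H0" if chi_square > critical_value else "fail to reject H0")
```

The critical value is hard-coded for this one case; in practice it would be looked up in the table of critical values for the relevant df.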
The Chi-Square Test of Independence: Computing by hand
Note: df = (k - 1)(q - 1), where k and q are the numbers of categories of the two variables.

Critical Values for the Chi-Square Test

Chi-Square Example Using SPSS
A researcher is interested in whether cats can be trained to line dance. He recruits 200 cats and then tries to train them to line dance by giving them either food or affection as a reward for “dance-like” behavior. At the end of the week he counts how many of the cats could line dance and how many could not.
We have two categorical variables: Training (food vs. affection) and Dance (each cat either learned to dance or did not).
Open cat.sav

How big is the effect?: Cramer’s V
Cramer’s V = .36, p<.01.
On a scale from 0 to 1, .36 represents a medium association between type of training and whether the cats danced. It can be viewed like a correlation coefficient.
The significance level indicates that a pattern this strong would be highly unlikely if training and dancing were unrelated.

How big is the effect?: Odds Ratio
1. Calculate the odds that a cat danced given that it had food as a reward.
   Odds(dancing after food) = (number that had food and danced) / (number that had food but didn’t dance) = 28/10 = 2.8
2. Calculate the odds that a cat danced given that it had affection as a reward.
   Odds(dancing after affection) = (number that had affection and danced) / (number that had affection but didn’t dance) = 48/114 = 0.421
3. Calculate the odds ratio.
   Odds ratio = Odds(dancing after food) / Odds(dancing after affection) = 2.8 / 0.421 = 6.65
There was a significant association between the type of training and whether or not cats would dance, χ²(1) = 25.36, p<.001. Based on the odds ratio, the odds of cats dancing were 6.65 times higher if they were trained with food than if they were trained with affection.

Alternative Method of Data Entry
This process tells the computer that it should weight each category combination by the number in the column labeled Frequency.
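One way to picture this weighting, sketched in plain Python rather than SPSS: repeat each category combination as many times as its Frequency value, then cross-tabulate the expanded rows. The sketch also recomputes the chi-square statistic, Cramer’s V, and the odds ratio for the cat data as a check. Only the code 0,0 (food, danced) is given in the slides; the other numeric codes and all variable names are assumptions made for illustration.

```python
import math
from collections import Counter

# Weighted data entry: (Training, Dance, Frequency).
# Code 0,0 = food and danced is from the slides. The remaining codes are
# ASSUMED here: Training 1 = affection, Dance 1 = did not dance.
weighted_rows = [
    (0, 0, 28),    # food, danced
    (0, 1, 10),    # food, did not dance
    (1, 0, 48),    # affection, danced
    (1, 1, 114),   # affection, did not dance
]

# Weighting is equivalent to repeating each combination Frequency times.
expanded = [(t, d) for t, d, freq in weighted_rows for _ in range(freq)]
counts = Counter(expanded)           # cross-tabulate the expanded rows
n = len(expanded)

row_totals = {t: sum(c for (tt, _), c in counts.items() if tt == t) for t in (0, 1)}
col_totals = {d: sum(c for (_, dd), c in counts.items() if dd == d) for d in (0, 1)}

# Pearson chi-square (no continuity correction): sum of (O - E)^2 / E.
chi_square = 0.0
for (t, d), obs in counts.items():
    exp = row_totals[t] * col_totals[d] / n
    chi_square += (obs - exp) ** 2 / exp

# Cramer's V = sqrt(chi-square / (N * (smaller table dimension - 1))).
smaller_dim = min(len(row_totals), len(col_totals))
cramers_v = math.sqrt(chi_square / (n * (smaller_dim - 1)))

# Odds of dancing within each training group, then their ratio.
odds_food = counts[(0, 0)] / counts[(0, 1)]
odds_affection = counts[(1, 0)] / counts[(1, 1)]
odds_ratio = odds_food / odds_affection

print(f"N = {n}, chi-square = {chi_square:.2f}, "
      f"V = {cramers_v:.2f}, odds ratio = {odds_ratio:.2f}")
```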
So, for example, the computer “pretends” there are 28 rows of data that have the category combination 0,0 (representing cats that were trained with food and that danced).

Chi-Square Example: Computing expected frequencies for hand-computation
A researcher wants to know if there is a significant difference in the frequencies with which males come from small, medium, or large cities as contrasted with females. The two variables we are considering here are hometown size (small, medium, or large) and sex (male or female). Another way of putting our research question is: Is gender independent of size of hometown?
The data for 30 females and 6 males are in the following table.

Frequency with which males and females come from small, medium, and large cities

          Small   Medium   Large   Totals
Female       10       14       6       30
Male          4        1       1        6
Totals       14       15       7       36

The formula for chi-square is:
χ² = Σ (O - E)² / E
Where: O is the observed frequency, and E is the expected frequency. The sum is taken over all cells of the table.
The degrees of freedom for the 2-D chi-square statistic is:
df = (Columns - 1) x (Rows - 1)
In this example, df = (3 - 1) x (2 - 1) = 2.

Computing Expected Frequencies
Expected Frequency for each Cell: (the cell’s Column Total x the cell’s Row Total) / Grand Total
In our example: Column Totals are 14 (small), 15 (medium), and 7 (large). Row Totals are 30 (female) and 6 (male). Grand Total is 36.
The expected frequencies:
1. Small female cell: 14 x 30 / 36 = 11.667
2. Medium female cell: 15 x 30 / 36 = 12.500
3. Large female cell: 7 x 30 / 36 = 5.833
4. Small male cell: 14 x 6 / 36 = 2.333
5. Medium male cell: 15 x 6 / 36 = 2.500
6. Large male cell: 7 x 6 / 36 = 1.167

Observed frequencies, expected frequencies, and (O - E)²/E for males and females from small, medium, and large cities

          Small                           Medium                          Large
          Observed  Expected  (O-E)²/E    Observed  Expected  (O-E)²/E    Observed  Expected  (O-E)²/E    Totals
Female          10    11.667     0.238          14    12.500     0.180           6     5.833     0.005        30
Male             4     2.333     1.191           1     2.500     0.900           1     1.167     0.024         6
Totals          14                              15                               7                            36
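
The hand computation above can be checked with a short Python sketch. It rebuilds the expected frequencies from the row and column totals, computes each cell’s (O - E)²/E, and sums them into the chi-square statistic, a final step the slides do not show. The variable names are illustrative assumptions; the counts are the ones in the hometown table.

```python
# Observed counts from the hometown-size table.
# Rows: Female, Male. Columns: Small, Medium, Large.
observed = [
    [10, 14, 6],
    [4, 1, 1],
]

row_totals = [sum(row) for row in observed]          # [30, 6]
col_totals = [sum(col) for col in zip(*observed)]    # [14, 15, 7]
grand_total = sum(row_totals)                        # 36

# Expected frequency = column total x row total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Per-cell contribution (O - E)^2 / E.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

chi_square = sum(sum(row) for row in contributions)
df = (len(col_totals) - 1) * (len(row_totals) - 1)

for label, exp_row, con_row in zip(("Female", "Male"), expected, contributions):
    print(label,
          [f"{e:.3f}" for e in exp_row],
          [f"{c:.3f}" for c in con_row])
print(f"chi-square({df}) = {chi_square:.3f}")
```

With these counts the sum comes to roughly 2.54 on 2 degrees of freedom, which would then be compared against the table of critical values.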