Chapter 11 Chi-Square and no ANOVA Tests

Slide set to accompany "Statistics Using Technology" by Kathryn Kozak (Slides by David H Straayer) is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Based on a work at http://www.tacomacc.edu/home/dstraayer/published/Statistics/Book/StatisticsUsingTechnology112314b.pdf.

Section 11.1: Chi-Square Test for Independence
• Two categorical variables
• Are they independent, or are they related?
• If they are independent, there cannot be any causal relationship.

Chi-Square test for independence
1. Hypotheses and α
   H0: the two variables are independent
   H1: the two variables are dependent
2. Assumptions: random sample, expected frequencies all at least 5
3. Test statistic and p-value: the test statistic, χ², can be used to calculate a p-value.
4. Conclusion: as always, if p-value < α, reject the null hypothesis in favor of the alternate. There is then sufficient evidence to support the alternate hypothesis that the categorical variables are dependent. Otherwise, there is not enough evidence to support the alternate hypothesis at the stated level α.
5. Interpretation: what does this conclusion imply in the context of the problem?

The test statistic χ²
• This is a single number that captures the central question: how far is the observed data from what we’d expect if the null hypothesis is true?
• It is just the sum of the squares of the differences (but normalized to become a unitless number).

OK, but what is that expectation?
• Let’s work this discussion around an example.

                         Breast Feeding Timelines
Autism          None    Less than    2 to 6    More than    Row Total
                        2 months     months    6 months
Yes              241       198         164        215          818
No                20        25          27         44          116
Column Total     261       223         191        259          934

• Focus on one cell, and ask “what should we expect if autism is independent of breastfeeding?”

Focus on “None” “Yes” cell
• In this sample 818 out of 934 babies had autism (Gee, that seems like a pretty high percentage to me!).
  This works out to about 87.6%.
• If there were no relationship between breastfeeding and autism, that percentage of the 261 babies with “none” breastfeeding would be expected to be autistic. 87.6% of (times) 261 is 228.6 (don’t get freaked out by the fractional baby).

On the other hand…
• What percentage of babies had “none” breastfeeding? It’s 261/934 ≈ 27.9%.
• Out of those 818 babies with autism, we’d expect that percentage of them to have had “none” breastfeeding: (261/934) × 818, that same 228.6.
• For each cell, the expected count is:
  $\text{expected} = \dfrac{(\text{row total}) \times (\text{column total})}{\text{grand total}}$

Now we have a matrix of expectations

Expected Counts
                         Breast Feeding Timelines
Autism          None    Less than    2 to 6    More than    Row Total
                        2 months     months    6 months
Yes             228.6     195.3       167.3      226.8         818
No               32.4      27.7        23.7       32.2         116
Column Total    261       223         191        259           934

How “far” are these two matrices apart?
• In each cell, subtract to calculate a distance.
• Since we’re going to want to add them up to get a grand total distance, we need to square each difference first, to make all the numbers positive, just as we have done before in things like the standard deviation.
• But we divide each square by the expected number first, before we add them up. This makes the resulting sum unit-less.

χ² formula
• $\chi^2 = \sum_{i,j} \dfrac{(\text{observed}_{ij} - \text{expected}_{ij})^2}{\text{expected}_{ij}}$
• For this data, χ² ≈ 11.22
• That might mean something to Dilbert, but what does it tell our pointy-haired boss (and us) about how unusual it would be to get these numbers, if our null hypothesis is true?
• That’s where a distribution function comes in.

Getting to the p-value
• The area of the right tail of a distribution function, beyond the test statistic, is the p-value.
• What is the distribution function? The χ² distribution function.
• Actually, like the t-distribution, there is a whole family of χ² functions, one for each degree of freedom. The d.f. is defined as (row count − 1) × (column count − 1). For this 2 × 4 table, that is (2 − 1) × (4 − 1) = 3.

Shape of χ² distribution

The right-tail on the TI
• χ²cdf(11.2166,1E99,3) ≈ 0.01061
• That’s more than 1%, which is a reasonable α for this problem.
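The calculation walked through above (expected counts, the χ² sum, degrees of freedom, and the right-tail area) can be cross-checked with a short Python sketch. This is not the TI procedure from the slides. The function names `expected_counts`, `chi_square_statistic`, and `chi2_right_tail` are hypothetical, and the tail-area helper is a hand-rolled series using only the standard library, assumed adequate for small whole-number degrees of freedom.

```python
import math

# Observed counts from the slides: rows = autism (Yes, No),
# columns = breastfeeding (None, < 2 months, 2 to 6 months, > 6 months).
observed = [
    [241, 198, 164, 215],
    [20, 25, 27, 44],
]

def expected_counts(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_statistic(table, expected):
    """Sum of (observed - expected)^2 / expected over every cell."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

def chi2_right_tail(x, df):
    """Right-tail chi-square area for whole-number df (hypothetical helper).

    Uses the finite series for the upper incomplete gamma function at
    integer (even df) and half-integer (odd df) orders.
    """
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    term = 2.0 * math.sqrt(half / math.pi)
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (k + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total

expected = expected_counts(observed)
chi2 = chi_square_statistic(observed, expected)
df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi2_right_tail(chi2, df)

print([[round(e, 1) for e in row] for row in expected])
print(round(chi2, 4), df, round(p_value, 5))
```

Run as written, this should agree with the slide values: χ² close to 11.2166 with 3 degrees of freedom, and a right-tail area close to 0.0106.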
• Since the p-value (about 0.0106) is greater than α = 0.01, we’d have to fail to reject the null hypothesis.
• “We don’t have enough evidence, at our stated level of significance, to conclude that breastfeeding timeline is linked to autism.”

Find an easier way using technology
Put the observed data in here
Give permission to put expected values here

Section 11.2: Chi-Square Goodness of Fit
• Sometimes, we want to test a single categorical variable to see if its observed values match theoretical values.
• For example, a casino wants to make sure roulette wheels and dice are “fair”. As long as any departures from “fairness” are very much smaller than the “house edge”, the gambling equipment is safe to use.

FITTEST on the TI
• Some TI84 firmware has a χ²GOF-Test.
• The TI83s don’t have it.
• But it’s not all that hard to calculate a χ² statistic, and it’s even easier with a little program.
• Put your observed data in L2. If you have expected probabilities, put them in L1. If they are all equal, the program will do that for you.

FITTEST, continued
• The program will put expected counts in L3, and the terms that you add up to calculate χ² in L4.
• The program reports the χ² statistic and the corresponding p-value.
• Most of the program is user-interface code, but the “guts” of the program are shown on the next slide.

Babies Birthdays
• http://www.dartmouth.edu/~chance/teaching_aids/data/birthday.txt

Day            Sunday   Monday   Tuesday   Wednesday   Thursday   Friday   Saturday
50 random         7       11        8          9           4         5         6
300 random       33       39       54         43          45        43        43
1300 random     149      206      207        183         178       203       174

• Do those weekend numbers seem a little small?
• Do babies consult calendars before coming out?
• Let’s assume babies don’t, and the chances of being born on all days are the same: 1/7

50 random babies selected

50 babies data interpreted
• χ² = 4.88, p = 0.559
• This means there is a 56% chance of getting data at least this far from uniform if babies are equally likely to be born on any day of the week.
• “There is insufficient evidence that babies’ birth days are not equally likely on every day of the week.”
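The goodness-of-fit calculation described for FITTEST can be sketched in Python. This is not the TI program itself, whose listing is not reproduced here. The names `goodness_of_fit` and `chi2_right_tail_even_df` are hypothetical, the return values only mimic the roles of lists L1 to L4, and the tail-area helper uses a closed form that is assumed sufficient here because seven days give an even 6 degrees of freedom.

```python
import math

def chi2_right_tail_even_df(x, df):
    """Right-tail chi-square area; closed form that holds only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

def goodness_of_fit(observed, probabilities=None):
    """Chi-square goodness of fit, laid out like the FITTEST lists.

    observed      plays the role of L2
    probabilities plays the role of L1 (defaults to all-equal)
    Returns expected counts (like L3), chi-square terms (like L4),
    the chi-square statistic, the degrees of freedom, and the p-value.
    """
    n = sum(observed)
    k = len(observed)
    if probabilities is None:
        probabilities = [1.0 / k] * k
    expected = [n * p for p in probabilities]
    terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    chi2 = sum(terms)
    df = k - 1
    return expected, terms, chi2, df, chi2_right_tail_even_df(chi2, df)

# The "50 random" sample, Sunday through Saturday.
births_50 = [7, 11, 8, 9, 4, 5, 6]
expected, terms, chi2, df, p_value = goodness_of_fit(births_50)
print(round(chi2, 2), df, round(p_value, 3))
```

With the 50-baby sample this should give χ² = 4.88 on 6 degrees of freedom and a p-value near 0.559, matching the slide.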
300 or 1300 random babies
• 300 baby data: χ² = 5.62, p = 0.467. Still not much evidence. This could happen by chance, easily.
• 1300 baby data: χ² = 14.6, p = 0.0234. “With 1300 randomly-selected babies, we have evidence (p = 0.0234) that babies’ birth days were not uniformly distributed among the days of the week.”
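As a worked check of the 1300-baby result, the χ² sum can be written out term by term. This assumes the counts read Sunday through Saturday as 149, 206, 207, 183, 178, 203, 174, and that every day has the same expected count under the null hypothesis.

```latex
E = \frac{1300}{7} \approx 185.71

\chi^2 = \sum_{\text{days}} \frac{(O - E)^2}{E}
\approx \underbrace{7.258}_{\text{Sun}}
      + \underbrace{2.216}_{\text{Mon}}
      + \underbrace{2.440}_{\text{Tue}}
      + \underbrace{0.040}_{\text{Wed}}
      + \underbrace{0.320}_{\text{Thu}}
      + \underbrace{1.609}_{\text{Fri}}
      + \underbrace{0.739}_{\text{Sat}}
\approx 14.62
```

Under that reading, the Sunday term alone accounts for roughly half of the total, which fits the earlier question about the weekend numbers looking small.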