DICE!
You are going to make your own dice, and then we are going to test them (later) to see if they are fair!

Chapter 11: Chi-Squared (Categorical Data)

Chi-Squared
The chi-squared goodness of fit test allows us to determine whether a hypothesized distribution seems valid. (It checks several categories of one distribution at once, not just a single proportion.)
Two other types:
Chi-squared for homogeneity: tells us whether the distribution of a categorical variable differs across groups when there is a treatment/experiment involved.
Chi-squared for association: tells us whether two categorical variables are associated in an observational study.

Chi-Squared Statistic
$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$
(The sum is taken over all categories.)

Carrying Out a Test
Conditions:
Random
Large Sample Size (all expected counts are at least 5)
Independent (individual observations are independent; when sampling without replacement, check the 10% rule: the sample is no more than 10% of the population)
$H_0$: The specified distribution is correct.
$H_a$: The specified distribution is not correct.
Find $\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$ and the p-value with df = k − 1 (number of categories − 1).

The chi-squared test statistic compares observed and expected COUNTS. Don't try to perform calculations with observed minus expected proportions in each category!
When checking the Large Sample Size condition, be sure to examine the EXPECTED counts, not the observed counts.

When were you born?
Are births evenly distributed across the days of the week? The one-way table shows results of a random sample of 140 births from local records.

Day     Sun  Mon  Tues  Wed  Thurs  Fri  Sat
Births   13   23    24   20     27   18   15

Do these data give significant evidence that local births are not equally likely on all days of the week?
State:
Plan:
Do:
Conclude:

Follow-Up Analysis
When you reject your null hypothesis, follow up by looking at the individual components, (Observed − Expected)² / Expected, one per category. These components show which categories contribute most to the chi-squared statistic, that is, which values helped push your data far enough to reject.

Homework
Pg 692 (1-9 odd, 11, 15, 19-22)
Let's do #11 together!

#11
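
Returning to the "When were you born?" example, the "Do" step can be checked numerically. The following is a minimal Python sketch, not the textbook's method: the helper names `chi2_sf` and `goodness_of_fit` are made up for illustration, the null hypothesis is assumed to be that all seven days are equally likely, and the p-value is computed with a standard series for the incomplete gamma function instead of a table or calculator.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series expansion of the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def goodness_of_fit(observed, null_props):
    """Return (expected counts, components, chi-squared statistic, df, p-value)."""
    n = sum(observed)
    expected = [n * p for p in null_props]
    # Large Sample Size condition: check the EXPECTED counts, not the observed ones.
    if min(expected) < 5:
        raise ValueError("Large Sample Size condition fails: an expected count is below 5")
    # One component per category: (Observed - Expected)^2 / Expected, using counts.
    components = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    statistic = sum(components)
    df = len(observed) - 1  # number of categories - 1
    return expected, components, statistic, df, chi2_sf(statistic, df)


days = ["Sun", "Mon", "Tues", "Wed", "Thurs", "Fri", "Sat"]
births = [13, 23, 24, 20, 27, 18, 15]
null_props = [1 / 7] * 7  # assumed H0: births are equally likely on every day

expected, components, chi2, df, p_value = goodness_of_fit(births, null_props)
print(f"chi-squared = {chi2:.2f}, df = {df}, p-value = {p_value:.4f}")
for day, comp in zip(days, components):
    print(f"{day:>5}: {comp:.2f}")
```

With these counts every expected count is 20, so the Large Sample Size condition is met. The statistic comes out to about 7.6 with df = 6 and a p-value of about 0.27. At a significance level of 0.05 (an assumption here, since the slides do not state one), we fail to reject the null hypothesis: the data do not give convincing evidence that births are unevenly distributed across the days. The printed components show that Sunday and Thursday contribute the most (2.45 each), which is the kind of breakdown a follow-up analysis would use if the test had rejected.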