The Chi Square Equation
Statistics in Biology

Background
• The chi square (χ²) test is a statistical test that compares observed results with theoretically expected results.
• The calculation generates a χ² value; the higher the value, the greater the difference between the observed and expected results.
• The data used in calculating a chi square statistic must be random, raw (counts, not percentages), mutually exclusive, drawn from independent variables, and drawn from a large enough sample.

Use this test when:
• The measurements relate to the number of individuals in particular categories
• The observed number can be compared with an expected number that is calculated from a theory
• Ex: Hardy-Weinberg theory in evolution; Mendelian probability in genetics (Punnett squares)

Steps to perform the test:
1. State the null hypothesis
• This is a negative statement saying that there is no statistically significant difference between the observed and expected results
• Ex: any difference you observe is due only to chance; the variable you tested had no significant effect
2. Calculate the expected value
• When no particular ratio is predicted, this may be the mean of the observed values (the total shared equally among the categories)
• When studying inheritance, you add up the observed values and apply the expected ratio (3:1 or 1:2:1 or 9:3:3:1) to the total
3. Calculate χ²
• The formula is: $\chi^2 = \sum \frac{(O - E)^2}{E}$, where O is the observed number and E is the expected number in each category, summed over all categories
4. Know the degrees of freedom
• This is calculated using the formula (n − 1)
• where n = the number of categories (sets of results)
5. Compare the χ² value against a table of critical values
• Refer to the degrees of freedom
• Look up the critical number at the p = 0.05 level
6. Make a conclusion
• Biologists need to feel confident in their results in order to say that a difference occurred for a biological reason
• They will only accept this if they have greater than 95% confidence
• If they have less than 95% confidence, they are only willing to say that the difference between the results occurred due to chance alone

Making a Conclusion:
• If the χ² value exceeds the critical number at the 0.05 level, then as a biologist, you can reject the null hypothesis
• If the χ² value is less than the critical number, then you cannot reject the null hypothesis (in the wording used here, you "accept" it)
• Ex: the calculated value is greater than the critical value, so the null hypothesis is rejected and there is a significant difference between the observed and expected results at the 5% level of probability

Simple Example: Consider tossing a coin 100 times
• The expected result of tossing a fair coin 100 times is that heads will come up 50 times and tails will come up 50 times.
• The actual result might be that heads comes up 45 times and tails comes up 55 times.
• The chi square statistic measures the discrepancy between the expected results and the actual results.

1. State the null hypothesis
• Our null hypothesis would be that the coin is fair: any difference between the observed and expected counts is due to chance and not the result of some outside force (like a weighted coin)
2. Calculate the expected value
• 100 flips divided by 2 = 50 heads and 50 tails
3. Calculate χ²
• So we have observed 45 H and 55 T
• But we expected 50 H and 50 T
• Start with the first category: heads
• Observed (O) = 45, Expected (E) = 50
• Complete O − E: 45 − 50 = −5
• Square the answer: (−5)² = 25
• Divide the answer by the number you had expected: 25/50 = 0.5
• Continue with the second category: tails
• Observed (O) = 55, Expected (E) = 50
• Complete O − E: 55 − 50 = 5
• Square the answer: (5)² = 25
• Divide the answer by the number you had expected: 25/50 = 0.5
• Add the two together to get the chi square (χ²) value:
• 0.50 + 0.50 = 1.0
• So the χ² value is 1.0
4. Degrees of Freedom
• Degrees of freedom (df) = number of categories − 1
• In this case: df = 2 categories (heads and tails) − 1 = 1

The Math
• We now have the χ² value (1.0) and the df (1)
• Look on the chart

The Chart – Deciphering the Code
• Find the row that matches your degrees of freedom
• Move across that row until you get to either your number or a range that your number would be in
• Look up at the number that is at the top of your column; this represents the probability of getting a χ² value at least that large by chance alone if your null hypothesis is true
• With our χ² and df, our number falls between 0.5 and 0.1
• So the probability of getting a difference at least as large as the one we saw by chance would be between 10 and 50%

The "magic" column: p = 0.05
• In scientific research, the probability value of 0.05 is taken as the common cutoff level of significance.
• A probability value (p-value) of 0.05 means that, if the null hypothesis were true, a difference at least as large as the one observed would arise by chance only 5% of the time. A p-value below 0.05 is therefore treated as a significant difference.
• Therefore, if your p-value is greater than 0.05, you would accept the null hypothesis:
• "The difference between my observed results and my expected results is due to random chance alone and is not significant."
• Based on our chi square value, we should accept our null hypothesis
• We would expect a deviation at least this large between 10 and 50% of the time if we repeated the test with a fair coin
(Diagram: a χ² value of 1.0 falls on the "accept the null hypothesis" side of the critical value.)

The Chart – Deciphering the Code
• If we had calculated a value of 6.0 instead of 1.0, we would expect to get a value at least that large between 1 and 2.5% of the time by chance
• That is very rare!
• It's so rare that something other than chance is probably happening to produce that value
• We should reject the hypothesis that "nothing" is happening
(Diagram: a χ² value of 6.0 falls on the "reject the null hypothesis" side of the critical value.)

A helpful *revised Chi Square Table
Probability values (p) of deviation from hypothesis, by degrees of freedom (df)

df      0.95   0.80   0.70   0.50   0.30   0.20   0.10   0.05   0.01   0.005
1       0.004  0.06   0.15   0.46   1.07   1.64   2.71   3.84   6.64   7.88
2       0.10   0.45   0.71   1.39   2.41   3.22   4.60   5.99   9.21   10.59
3       0.35   1.00   1.42   2.37   3.67   4.64   6.25   7.82   11.34  12.84
4       0.71   1.65   2.20   3.36   4.88   5.99   7.78   9.49   13.28  14.86
5       1.14   2.34   3.00   4.35   6.06   7.29   9.24   11.07  15.09  16.75
6       1.64   3.07   3.83   5.35   7.23   8.56   10.65  12.59  16.81  18.55
7       2.17   3.82   4.67   6.35   8.38   9.80   12.02  14.07  18.48  20.28

• Columns p = 0.95 to 0.10: deviation not significant. Chi square value consistent with null hypothesis = accept null
• Column p = 0.05: deviation significant. Columns p = 0.01 and 0.005: deviation highly significant. Not consistent = reject null

Practice Analysis Questions:
1) Suppose you were to obtain a chi square value of 7.82 or greater in your data analysis (with 2 degrees of freedom). What would this indicate?
2) Suppose you were to obtain a chi square value of 4.60 or lower in your data analysis (with 2 degrees of freedom). What would this indicate?
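
The coin-toss example worked through above can also be checked with a few lines of Python. This is a minimal sketch, not a definitive implementation: the names chi_square, expected_from_ratio, rejects_null and CRITICAL_005 are made up for illustration, and the critical values are simply the p = 0.05 column copied from the table above (df 1 to 7 only).

```python
# Critical values at p = 0.05, copied from the table above (df 1 to 7).
# Hypothetical helper names; nothing here comes from a statistics library.
CRITICAL_005 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49,
                5: 11.07, 6: 12.59, 7: 14.07}


def expected_from_ratio(total, ratio):
    """Split a total count according to a ratio such as (1, 1) or (9, 3, 3, 1)."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]


def chi_square(observed, expected):
    """Add up (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of categories")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def rejects_null(chi2, df):
    """True when the chi square value exceeds the p = 0.05 critical value."""
    return chi2 > CRITICAL_005[df]


# Coin example: 45 heads and 55 tails observed, a fair coin (1:1) expected.
observed = [45, 55]
expected = expected_from_ratio(sum(observed), (1, 1))  # [50.0, 50.0]
chi2 = chi_square(observed, expected)                  # 0.5 + 0.5
df = len(observed) - 1                                 # 2 categories - 1

print(chi2, df, rejects_null(chi2, df))  # 1.0 1 False
```

A value of 1.0 does not exceed 3.84, so the null hypothesis is not rejected, matching the conclusion above. Calling rejects_null(6.0, 1) returns True, matching the 6.0 case. The same expected_from_ratio helper could be given a Mendelian ratio such as (9, 3, 3, 1) in place of (1, 1).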