BS 105 Statistical Analysis Chi-Square Test Statistics can be used to determine if differences among groups are significant, or simply the result of chance. A common statistical method used to determine if observed experimental data are a good approximation to the expected or theoretical data is the Chi-square test. In short, this test can determine if deviations from the expected values are due to chance alone, or to the effect of the independent variable. When conducting an experiment, a researcher first states the hypothesis about how the experiment will turn out. The hypothesis is typically stated as two possible outcomes, the null hypothesis (H0) of no effect and the alternate hypothesis (H1) of an effect. Based upon the data, the researcher will then accept one hypothesis and reject the other. An example is given below. A study was conducted of freshmen students to see if there would be an improvement in individual test scores when the students were placed in supervised study groups at the start of the semester. Half of the students were assigned to study groups and half were not. Null hypothesis (H0) = Placement in study groups will result in no improvement in student test scores. Alternate hypothesis (H1) = Placement in study groups will result in improvement in student test scores. To determine if the observed data fall within acceptable limits; a Chi-square analysis is performed to test the validity of the null hypothesis; that there is no statistically significant difference between the observed and expected data. If the Chi-square analysis indicates that the data vary too much from the expected results, then the alternate hypothesis is accepted. The formula for Chi-square is: Χ2 = ∑ (o-e)2 e where o = observed number of individuals e = expected number of individuals ∑ = sum of the values (in this case, the differences, squared, divided by the number expected) An Example Experiment and Sample Calculations: In mammals, sex determination (male vs. female) of the offspring occurs at fertilization. This is not the case for many species of reptiles, which show temperature-dependent sex determination (TSD). In TSD reptiles, the temperature at which the eggs or embryos are incubated during development determines the sex or gender of the offspring. The relationship between incubation temperatures and gender is not consistent across TSD reptile species. In some species, cool incubation temperatures result in more males and in other species more females. Similarly, in some species warm incubation temperatures result in more males and in other species more females. The paragraph above is based on experimental evidence and is therefore scientifically supported. The paragraph below is based on fiction – for the purpose of illustrating the use of Chi-square in evaluation of data. A researcher is working with a turtle species for which there is limited information available on reproduction. The researcher knows that eggs incubated at 29oC will produce equal numbers of male and female offspring. The researcher wants to determine if this is a TSD species and if so, what the effect of warm incubation temperatures might be on sex of the offspring. The researcher proposes two hypotheses: Null hypothesis (H0) = Incubation of eggs at a warm temperature will not affect the sex ratio of offspring. Alternate hypothesis (H1) = Incubation of eggs at a warm temperature will result in an altered sex ratio of offspring. The researcher obtains 200 eggs of the turtle species and incubates them at 31.5 oC. At hatching there are 80 male and 120 female offspring. The researcher performs a Chi-square analysis of the data, as indicated below: (o-e)2 e Observed (o) Expected (e) (o-e) (o-e)2 Male 80 100 -20 400 4.00 Female 120 100 20 400 4.00 Sex/Gender Χ2 = 8.00 The Chi-square value is simply the sum of the final column. This Χ2 value is then compared to the following table. Critical Values of the Chi-square Distribution DEGREES OF FREEDOM (df) Probability (p) 1 2 3 4 5 0.05 3.84 5.99 7.82 9.49 11.1 0.01 6.64 9.21 11.3 13.2 15.1 0.001 10.8 13.8 16.3 18.5 20.5 How to use the Critical Values Table: 1. Determine the degrees of freedom (df) for the experiment. This is simply the number of categories minus 1. Since there are two possible categories for this example (male and female), the degrees of freedom is 1 (2 – 1). 2. Find the p value. Under the 1 df column find the critical value in the probability (p) = 0.01 row: it is 6.64. What does this mean? If the calculated Chi-square value is greater than or equal to the critical value from the table, then the null hypothesis is rejected. Since our calculated Χ2 value is 8.00 and 8.00 > 6.64, we reject the null hypothesis that there is no statistically significant difference between the observed and expected data. In other words, chance alone cannot explain the deviations the researcher observed. In the sciences the minimum probability for rejecting a null hypothesis is generally 0.05. 3. The results of the turtle experiment are said to be significant at a probability of p = 0.01. This means that only 1% of the time would you expect to see similar data if the null hypothesis was correct. Therefore, you are 99% certain that the data do not fit the expected 1:1 ratio. 4. If the calculated value was 4.5 then the null hypothesis would still be rejected, but this time at a probability of p = 0.05 (4.5 > 3.84, but < 6.64). This means that less than 5% of the time would you expect to collect the observed data if the null hypothesis was correct. Stated differently, you would be 95% sure that the data do not fit the expected 1:1 ratio. 5. Since these data do not fit the expected 1:1 ratio, you must consider reasons for this variation. The logical conclusion here is that the warmer temperature has caused a significant difference in the number of male vs. female offspring. In other words, the data support the alternate hypothesis. Applying this information to the “Pill Bug” Experiment: 1. The number of categories in each part of the experiment is always two (acid vs. base; food vs. no-food, wet vs. dry, light vs. shade). 2. Determine the expected values! Since we have no reason to expect anything otherwise, we assume that half of the pill bugs will be in one chamber and half in the other (in other words, a 1:1 ratio), regardless of the particular treatment. So…total the number of pill bugs for a given treatment (this is actually done for you on the data sheet) and divide that number in half to calculate the expected number of pill bugs in each side of the choice chamber.