The Chi Square Equation
Statistics in Biology
Background
• The chi square (χ²) test is a statistical test that compares
observed results with theoretically expected results
• The calculation generates a χ² value; the higher the value,
the greater the difference between the observed and
expected results.
• The data used in calculating a chi square statistic must be
raw counts (not percentages or averages), randomly sampled,
sorted into mutually exclusive categories, independent of one
another, and drawn from a large enough sample.
Use this test when:
• The measurements relate to the number of individuals in
particular categories
• The observed number can be compared with an expected number
which is calculated from a theory
• Ex: Hardy-Weinberg theory in evolution; Mendelian
probability in genetics (Punnett squares)
Steps to perform the test:
• 1. State the null hypothesis
• This is a negative statement saying that there is no
statistically significant difference between the observed
and expected results
• Ex: any difference between the data you observe and the
expected results is due only to chance; the variable you
tested did not yield significant results
• 2. Calculate the expected value
• This may be the mean of the observed values (when every
category is expected to be equally common)
• When studying inheritance, you add up the observed
values and split the total according to the expected ratio
(3:1 or 1:2:1 or 9:3:3:1)
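As a rough illustration of applying a ratio, the sketch below splits a total count across categories. The function name `expected_counts` and the total of 160 offspring are hypothetical, chosen only so the arithmetic comes out in whole numbers.

```python
def expected_counts(total, ratio):
    """Split a total count across categories according to a ratio."""
    parts = sum(ratio)  # e.g. 9 + 3 + 3 + 1 = 16 parts
    return [total * r / parts for r in ratio]


# Hypothetical dihybrid cross: 160 offspring counted, 9:3:3:1 expected
print(expected_counts(160, [9, 3, 3, 1]))  # [90.0, 30.0, 30.0, 10.0]

# Two equally likely categories (a 1:1 ratio), as with 100 coin flips
print(expected_counts(100, [1, 1]))  # [50.0, 50.0]
```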
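• 3. Calculate χ²
• The formula is: χ² = Σ (O − E)² / E
• where O = the observed number and E = the expected number
in each category, and Σ means "add up over all categories"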
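A minimal sketch of the calculation, assuming the standard form of the statistic (the sum over all categories of (O − E)² / E), which matches the worked coin example later in these slides. The function name `chi_square` and the 290/110 counts are hypothetical.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of categories")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical 3:1 cross with 400 offspring: expected 300 and 100
value = chi_square([290, 110], [300, 100])
print(round(value, 2))  # 1.33
```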
• 4. Know the degrees of freedom
• This is calculated using the formula (n − 1)
• where n = the number of categories (sets of results)
• 5. Compare the χ² value against a table of critical values
• Refer to the degrees of freedom
• Look up the critical number at the p=0.05 level
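The lookup in steps 4 and 5 can be sketched as a small dictionary. The critical values are copied from the p = 0.05 column of the chi square table in these slides (1 to 7 degrees of freedom); the names `CRITICAL_0_05` and `critical_value` are hypothetical.

```python
# Critical values at p = 0.05, keyed by degrees of freedom
CRITICAL_0_05 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49, 5: 11.07, 6: 12.59, 7: 14.07}


def critical_value(num_categories):
    """Return the p = 0.05 critical value for a given number of categories."""
    df = num_categories - 1  # degrees of freedom = n - 1
    return CRITICAL_0_05[df]


print(critical_value(2))  # 3.84  (e.g. heads and tails)
print(critical_value(4))  # 7.82  (e.g. a 9:3:3:1 cross)
```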
• 6. Make a conclusion:
• Biologists need to be confident in their results before
saying that a difference occurred due to a biological
reason
• They will only accept this if they have greater than 95%
confidence
• If they have less than 95% confidence, they are only
willing to say that the difference between the results
could have occurred due to chance alone
Making a Conclusion:
• If the χ² value exceeds the critical number at the 0.05 level,
then as a biologist, you can reject the null hypothesis
• If the χ² value is less than the critical number, then you can
accept the null hypothesis (strictly speaking, you fail to
reject it)
• Ex: the calculated value is greater than the critical value
so the null hypothesis is rejected and there is a
significant difference between the observed and
expected results at the 5% level of probability
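The decision rule above can be written as a short function. This is only a sketch: the name `conclude` is hypothetical, and treating a value exactly equal to the critical number as "accept" is an assumption, since the slides only cover "exceeds" and "less than".

```python
def conclude(chi2, critical):
    """Apply the decision rule at the chosen significance level."""
    if chi2 > critical:
        return "reject the null hypothesis"
    # Assumption: a value equal to the critical number does not "exceed" it
    return "accept the null hypothesis"


# 3.84 is the critical value for 1 degree of freedom at p = 0.05
print(conclude(1.0, 3.84))  # accept the null hypothesis
print(conclude(6.0, 3.84))  # reject the null hypothesis
```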
Simple Example:
Consider tossing a coin 100 times
• The expected result of tossing a fair coin 100 times is
that heads will come up 50 times and tails will come up
50 times.
• The actual result might be that heads comes up 45 times
and tails comes up 55 times.
• The chi square statistic will show any discrepancies
between the expected results and the actual results.
1. State the null hypothesis
• Our null hypothesis would be that the coin is fair: any
difference between the observed and expected flips is due
to chance and not the result of some outside force (like a
weighted coin)
2. Calculate the expected value
• 100 flips divided by 2 = 50 heads and 50 tails
3. Calculate χ²
• So we have observed 45 H and 55 T
• But we expected 50 H and 50 T
• Start with the first category – heads
• Observed (O) = 45, Expected (E) = 50
• Calculate O − E: 45 − 50 = −5
• Square the answer: (−5)² = 25
• Divide the answer by the number you had expected:
25/50 = 0.5
• Continue with the second category – tails
• Observed (O) = 55, Expected (E) = 50
• Calculate O − E: 55 − 50 = 5
• Square the answer: (5)² = 25
• Divide the answer by the number you had expected:
25/50 = 0.5
• Add the two together to get the chi square (χ²) value:
• 0.50 + 0.50 = 1.0
• So the χ² value is 1.0
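The coin calculation can be retraced category by category, using the observed and expected counts from the example. The variable names are illustrative only.

```python
observed = {"heads": 45, "tails": 55}
expected = {"heads": 50, "tails": 50}

chi2 = 0.0
for category in observed:
    o, e = observed[category], expected[category]
    diff = o - e                  # O - E
    contribution = diff ** 2 / e  # (O - E)^2 / E
    print(f"{category}: O - E = {diff}, squared = {diff ** 2}, divided by E = {contribution}")
    chi2 += contribution          # add the categories together

print("chi square =", chi2)  # chi square = 1.0
```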
4. Degrees of Freedom
• Degrees of freedom (df) = number of categories − 1
• In this case:
• df = 2 categories (heads and tails) − 1 = 1
The Math
• We now have the χ² value (1.0) and the df (1)
• Look on the chart
The Chart – Deciphering the Code
• Find the row that matches your degrees of freedom
• Move across that row until you get to either your number
or a range that your number would be in
• Look up at the number that is at the top of your column
– this represents the probability of getting a χ² value at
least as large as yours by chance alone, assuming your null
hypothesis is true
• With our χ² (1.0) and df (1), our number falls between the
0.5 and 0.1 columns
• So the probability of us getting the numbers we saw by
chance would be between 10 and 50%
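For readers who want a number instead of a range, the probability can be computed directly when there is 1 degree of freedom. This sketch relies on a standard identity for that special case (the upper-tail probability equals erfc(√(χ²/2))); it does not apply to other degrees of freedom, and the function name `p_value_df1` is hypothetical.

```python
import math


def p_value_df1(chi2):
    """Upper-tail probability of a chi square value with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


p = p_value_df1(1.0)
print(round(p, 3))      # 0.317
print(0.1 < p < 0.5)    # True: inside the 10-50% range read off the chart
```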
The “magic” column: p=0.05
• In scientific research, the probability value of 0.05 is taken as the
common cut off level of significance.
• A probability value (p-value) of .05 means that, if the null
hypothesis were true, a difference this large between the observed
and the expected data would arise by random chance only 5% of the
time. A difference that rare is treated as real and repeatable: in
other words, a significant difference.
• Therefore, if your p-value is greater than .05, you would accept
the null hypothesis:
• “The difference between my observed results and my expected
results is due to random chance alone and is not significant.”
• Based on our chi square value, we should accept our null
hypothesis
• With a fair coin, we would expect flip data that differ from
50/50 at least as much as ours between 10 and 50% of the time
if we did the test again
[Chart: the χ² value of 1.0 falls in the "accept the null hypothesis" region]
• If we had calculated a value of 6.0 instead of 1.0, we would
expect to get a value that large between 1 and 2.5% of the time
• That is very rare!
• It’s so rare that something other than chance is probably
happening to produce that value
• We should reject the hypothesis that “nothing” is happening
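The "how often would this happen by chance" idea can be checked for the coin example without a chart. The sketch below assumes a fair coin and 100 flips, and adds up the exact probability of every heads count whose χ² value reaches a given threshold. The function name `chance_of_chi2_at_least` is hypothetical.

```python
from math import comb


def chance_of_chi2_at_least(threshold, flips=100):
    """Probability that a fair coin gives a chi square value >= threshold."""
    expected = flips / 2
    total = 0.0
    for heads in range(flips + 1):
        tails = flips - heads
        chi2 = (heads - expected) ** 2 / expected + (tails - expected) ** 2 / expected
        if chi2 >= threshold:
            # Exact probability of this many heads with a fair coin
            total += comb(flips, heads) / 2 ** flips
    return total


print(chance_of_chi2_at_least(1.0))  # lands between 0.10 and 0.50
print(chance_of_chi2_at_least(6.0))  # lands between 0.01 and 0.025
```

Because coin flips come in whole numbers, these exact probabilities differ slightly from the smooth chi square curve behind the chart, but they fall in the same ranges.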
[Chart: the χ² value of 6.0 falls in the "reject the null hypothesis" region]
A helpful revised Chi Square Table

Degrees of     Probability values (p)
freedom (n)    0.95    0.80    0.70    0.50    0.30    0.20    0.10    0.05    0.01    0.005
1              0.004   0.06    0.15    0.46    1.07    1.64    2.71    3.84    6.64    7.88
2              0.10    0.45    0.71    1.30    2.41    3.22    4.60    5.99    9.21    10.59
3              0.35    1.00    1.42    2.37    3.67    4.64    6.25    7.82    11.34   12.38
4              0.71    1.65    2.20    3.36    4.88    5.99    7.78    9.49    13.28   14.86
5              1.14    2.34    3.00    4.35    6.06    7.29    9.24    11.07   15.09   16.75
6              1.64    3.07    3.38    5.35    7.23    8.56    10.65   12.59   16.81   18.55
7              2.17    3.84    4.67    6.35    8.38    9.80    12.02   14.07   18.48   20.28

• p = 0.95 to 0.10: deviation from hypothesis not significant.
Chi square value consistent with null hypothesis = accept null
• p = 0.05: deviation significant. Not consistent = reject null
• p = 0.01 and 0.005: deviation highly significant. Not
consistent = reject null
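Reading the table can also be sketched in code. The values below are copied from the table above; the names `P_COLUMNS`, `TABLE`, and `bracket_p` are hypothetical, and the function mimics moving across a row until the χ² value no longer reaches the next column.

```python
P_COLUMNS = [0.95, 0.80, 0.70, 0.50, 0.30, 0.20, 0.10, 0.05, 0.01, 0.005]
TABLE = {
    1: [0.004, 0.06, 0.15, 0.46, 1.07, 1.64, 2.71, 3.84, 6.64, 7.88],
    2: [0.10, 0.45, 0.71, 1.30, 2.41, 3.22, 4.60, 5.99, 9.21, 10.59],
    3: [0.35, 1.00, 1.42, 2.37, 3.67, 4.64, 6.25, 7.82, 11.34, 12.38],
    4: [0.71, 1.65, 2.20, 3.36, 4.88, 5.99, 7.78, 9.49, 13.28, 14.86],
    5: [1.14, 2.34, 3.00, 4.35, 6.06, 7.29, 9.24, 11.07, 15.09, 16.75],
    6: [1.64, 3.07, 3.38, 5.35, 7.23, 8.56, 10.65, 12.59, 16.81, 18.55],
    7: [2.17, 3.84, 4.67, 6.35, 8.38, 9.80, 12.02, 14.07, 18.48, 20.28],
}


def bracket_p(chi2, df):
    """Return (upper, lower) bounds on p; None means beyond the table's edge."""
    upper, lower = None, None
    for p, critical in zip(P_COLUMNS, TABLE[df]):
        if chi2 >= critical:
            upper = p   # value reaches this column, so p is at most this
        else:
            lower = p   # value falls short of this column, so p is above this
            break
    return upper, lower


print(bracket_p(1.0, 1))  # (0.5, 0.3)
print(bracket_p(6.0, 1))  # (0.05, 0.01)
```

With the extra columns in this table, the coin example (χ² = 1.0, df = 1) is narrowed to between 30 and 50%, which sits inside the 10 to 50% range quoted earlier.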
Practice Analysis Questions:
1) Suppose you were to obtain a Chi-square value of 7.82
or greater in your data analysis (with 2 degrees of freedom).
What would this indicate?
2) Suppose you were to obtain a Chi-square value of 4.60
or lower in your data analysis (with 2 degrees of freedom).
What would this indicate?