Chi Square Analysis

UNIT 6: MENDELIAN GENETICS
CHI SQUARE ANALYSIS
Ms. Gaynor
Honors Genetics

The chi square analysis allows you to use
statistics to determine if your data is
“good” or not.

Is your data “good” enough to accept your
hypothesis?
The test allows us to check for deviations of
observed frequencies from expected
frequencies.

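The following formula is used:
χ² = Σ (o - e)² / e
where o = the observed value and e = the expected value for each
group (category) of data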
You need 2 different hypotheses:
1. NULL Hypothesis
•Data are occurring by chance and it is all
RANDOM! There is NO preference between the
groups of data.
2. Alternative Hypothesis
•Data are occurring by some outside force. It is
NOT by chance and it is NOT RANDOM! There
is preference between the groups of data.

This statistical test is compared to a theoretical
probability distribution

These probability (p) values are on the Chi Square
distribution table
HOW DO YOU USE THIS TABLE PROPERLY?
You need to determine the degrees of freedom


Degrees of freedom is the # of groups
(categories) in your data minus one (1)
If the probability (p) read from the table is
less than .05 or 5%, then your null hypothesis
is rejected and your alternative hypothesis is supported…the
deviation is NOT due to randomness alone!
Two Types of Hypotheses:
1. NULL HYPOTHESIS



States that there is no substantial
statistical deviation between observed and
expected data.
A hypothesis of no difference (or no effect) is
called a null hypothesis, symbolized H0
In other words, the results are
totally random and occurred by
chance alone. There is NO preference.
The null hypothesis states that the two variables
are independent, or that there is NO relationship
to one another.
Null Hypothesis Example




A scientist is studying bees and butterflies.
Her hypothesis is that a single bee visiting a
flower will pollinate with a higher efficiency than a
single butterfly, which will help produce a greater
number of seeds in the flower bean pod.
We will call this hypothesis H1, or an alternative
hypothesis, because it is an alternative to the null
hypothesis.
What is the null hypothesis?
H0: There is no difference between bees and
butterflies in the number of seeds produced by
the flowers they pollinate.
Two Types of Hypotheses:
2. ALTERNATIVE HYPOTHESIS

States that there IS a substantial
statistical deviation between observed
and expected data.
A hypothesis of difference (or effect) is
called an alternative hypothesis, symbolized
H1

In other words, the results are
affected by an outside force and
are NOT random and did NOT occur
by chance alone. There is a
preference.
2 Types of Chi Square Problems
1. Non-genetic
2. Genetic

Non-genetic

Null Hypothesis:

Data is due to chance and is completely random. There is no
preference between the groups/categories.

Alternative Hypothesis

Data is NOT due to chance and there IS a preference between
the groups/categories. Data is not random.
Genetic

Null Hypothesis:

Data is due to chance and is random due to independent
assortment being random. Punnett square ratios are expected.


If there are 2+ genes involved in the experiment…There is no gene
linkage affecting independent assortment & segregation. Punnett square
ratios are expected.
Alternative Hypothesis

Data is NOT due to chance and is NOT random. Punnett square
ratios are NOT expected.

If there are 2+ genes …There IS gene linkage affecting independent
assortment & segregation
Let’s look at a fruit fly
cross and their phenotypes

Parents: Wild type (BBEE) x Black body, eyeless (bbee)
F1: all wild type (BbEe)
F1 x F1 offspring:
5610  Wild type
1881  Wild-type body, eyeless
1896  Black body, wild-type eyes
622   Black body, eyeless
Analysis of the results



Once the numbers are in, you have to
determine the expected values for this cross.
These come from your hypothesis, called the null
hypothesis (no gene linkage is occurring).
What are the expected outcomes of this
cross?
F1 Cross: BbEe x BbEe




9/16 should be wild type (normal body, wildtype eyes)
3/16 should be normal body eyeless
3/16 should be black body wild eyes
1/16 should be black body eyeless.
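
The expected counts for the cross can be worked out from these fractions. The short Python sketch below is only an illustration: the variable names are made up for this example, and the total is taken to be the sum of the four offspring counts given for the F1 x F1 cross.

```python
from fractions import Fraction

# Total F1 x F1 offspring counted in the fruit fly cross (assumed to be
# the sum of the four phenotype counts from the notes).
total_offspring = 5610 + 1881 + 1896 + 622

# Expected 9:3:3:1 Punnett square fractions for BbEe x BbEe.
expected_fractions = {
    "wild type": Fraction(9, 16),
    "normal body, eyeless": Fraction(3, 16),
    "black body, wild eyes": Fraction(3, 16),
    "black body, eyeless": Fraction(1, 16),
}

# Expected count = fraction of the total for each phenotype.
expected_counts = {
    phenotype: float(fraction * total_offspring)
    for phenotype, fraction in expected_fractions.items()
}

for phenotype, count in expected_counts.items():
    print(f"{phenotype}: {count:.1f}")
```

Rounding each expected count to a whole fly gives the values used in the chi square calculation.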

If your null hypothesis is supported by data
•you are claiming that mating is random as well as
segregation and independent assortment.
If your null hypothesis is not supported by data
•you are seeing that the observed and expected
values are so far apart that something non-random
must be occurring…GENE LINKAGE!!!
Now Conduct the Analysis:
To compute the expected value for the smallest group,
take 10009/16 = 626
(a.k.a. 1/16 of total offspring)
Remember: To compute the
expected value take 10009/16 = 626

1. Using the chi square formula, compute the chi
square value (χ²) for this cross:
Calculate (o-e)²/e for EACH phenotype

(5610 - 5630)²/5630 = .07
(1881 - 1877)²/1877 = .01
(1896 - 1877)²/1877 = .20
(622 - 626)²/626 = .02

2. Sum all numbers to get your chi square value

χ² = .07 + .01 + .20 + .02 = .30
Determine how many degrees of freedom are in
your experiment:

4 (phenotype) groups - 1 = 3
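
The chi square value and the degrees of freedom for this cross can be double-checked with a few lines of Python. This is a minimal sketch, not part of the original lesson: the function name `chi_square` is made up for illustration, and the expected values are left unrounded (10009 x 9/16 and so on).

```python
def chi_square(observed, expected):
    """Sum (o - e)^2 / e over all groups (categories)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Observed counts, in the same order as the calculation in the notes.
observed = [5610, 1881, 1896, 622]
total = sum(observed)

# Expected counts from the 9:3:3:1 ratio (kept unrounded here).
ratio = [9, 3, 3, 1]
expected = [total * r / sum(ratio) for r in ratio]

# One (o - e)^2 / e term per phenotype, then the sum.
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2 = chi_square(observed, expected)

# Degrees of freedom = number of groups minus one.
degrees_of_freedom = len(observed) - 1

print([round(t, 2) for t in terms], round(chi2, 2), degrees_of_freedom)
```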
I Have my Chi Square
Value (χ²)….What next?

Figure out which hypothesis is accepted:


Your NULL hypothesis = 9:3:3:1 ratio is seen due
to non-linkage genetics (independent
assortment/segregation is occurring)
The alternative hypothesis = any change from
the expected is due to SOME OUTSIDE FORCE!
IT IS NOT RANDOM! THE GENES ARE LINKED!
To figure out which hypothesis is accepted, you
need to use the CHI SQUARE TABLE, which
lists CRITICAL VALUES!
The chi square value is useful because we can use it
to look up the probability (p) that a deviation this
large between observed and expected data
occurs by chance alone.
CHI-SQUARE DISTRIBUTION TABLE

Degrees of                          Probability (p)
Freedom      0.95   0.90   0.80   0.70   0.50   0.30   0.20   0.10   0.05   0.01  0.001
1           0.004   0.02   0.06   0.15   0.46   1.07   1.64   2.71   3.84   6.64  10.83
2            0.10   0.21   0.45   0.71   1.39   2.41   3.22   4.60   5.99   9.21  13.82
3            0.35   0.58   1.01   1.42   2.37   3.66   4.64   6.25   7.82  11.34  16.27
4            0.71   1.06   1.65   2.20   3.36   4.88   5.99   7.78   9.49  13.28  18.47
5            1.14   1.61   2.34   3.00   4.35   6.06   7.29   9.24  11.07  15.09  20.52
6            1.63   2.20   3.07   3.83   5.35   7.23   8.56  10.64  12.59  16.81  22.46
7            2.17   2.83   3.82   4.67   6.35   8.38   9.80  12.02  14.07  18.48  24.32
8            2.73   3.49   4.59   5.53   7.34   9.52  11.03  13.36  15.51  20.09  26.12
9            3.32   4.17   5.38   6.39   8.34  10.66  12.24  14.68  16.92  21.67  27.88
10           3.94   4.86   6.18   7.27   9.34  11.78  13.44  15.99  18.31  23.21  29.59

Columns p = 0.95 to 0.10: Accept Null Hypothesis (chance ONLY)
Columns p = 0.05 to 0.001: Reject Null Hypothesis (NOT chance ONLY)
In biological applications, a probability of 5% (0.05) is usually adopted as
the standard. The conventional criteria for statistical
significance range from 0.001 to 0.05.
What level of probability does one choose to
decide whether two groups differ as a result
of NON-CHANCE events or simply because
of CHANCE?
•most scientists have decided:
•If the difference between 2+ groups is so great
that it would happen by chance fewer
than 1 out of 20 times ("P" < 0.05), then the
groups differ significantly.
•That is, the null hypothesis (due to
chance/no difference in data) is rejected.
•If greater confidence in the results is
desired, scientists will choose probability
levels of less than 1 in 100 (P < .01) or 1 in
1000 (P < 0.001).

Looking statistical values up on the chi square
distribution table tells us the following:



the PROBABILITY (P) value read off the
table places our chi square value of 0.30
just past the .95 or 95% column (0.30 is smaller
than 0.35, so P is a little over 95%)
This value means that, if the null hypothesis is true,
a deviation at least this large between observed and
expected data would occur by random chance
more than 95% of the time.
In other words, our observed data are very close to
the 9:3:3:1 ratio, and there is no evidence of
gene linkage.
We therefore accept our null hypothesis.
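
For readers who want the probability itself rather than a table lookup, it can be computed directly for the fruit fly example. The sketch below is an assumption-laden illustration, not part of the original lesson: the function name `chi2_p_value_df3` is made up, and the closed-form expression it uses is valid only for 3 degrees of freedom.

```python
import math

def chi2_p_value_df3(chi2):
    """Probability of a chi square value at least this large, for 3 degrees
    of freedom only (closed form of the upper tail of the distribution)."""
    return math.erfc(math.sqrt(chi2 / 2)) + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2)

# Chi square value from the fruit fly cross (3 degrees of freedom).
p_value = chi2_p_value_df3(0.30)
print(f"p = {p_value:.3f}")

# Compare against the usual 5% significance level.
decision = "reject" if p_value < 0.05 else "accept"
print(f"We {decision} the null hypothesis.")
```

As a sanity check on the formula, plugging in the 3-degrees-of-freedom critical value of 7.815 returns a probability very close to 0.05.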

When reporting chi square data
use the following formula sentence….

With ___ degrees of freedom, my chi square value is ___,
which gives me a p value between ___% and ___%. I
therefore (accept or reject) my null hypothesis.
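
The formula sentence can also be filled in by a small program. The sketch below is illustrative only: the names `bracket_p` and `report` are made up, and only the 3-degrees-of-freedom row of the chi square table is copied in, so other rows would have to be added by hand.

```python
# Probability columns of the chi square table, left to right.
P_COLUMNS = [0.95, 0.90, 0.80, 0.70, 0.50, 0.30, 0.20, 0.10, 0.05, 0.01, 0.001]

# Critical values for 3 degrees of freedom, copied from the table.
ROW_DF3 = [0.35, 0.58, 1.01, 1.42, 2.37, 3.66, 4.64, 6.25, 7.82, 11.34, 16.27]

def bracket_p(chi2, row=ROW_DF3):
    """Return (lower_p, upper_p): the two table probabilities that the
    chi square value falls between."""
    upper_p = 1.0  # smaller than the first column means p is above 0.95
    for p, critical in zip(P_COLUMNS, row):
        if chi2 < critical:
            return p, upper_p
        upper_p = p
    return 0.0, upper_p  # larger than the last column means p is below 0.001

def report(chi2, df=3, alpha=0.05):
    """Fill in the reporting sentence for a chi square value."""
    lower_p, upper_p = bracket_p(chi2)
    decision = "reject" if upper_p <= alpha else "accept"
    return (f"With {df} degrees of freedom, my chi square value is {chi2:.2f}, "
            f"which gives me a p value between {lower_p:.1%} and {upper_p:.1%}. "
            f"I therefore {decision} my null hypothesis.")

print(report(0.30))
```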
PRACTICE PROBLEMS

What is the critical value at which we
would reject the null hypothesis for the
fruit fly example earlier?


For 3 degrees of freedom the value for
our chi square must be > 7.815 to accept
the alternative hypothesis and support
that gene linkage is occurring.
What if our chi square value was 8.0 with 4
degrees of freedom, do we accept or reject
the null hypothesis?

Accept, since 8.0 is less than the critical value of 9.49
for 4 degrees of freedom.
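
Both practice problems follow the same rule: compare the chi square value with the critical value in the p = 0.05 column for the right degrees of freedom. A minimal Python sketch of that rule is shown here; the dictionary and the function name `decide` are made up for illustration, and the critical values are copied from the 0.05 column of the chi square table.

```python
# Critical values from the p = 0.05 column, keyed by degrees of freedom.
CRITICAL_VALUES_P05 = {
    1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49, 5: 11.07,
    6: 12.59, 7: 14.07, 8: 15.51, 9: 16.92, 10: 18.31,
}

def decide(chi2, degrees_of_freedom):
    """Return 'reject' if the chi square value is greater than the critical
    value at p = 0.05, otherwise 'accept' (for the null hypothesis)."""
    critical = CRITICAL_VALUES_P05[degrees_of_freedom]
    return "reject" if chi2 > critical else "accept"

print(decide(0.30, 3))  # fruit fly cross
print(decide(8.0, 4))   # second practice problem
```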