Genetics Workshop Number Three

The Chi-Square
Welcome to your third Genetics Workshop. Here we will go slowly and
systematically through the steps needed to do chi-square analysis. This Workshop is
divided into three different problems and a summary of general ideas. The first one is
an analysis of the results from a monohybrid cross and I will walk you through it very
slowly. We will go over the fundamentals of chi-square work and reinforce what you
learn in the lesson. The second chi-square problem is another monohybrid problem. I
will move a little faster while showing you how to organize your math and that should
give you a better feel for how to actually do a chi-square. The last chi-square involves
a dihybrid cross, so it is a little bit harder, but it uses the same ideas as a simpler
chi-square problem, and you will see that it helps to set up a table in order to keep track of
the numbers. We will conclude this workshop with a short series of questions to help
you to understand that you can use chi-square analysis for more than simply testing
the ratios from crosses.
By the way, don't get chi-square mixed up with Punnett squares.
Punnett squares are a useful way to simulate how alleles come together in a mating
and they result in ratios of the different genotypes of offspring. Punnett squares
produce expected ratios of genotypes, which you can now easily transform into ratios
of phenotypes.
A chi-square is a statistical tool that helps us to decide if the observed ratio is close
enough to the expected ratio to be acceptable. Chi-square analysis can be used in any
area, not just genetics. Whenever you have to determine if an expected ratio fits an
observed ratio, you can use the chi-square.
Before we begin, pick up and print out a copy of your Chi-Square Worksheet. Fill it in
as we work together and, when you have finished each Part, check your answers with
mine. (The hyperlink for the answer sheet is at the end of this page.) There is a single
worksheet for this entire (three part) Workshop and a single answer sheet too. Just do
it a section at a time.
After this Workshop you will be well prepared to do the SAQs from the chi-square
lessons.
Chi-square of a monohybrid cross as a "walk through"
Mendel's data from one experiment was ...
P = smooth seeds crossed with wrinkled seeds
F1 = all smooth seeds (so smooth is dominant and wrinkled is recessive)
F2 = 5,474 smooth seeds and 1,850 wrinkled seeds
1. What ratio did he observe?
5474 / 1850 = 2.9589189 : 1 = 2.96 : 1
2. What ratio did he expect?
3:1
You should understand that the chi-square compares the NUMBER (not ratio)
observed to the NUMBER (not ratio) expected. You are given the observed numbers
and from that data you might guess what the ratio should be. You then use that
"guessed" ratio to calculate what the expected numbers would be.
Calculating the expected number is critical to doing the chi-square and many students
have trouble with that first step - they forget how to do it, use it backwards or don't do
it at all!
Let's work through this important step together so you will understand that logic.
You already know the number observed.
Smooth = 5474
Wrinkled = 1850
3. What is the total number of seeds?
7324
4. What number of wrinkled is expected?
7324 / 4 = 1831
5. What number of smooth is expected?
1831 X 3 = 5493 or 7324 X 3/4 = 5493
OK, you now have the expected numbers calculated from the expected ratio.
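If it helps to see steps 3 to 5 written out another way, here is a minimal Python sketch of the same arithmetic. The function name `expected_counts` is my own invention for illustration and is not part of the worksheet.

```python
def expected_counts(observed, ratio):
    """Split the observed total into expected numbers for a given ratio.

    observed: observed counts, one per class, e.g. [5474, 1850]
    ratio:    ratio parts in the same order, e.g. [3, 1] for 3 : 1
    """
    total = sum(observed)    # step 3: total number of seeds
    parts = sum(ratio)       # a 3 : 1 ratio has 3 + 1 = 4 parts
    # Each class is expected to receive its share of the total.
    return [total * r / parts for r in ratio]

# Mendel's F2 data, in the order smooth, wrinkled
observed = [5474, 1850]
expected = expected_counts(observed, [3, 1])
print(expected)   # [5493.0, 1831.0]
```

Notice that the expected numbers are built from the observed TOTAL, never from the individual observed counts.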
The best (easiest) way to COMPARE two values is to find their DIFFERENCE (by
SUBTRACTION).
6. What is the difference between observed and expected smooth?
5474 - 5493 = -19
7. What is the difference between observed and expected wrinkled?
1850 - 1831 = 19
For "statistical magnification" we INCREASE those differences by squaring them.
8. What is the square of the difference between the observed and expected smooth?
(-19)² = 361 or -19 X -19 = 361
9. What is the square of the difference between the observed and expected wrinkled?
19² = 361 or 19 X 19 = 361
These "square of the differences" are too large and must be "NORMALISED" by
dividing each by the number EXPECTED (NOT the number observed). This could be
called the "squared differences per expected".
10. What is the square of the difference between the observed and expected smooth,
divided by the expected number of smooth?
361 / 5493 = 0.06572 = 0.066
11. What is the square of the difference between the observed and expected wrinkled,
divided by the expected number of wrinkled?
361 / 1831 = 0.19716 = 0.197
Lastly, we add together these "squared differences per expected" to give us the
TOTAL "squared differences per expected".
12. What is the sum of the "squared differences per expected"?
0.066 + 0.197 = 0.263, so χ² = 0.263
Therefore, the chi-square for this experiment is χ² = 0.263.
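Steps 6 to 12 can also be sketched as a short Python function. This is only an illustration of the arithmetic we just did by hand; the name `chi_square` is an assumption of mine, not something from the worksheet.

```python
def chi_square(observed, expected):
    """Add up (O - E) squared, divided by E, over all classes."""
    total = 0.0
    for o, e in zip(observed, expected):
        difference = o - e           # steps 6-7: compare by subtraction
        squared = difference ** 2    # steps 8-9: square the difference
        total += squared / e         # steps 10-11: divide by the EXPECTED
    return total                     # step 12: the sum is the chi-square

observed = [5474, 1850]   # smooth, wrinkled
expected = [5493, 1831]   # from the 3 : 1 ratio
print(round(chi_square(observed, expected), 3))   # 0.263
```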
OK - so what?
Statisticians have developed chi-square tables, based upon the probabilities that a
particular chi-square value will come about purely by chance. There are two
"features" to consider.
A. Significance Level….
We (scientists) like to use the level of 5% as our significant "cut-off". Any chi-square
larger than the value from the 5% Table indicates an experiment in which the ratios
observed are so far off the ratios expected that we have to conclude that the ratios
expected are wrong!
B. Degrees of Freedom…
The more "classes" (categories) there are, the more terms are added together in the
chi-square, so the larger the chi-square can grow by chance alone. The acceptable limit
therefore rises with the number of classes. The "degrees of freedom" are one less than the
number of classes.
13. Name all the different classes in the earlier experiment.
Smooth and Wrinkled
14. How many degrees of freedom were in that experiment?
2-1=1
One degree of freedom.
Here's a portion of the Chi Square Significance Table.
Degrees of Freedom    5% Significance Level
1                     3.84
2                     5.99
3                     7.81
4                     9.49
15. Is the chi-square you calculated within the boundary of "the possible"?
Yes! We calculated a χ² = 0.263. With one degree of freedom we could have a
chi-square up to 3.84 before we would become suspicious that the observed data was in a
ratio too far removed from the ratio we tested.
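The decision in step 15 can be sketched in Python as a simple table lookup. The dictionary holds only the four values from the portion of the table shown in this Workshop, and the names `CRITICAL_5_PERCENT` and `fits_ratio` are my own, chosen for illustration.

```python
# 5% significance levels, copied from the portion of the table in this Workshop.
CRITICAL_5_PERCENT = {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49}

def fits_ratio(chi_square_value, number_of_classes):
    """Return True if the chi-square is within the 5% limit."""
    degrees_of_freedom = number_of_classes - 1   # one less than the classes
    limit = CRITICAL_5_PERCENT[degrees_of_freedom]
    return chi_square_value <= limit

# Two classes (smooth, wrinkled) give one degree of freedom, limit 3.84.
print(fits_ratio(0.263, 2))   # True
```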
Chi-square of a monohybrid cross as a quick table
When doing a chi-square it helps to set it up as a table and to understand that all we
have been doing is represented by the equation χ² = Σ [(O - E)² / E]
Consider these results among the F2s
4,400 yellow seeds
1,624 green seeds
First, set up a table like the one below
Phenotypes    O       E       O-E     (O-E)²    (O-E)²/E
Yellow
Green
Total
Second, enter the data. Remember, data is what is observed. So data goes in the
"observed" (O) column.
Phenotypes    O       E       O-E     (O-E)²    (O-E)²/E
Yellow        4400
Green         1624
Total         6024
Next you fill in the "expected" (E) column. Using the total as a starting point divide
that number into the two sets of data that would produce the 3 to 1 ratio you expect.
Note that it might be easier to do the 1 (green) of the 3 :1 ratio first. However, if you
are comfortable with fractions it shouldn't be too hard to do them in any order.
Phenotypes    O       E                    O-E     (O-E)²    (O-E)²/E
Yellow        4400    6024 X 3/4 = 4518
Green         1624    6024 X 1/4 = 1506
Total         6024    6024
Notice that the total expected is the same as the total observed. If they don't add up to
the same number you have made an error in the math.
Now fill in the rest of the table. It's a lot of work but, now that you have it all
organized, it should be just a matter of using your calculator correctly. There is no
reason to "total" columns O-E or (O-E)² so leave them blank. However, it is very
important to complete the "total" in the last column, (O-E)²/E, because that is the
chi-square!
Fill in the rest of the table.
Phenotypes    O       E                    O-E                   (O-E)²              (O-E)²/E
Yellow        4400    6024 X 3/4 = 4518    4400 - 4518 = -118    (-118)² = 13,924    13924 / 4518 = 3.08
Green         1624    6024 X 1/4 = 1506    1624 - 1506 = 118     118² = 13,924       13924 / 1506 = 9.24
Total         6024    6024                                                           12.32
Is the chi-square you calculated here within the boundary of "the possible"?
(To answer that, first go back to the Chi Square Significance Table you saw earlier.
Then page back down to here.)
NO! χ² = 12.32 but, with one degree of freedom we cannot accept any ratio that gives
us a chi-square larger than 3.84.
Do we accept that these results are within acceptable range of a 3 : 1 ratio?
No! We must reject the 3 : 1 ratio. This data is far off the 3 : 1 ratio.
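If you like, the whole worksheet table can be generated row by row. The sketch below is one possible way to do it in Python; `worksheet_rows` is a hypothetical helper of mine. Because it carries every decimal place, its total comes out at about 12.33 rather than the 12.32 we got by cutting each term to two decimals by hand. The conclusion is the same either way.

```python
def worksheet_rows(phenotypes, observed, ratio):
    """Build one row per class: (name, O, E, O-E, (O-E)^2, (O-E)^2/E)."""
    total = sum(observed)
    parts = sum(ratio)
    rows = []
    for name, o, r in zip(phenotypes, observed, ratio):
        e = total * r / parts              # expected number for this class
        diff = o - e                       # O - E
        rows.append((name, o, e, diff, diff ** 2, diff ** 2 / e))
    return rows

rows = worksheet_rows(["Yellow", "Green"], [4400, 1624], [3, 1])
for name, o, e, diff, sq, term in rows:
    print(f"{name:8} {o:6} {e:8.0f} {diff:6.0f} {sq:8.0f} {term:7.3f}")

# Only the last column is totalled; that total is the chi-square.
chi_square = sum(row[-1] for row in rows)
print(f"chi-square = {chi_square:.3f}")
```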
Chi-square of a dihybrid cross as a quick table
Consider these results from a dihybrid cross
30 red tall
65 white tall
83 red short
206 white short
Before we dive into the chi-square we have to first determine what ratio we will test
and which category (class) fits with each part of the ratio.
Based upon these numbers, which phenotypes are dominant and recessive for the two
loci? (Remember, these are the F2s from a dihybrid cross so they should be close to a
specific ratio that you learned earlier. And you also learned which traits end up in
each part of that ratio.)
Also, as best you can, assign genotypes to these phenotypes.
A dihybrid cross should produce a 9 : 3 : 3 :1 ratio in the F2s and a simple look at the
numbers will give you an idea of which belongs to each category.
The biggest group is the white shorts so they must be the doubly dominant class. In
other words, white shorts can be assigned the genotype W-S-.
On the opposite end of the ratio, the least represented group, would be the doubly
recessive so the red talls are the "1" in the 9 : 3 : 3 :1 ratio and have the genotype
wwss.
You can deduce the other two classes, making up the "3" in the ratio. The white talls
have the genotype W-ss and the red shorts are wwS-.
Now that you have identified each category and assigned it to the ratio, we can begin
the chi-square to determine if it fits.
Let's begin by first arranging our computation table. It will be twice the size of the
previous table. It might help to arrange them in the table in a descending order to
represent the 9 : 3 : 3 : 1 ratio. Draw the appropriate table including the observed
numbers.
Phenotypes                O      E      O-E    (O-E)²    (O-E)²/E
White and short (W-S-)    206
Red and short (wwS-)      83
White and tall (W-ss)     65
Red and tall (wwss)       30
Total                     384
Great! We are ready to start. First determine the "expecteds". It might be easier to do
the "1" part of the ratio first and work up the table: the ratio has 9 + 3 + 3 + 1 = 16
parts, so one part is 384 / 16 = 24. Regardless, take your time and
calculate what the expected numbers should be and fill in the "E" column.
Phenotypes                O      E               O-E    (O-E)²    (O-E)²/E
White and short (W-S-)    206    24 X 9 = 216
Red and short (wwS-)      83     24 X 3 = 72
White and tall (W-ss)     65     24 X 3 = 72
Red and tall (wwss)       30     24 X 1 = 24
Total                     384    384
I hope you were able to work through that and get these numbers too. Did you check
your math by adding up the column to make sure the E column equals the O column?
Now it is time to fill in the rest of the table and calculate the chi-square.
Go ahead and complete the calculations before paging down.
Phenotypes                O      E               O-E                (O-E)²          (O-E)²/E
White and short (W-S-)    206    24 X 9 = 216    206 - 216 = -10    (-10)² = 100    100 / 216 = 0.463
Red and short (wwS-)      83     24 X 3 = 72     83 - 72 = 11       11² = 121       121 / 72 = 1.681
White and tall (W-ss)     65     24 X 3 = 72     65 - 72 = -7       (-7)² = 49      49 / 72 = 0.681
Red and tall (wwss)       30     24 X 1 = 24     30 - 24 = 6        6² = 36         36 / 24 = 1.500
Total                     384    384                                                4.325
Did you get 4.325 for the answer?
If you didn't, look over my answer and figure out where you went wrong - and try to
learn from your error so you can do it right next time. [A common mistake occurs in
the last column - many students divide by either the observed or by some other
expected number. Remember to always divide by the expected number for that
category.]
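To see how much that common mistake matters, here is a small Python sketch that works the dihybrid numbers both ways. The variable names are mine, and each term is rounded to three decimals to match the worksheet (without that rounding the correct total is about 4.324).

```python
observed = [206, 83, 65, 30]     # W-S-, wwS-, W-ss, wwss
ratio = [9, 3, 3, 1]

one_part = sum(observed) / sum(ratio)        # 384 / 16 = 24
expected = [one_part * r for r in ratio]     # 216, 72, 72, 24

# Correct: divide each squared difference by its own EXPECTED number,
# rounding each term to three decimals as on the worksheet.
terms = [round((o - e) ** 2 / e, 3) for o, e in zip(observed, expected)]
correct = round(sum(terms), 3)

# The common mistake: dividing by the OBSERVED number instead.
mistaken = sum((o - e) ** 2 / o for o, e in zip(observed, expected))

print(terms)      # [0.463, 1.681, 0.681, 1.5]
print(correct)    # 4.325
print(round(mistaken, 3))
```

The two totals are clearly different, so a slip in the last column changes the chi-square you carry forward to the significance table.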
OK, you have calculated the chi-square and it is now time to do something with it.
Here's a portion of the Chi Square Significance Table.
Degrees of Freedom    5% Significance Level
1                     3.84
2                     5.99
3                     7.81
4                     9.49
How many "classes" (categories, groups) are in this experiment?
Four (Red and tall, White and tall, Red and short, White and short)
Some students get through the difficult chi-square but then make a simple mistake at
this point. Some get confused and pick a number out of the ratio and say there are nine
classes! Or three. Or some other number and I cannot figure out where it came from.
So, just to keep yourself thinking clearly, it is smart to list the categories.
Now, how many degrees of freedom are in this experiment?
Three (4 - 1)
Does the 9 : 3 : 3 : 1 ratio fit the data?
Yes! With three degrees of freedom you can have a chi-square as large as 7.81 before
we would be beyond our 5% significance.
Notice that if you had been so foolish as to stick with the one degree of freedom (that
we were using with the monohybrid crosses) you would have decided that the
chi-square was too large and would have (WRONGLY) rejected the ratio!
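Listing the classes and letting the list decide the degrees of freedom is a good way to avoid that slip. Here is a short Python sketch of the idea, using the four table values from this Workshop; the variable names are assumptions of mine.

```python
CRITICAL_5_PERCENT = {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49}

classes = ["White and short", "Red and short", "White and tall", "Red and tall"]
chi_square_value = 4.325

correct_df = len(classes) - 1    # 4 - 1 = 3
wrong_df = 1                     # carried over from the monohybrid crosses

for df in (correct_df, wrong_df):
    limit = CRITICAL_5_PERCENT[df]
    verdict = "accept" if chi_square_value <= limit else "reject"
    print(f"df = {df}: limit {limit}, {verdict} the 9 : 3 : 3 : 1 ratio")
```

With three degrees of freedom the ratio is accepted; with the wrong single degree of freedom the very same chi-square would be rejected.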
The chi-square can be used whenever there is an expected ratio
What is the expected ratio of boys to girls?
1:1
How many degrees of freedom are there in that example?
There are two categories (classes) so there is one degree of freedom.
There are in vitro fertilization (IVF) methods that can increase the chances that a girl
will be born or a boy will be born. You can use the chi-square to determine if a
particular IVF clinic is really increasing the chances of having a boy or girl. You
could look at the number of girls and boys born to women who wanted girls or boys
and calculate the chi-square.
If a particular IVF clinic can, indeed, increase the odds, would you expect the
chi-square to be above or below the value of 3.84 (which I got from the table above)?
If the IVF clinic can change the ratio from the expected 1 : 1 then the chi-square,
calculated on the number of daughters or sons born, would be greater than 3.84.
I hope you understand that here we are "hoping" that the ratio will NOT be 1 : 1. (In
point of fact, scientists aren't supposed to "hope" for results but the fact remains that
they often hope a lot! )
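Here is how such a test might look in Python. Please note that the birth counts below are INVENTED purely for illustration; they are not data from any clinic. The function name `chi_square_1_to_1` is also my own.

```python
def chi_square_1_to_1(girls, boys):
    """Chi-square for two classes tested against a 1 : 1 ratio."""
    expected = (girls + boys) / 2    # half the births expected for each sex
    return ((girls - expected) ** 2 + (boys - expected) ** 2) / expected

LIMIT_1_DF = 3.84   # 5% level, one degree of freedom

# INVENTED counts, for illustration only: births to women who wanted girls.
print(chi_square_1_to_1(70, 30))   # 16.0, above 3.84: the ratio is not 1 : 1
print(chi_square_1_to_1(52, 48))   # 0.16, below 3.84: consistent with 1 : 1
```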
Let's consider another situation.
You are the district manager of three fast food restaurants and you are looking over
the revenues. You see that store A made $1,000,000, store B made $3,000,000 and
store C brought in $5,000,000. You wonder if that is just a statistical blip. How
would you use the chi-square to test the idea that these stores are different - beyond
luck? (Don't do the chi-square - just tell me how you would set it up.)
You would "expect" a 1 : 1 : 1 ratio in the revenues if they were all the same. In other
words, the total revenues of $9,000,000 would be distributed evenly. You would
expect ...
Store A = $3,000,000
Store B = $3,000,000
Store C = $3,000,000
You could now find, for each store, the difference between expected and observed
revenues, square the difference, divide that by the expected and then add all three
together to get a chi-square value.
Suppose the manager of store A complains that you are not being fair because you
haven't taken into account the differences in local population around each store. His
store serves a smaller community. So, you go to the population records and discover
that store A serves a population that is only a quarter the size of the communities
served by stores B and C. Can you redo the chi-square? How?
The information about the populations tells you that there are four times as many
likely customers for stores B and C as A. You can express that as a ratio of 1 : 4 : 4. If
revenues are dependent upon population you would expect ("expect" is the magic
word that means "here comes a chi-square")
Store A = $1,000,000
Store B = $4,000,000
Store C = $4,000,000
The observed revenues were
Store A = $1,000,000
Store B = $3,000,000
Store C = $5,000,000
Now you would do another chi-square to determine if these numbers fit a 1 : 4 : 4
ratio (thus showing that revenues are probably dependent upon population).
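Setting up the "expecteds" for both store hypotheses follows the same pattern each time, as this Python sketch suggests. In keeping with the question, it only sets up the expected revenues and does not carry out the chi-square. The helper `expected_revenues` is a hypothetical name of mine.

```python
def expected_revenues(observed, ratio):
    """Share the total observed revenue out according to a ratio."""
    total = sum(observed.values())
    parts = sum(ratio.values())
    return {store: total * ratio[store] / parts for store in observed}

observed = {"A": 1_000_000, "B": 3_000_000, "C": 5_000_000}

# Hypothesis 1: all three stores are alike, 1 : 1 : 1
equal_share = expected_revenues(observed, {"A": 1, "B": 1, "C": 1})
# Hypothesis 2: revenue follows population, 1 : 4 : 4
by_population = expected_revenues(observed, {"A": 1, "B": 4, "C": 4})

print(equal_share)
print(by_population)
```

Only the ratio changes between the two hypotheses; the observed numbers and their total stay exactly the same.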
And finally, how many degrees of freedom are there in this three-store problem?
There are three categories (Stores A, B and C) so there are two degrees of freedom.
These last few puzzles, about sex ratios and revenue ratios, are to show you that the
chi-square has many uses and that all you have to do is identify how to think about the
ratios, expectations and outcomes.
If you haven't done so already, pick up a copy of the answers to The Chi-Square and
compare it to your own Worksheet. Make sure you understand it.