Pop Gen1 Breeding Bunnies No answers

TOPIC: Population Genetics I: Natural Selection and Allele Frequencies
TUTOR GUIDE
MODULE CONTENT: This module contains simple exercises for biology majors
to begin to explore a central tenet of evolution, the influence of natural selection
on allele frequencies. It also allows students to use calculations of expected and
observed allele frequencies to determine if populations are in Hardy-Weinberg
Equilibrium.
TABLE OF CONTENTS
Alignment to HHMI Competencies for Entering Medical Students
Outline of concepts covered, module activities, and implementation
Module: Worksheet for completion in class
Pre-laboratory Exercises (mandatory)
Suggested Questions for Assessment
Guidelines for Implementation
Contact Information for Module Developers
Alignment to HHMI Competencies for Entering Medical Students:
Competency
E1. Apply quantitative reasoning and appropriate mathematics to describe or
explain phenomena in the natural world.
E8. Demonstrate an understanding of how the organizing principle of evolution
by natural selection explains the diversity of life on earth.
Learning Objective
E1.1. Demonstrate quantitative numeracy and facility with the language of
mathematics.
Activity
1,5,6
E1.2. Interpret data sets and communicate those interpretations using visual
and other appropriate tools.
E1.3. Make statistical inferences from data sets.
E8.2. Explain how evolutionary mechanisms contribute to change in gene
frequencies in populations.
1,2,3,4
1
5,6
2,3,4,6
Mathematical/Statistical Concepts covered:
- probability
- chi-square test
In class activities:
- calculating allele frequencies, Hardy-Weinberg Equilibrium
- graphing
- using chi-square test
Components of module:
- preparatory assignment to complete and turn in as homework before class
- in class worksheet
- suggested assessment questions
- guidelines for implementation
Estimated time to complete in class worksheet
- 60 minutes
Targeted students:
- first-year biology majors in an introductory biology course covering evolution
Quantitative Skills Required:
- Basic arithmetic
- Logical reasoning
- Interpreting data from tables
- Graph/Data Interpretation
WORKSHEET
Objective: Biological evolution is a change in the frequency of alleles in a
population over a period of time. The type of evolutionary mechanisms operating,
and the initial frequency of an allele in the population, can influence the amount
of change in allele frequencies.
In this activity, you will examine natural selection in a small population of wild
rabbits. Breeders of rabbits have long been familiar with a variety of genetic traits
that affect the survival of rabbits in the wild, as well as in breeding populations.
One such trait is the trait for furless rabbits (naked bunnies). This trait was first
discovered in England by W.E. Castle in 1933. The furless rabbit is rarely found
in the wild because the cold English winters are a strong selective force against
it.
Note: In this activity, the dominant allele for normal fur is represented by F and
the recessive allele for no fur is represented by f. Bunnies that inherit two F
alleles or one F and one f allele have fur, while bunnies that inherit two f alleles
have no fur. In this worksheet you will review some important principles and
terms applicable to genetics. This worksheet will ask questions about both
Mendelian and non-Mendelian genetics (e.g., linkage). You will also use the
Chi-square statistical test to test for deviations from Hardy-Weinberg equilibrium.
Materials per group
- 1 bag of beans (Red and White)
- 4 cups
Procedure:
1. The red beans represent the allele for fur (F), and the white beans represent
the allele for no fur (f). The bag represents the habitat where the rabbits live,
and randomly mate.
2. Label one cup FF for the homozygous dominant genotype. Label a second
cup Ff for the heterozygous condition. Label the third cup ff for those rabbits
with the homozygous recessive genotype.
3. Use the beans (alleles) in your bag (habitat) to count and record your starting
allele frequencies. (Note: don’t trust the numbers written on the bags. Count
them yourselves.) Record the number of red beans (F) in the “Generation 0”
row in the column labeled "Number of F Alleles;" white beans in the column
"Number of f Alleles." Calculate the frequency of each allele by dividing the
number of each allele by the total number of alleles. Then, place all beans
(alleles) back in the bag and shake up (mate) the rabbits.
4. Without looking at the beans, select two at a time, and record the results on
the data table next to "Generation 1." For instance, if you draw one red and
one white bean, place a mark in the chart under "Number of Ff individuals."
Continue drawing pairs of beans and recording the results in your chart until
all beans have been selected and sorted. Place the diploid "rabbits" into the
appropriate cup: FF, Ff, or ff. (Remember that the total number of individuals
will be half the total number of beans because each rabbit requires two
alleles.)
5. The ff bunnies are born furless. They cannot survive to adulthood, so they die
before they get a chance to mate. Therefore, take all the beans out of the ff
cup and place them aside (but remember where you put them! All beans must
be returned to the TA to receive a grade for the lab).
6. Now count the F and f alleles (beans) that remain in each of the surviving
bunny cups (FF and Ff); i.e., you are counting the alleles of the surviving
bunnies. These alleles will contribute to the next generation. Enter the
number of each allele and their frequencies in the right side of the table.
7. Place the alleles of the surviving rabbits (which have grown, survived and
reached reproductive age) back into the bag and mate them again to get the
next generation.
8. Repeat steps four through seven to obtain generations two through five.
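The procedure above can also be mimicked on a computer, which may help when checking a group's table for plausibility. The following is a minimal sketch rather than part of the module: the function names `simulate_generation`, `run_experiment`, and `frequency` are hypothetical, and it assumes that a leftover bean (when the bag holds an odd number) is simply set aside.

```python
import random


def simulate_generation(n_F, n_f, rng):
    """Steps 4-6: draw beans in pairs, remove ff rabbits, count surviving alleles."""
    beans = ["F"] * n_F + ["f"] * n_f
    rng.shuffle(beans)  # shaking the bag
    counts = {"FF": 0, "Ff": 0, "ff": 0}
    # Assumption: with an odd number of beans, the last bean is set aside.
    for i in range(0, len(beans) - 1, 2):
        pair = "".join(sorted(beans[i:i + 2]))  # "FF", "Ff" or "ff"
        counts[pair] += 1
    # ff rabbits die before mating, so only FF and Ff cups contribute alleles.
    surviving_F = 2 * counts["FF"] + counts["Ff"]
    surviving_f = counts["Ff"]
    return counts, surviving_F, surviving_f


def run_experiment(n_F, n_f, generations=5, seed=None):
    """Repeat the drawing for several generations; one dict per table row."""
    rng = random.Random(seed)
    rows = [{"gen": 0, "counts": None, "F": n_F, "f": n_f}]
    for gen in range(1, generations + 1):
        if n_F + n_f < 2:  # not enough beans left to form a rabbit
            break
        counts, n_F, n_f = simulate_generation(n_F, n_f, rng)
        rows.append({"gen": gen, "counts": counts, "F": n_F, "f": n_f})
    return rows


def frequency(row, allele):
    """Allele frequency among the alleles recorded in one table row."""
    total = row["F"] + row["f"]
    return row[allele] / total if total else float("nan")


for row in run_experiment(30, 20, seed=1):
    print(row["gen"], row["counts"], row["F"], row["f"],
          round(frequency(row, "F"), 2), round(frequency(row, "f"), 2))
```

Running it with different seeds or starting counts gives a feel for how much groups with the same starting bag can differ by chance alone.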
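Gen. | Parents                                    | Alleles contributing to next generation
     | Number of FF | Number of Ff | Number of ff | Number of | Number of | Frequency | Frequency
     | individuals  | individuals  | individuals  | F alleles | f alleles | of F      | of f
0    |              |              |              |           |           |           |
1    |              |              |              |           |           |           |
2    |              |              |              |           |           |           |
3    |              |              |              |           |           |           |
4    |              |              |              |           |           |           |
5    |              |              |              |           |           |           |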
Discussion questions:
1. Prepare a graph with allele frequency as a function of time (generation
number) and graph your allele frequencies. (Allele frequency is plotted along
the y-axis and time [generation] is plotted on the x-axis.) Frequency should be
represented in decimals. Plot all frequencies on one graph. Use a solid line
for F and a dashed line for f. Draw the graph of the change in F on the whiteboard for your table. Clearly label the axes and note the starting
frequency of F (in Generation 0) for your group.
Compare the frequencies of the alleles in the 5th generation to the starting
generation. Would you say this particular bunny population has evolved (i.e.
allele frequency has changed more than 1%) over 5 generations? Why or why
not?
2. Look at the graphs of the frequency of F over time for other groups that had
the same starting allele frequencies as you. In generation 5, are the allele
frequencies the same across groups? Explain why this might be.
3. Now compare the rates of change in the frequency of F across groups with
different starting allele frequencies. During the first 3 generations, does F
change at the same rate for each group? Describe what you see and explain
why this might be.
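As a point of comparison for the graphs in questions 1-3, the expected change can be worked out for an idealized case. This is a sketch under assumptions that a bag of beans only approximates: a very large, randomly mating population in which every ff individual dies before mating. The symbols $q_t$ (the frequency of f recorded in the generation $t$ row of the data table) and $p_t = 1 - q_t$ are introduced here for illustration only.

```latex
% Genotype frequencies among newborns of generation t+1 (random mating):
%   FF: p_t^2,   Ff: 2 p_t q_t,   ff: q_t^2
% All ff individuals are removed, so among the survivors:
\[
q_{t+1} = \frac{p_t q_t}{p_t^2 + 2 p_t q_t}
        = \frac{q_t}{p_t + 2 q_t}
        = \frac{q_t}{1 + q_t}
\]
% Iterating from the starting frequency q_0:
\[
q_t = \frac{q_0}{1 + t\,q_0}
\]
% Example: q_0 = 0.5 gives q_1 = 1/3 and q_5 = 1/7 \approx 0.14;
%          q_0 = 0.2 gives q_1 = 1/6 and q_5 = 0.1.
```

Under these assumptions f declines fastest when it is common and more slowly as it becomes rare, because rare f alleles are mostly carried by Ff rabbits, which survive. Results from small bags of beans will scatter around this expectation by chance.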
The Hardy-Weinberg (HW) Principle
Evolution is the change in allele frequency over time. In the absence of evolution,
allele frequencies would stay the same in a population from generation to
generation. The Hardy-Weinberg principle (HW) allows us to predict the
genotype frequencies in the next generation of a population from the allele
frequencies, provided that evolution is not occurring. In this way, HW provides a
null hypothesis for evolution. If a population’s genotypic frequencies can be
predicted by HW, then it is in HW equilibrium and evolution is not occurring. If,
however, a population is not in HW equilibrium then evolution is occurring.
HW principle:
- p = frequency of dominant allele (you have already calculated these values)
- q = frequency of recessive allele (you have already calculated these values)
- $p^2$ = predicted genotype frequency of homozygous dominant individuals
(e.g., AA)
- $2pq$ = predicted genotype frequency of heterozygous individuals (e.g., Aa)
- $q^2$ = predicted genotype frequency of homozygous recessive individuals
(e.g., aa)
$p^2 + 2pq + q^2 = 1$
4. Calculate the expected (predicted) Hardy-Weinberg genotypic frequencies in
the starting generation for all three possible genotypes. (p = frequency of
allele F, q = frequency of allele f. Hardy-Weinberg expected genotype
frequencies are FF = $p^2$, Ff = $2pq$, ff = $q^2$.)
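The calculation in question 4 can be sketched in a few lines of code. This is an illustration only: the function name `hardy_weinberg` is hypothetical, and the value p = 0.6 is an example, not a result from any group's beans.

```python
def hardy_weinberg(p):
    """Expected genotype frequencies at a two-allele locus, given freq(F) = p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p  # the two allele frequencies sum to one
    return {"FF": p * p, "Ff": 2 * p * q, "ff": q * q}


freqs = hardy_weinberg(0.6)
for genotype, value in freqs.items():
    print(genotype, round(value, 2))
```

The three values always sum to one, which is a quick check on a hand calculation.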
5. Chi-square (χ2) test:
a) Now calculate the observed genotype frequencies in the last generation
below (i.e., what percentage of diploid individuals were FF, Ff, and ff?)
b) Now you need to compare the observed genotype frequencies to the
expected HW frequencies. To do this, we will use a Chi-square test. Chi-square
provides a statistical test of whether the observed genotype frequencies
differ from the expected HW genotype frequencies. First, use the
expected frequencies you calculated in question 4 and the observed number
of bunnies you had in generation 5 to figure out how many bunnies of each
genotype you expected to see in generation 5.
Expected # of FF bunnies = $p^2$ × (total # of adult bunnies in generation 5)
Expected # of Ff bunnies = $2pq$ × (total # of adult bunnies in generation 5)
Expected # of ff bunnies = $q^2$ × (total # of adult bunnies in generation 5)
For Chi-square analysis, you will need to enter count data in the table below. For
example, if you found an expected allele frequency of 0.60 or 60% for allele F,
and you had a total of 50 adult bunnies in generation 5, your expected number of
FF bunnies in generation 5 would be 18, because $0.60^2 \times 50 = 18$. Do this for
each row of the “Expected” column. Next, fill out the “Observed” column with the
actual genotype counts you saw in that final generation of bunnies: the actual
numbers of FF, Ff and ff bunnies that were produced and survived to adulthood
in generation 5. Note that the sum of your “Expected” column should be the
same as the sum of your “Observed” column; if it isn’t, something is probably
wrong with how you filled in your “Expected” column. Use these calculated count
data to solve the equation $(O-E)^2/E$ for each row, and sum those values to
produce your Chi-square value.
     | Observed (O) | Expected (E) | $(O-E)^2/E$
FF   |              |              |
Ff   |              |              |
ff   |              |              |
$\chi^2$ =
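The arithmetic behind the table can be sketched as follows. The observed counts here are made up for illustration, p = 0.6 and 50 adult bunnies are the example values from the text above, and the function names `expected_counts` and `chi_square` are hypothetical.

```python
def expected_counts(p, n_adults):
    """Expected numbers of FF, Ff and ff bunnies under Hardy-Weinberg."""
    q = 1.0 - p
    return {"FF": p * p * n_adults,
            "Ff": 2 * p * q * n_adults,
            "ff": q * q * n_adults}


def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over genotypes; assumes every expected count is > 0."""
    return sum((observed[g] - expected[g]) ** 2 / expected[g] for g in expected)


observed = {"FF": 24, "Ff": 26, "ff": 0}   # hypothetical generation 5 counts
expected = expected_counts(0.6, 50)        # 18 FF, 24 Ff, 8 ff
stat = chi_square(observed, expected)
print({g: round(e, 1) for g, e in expected.items()}, round(stat, 2))
```

Both columns sum to 50 here, matching the check described in the text. A genotype with an expected count of zero would make the division undefined, so this sketch assumes both alleles were present at the start.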
c) Using the Chi-square table below, with 2 degrees of freedom (note: think
about why you are using 2 degrees of freedom), what is the probability that
you would observe the above outcome by random chance?
Degrees of | Probability (P)
Freedom    | .99     .80    .50   .20   .10   .05   .01    .001
1          | .00016  .064   .46   1.6   2.7   3.8   6.6    10.8
2          | .020    .45    1.4   3.2   4.6   6.0   9.2    13.8
3          | .12     1.0    2.4   4.6   6.3   7.8   11.3   16.3
4          | .30     1.6    3.4   6.0   7.8   9.5   13.3   18.5
d) Remember that we learned last time that results are significant if the
probability (p-value) is less than 0.05. If the null hypothesis is that there was
no effect of selection on genotype frequencies, do you reject or fail to reject
your null hypothesis? In other words, did natural selection significantly alter
genotype frequencies in this scenario? Explain.
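A printed table only brackets the probability between two columns. For 1 or 2 degrees of freedom the upper-tail probability has a simple closed form, so it can be computed directly. The sketch below uses only Python's standard `math` module; the function name `chi_square_p_value` is hypothetical, and other degrees of freedom are deliberately left out.

```python
import math


def chi_square_p_value(stat, df):
    """Probability of a chi-square value at least as large as stat (df = 1 or 2)."""
    if stat < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    if df == 1:
        # Chi-square with 1 d.f. is the square of a standard normal variable.
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        # Chi-square with 2 d.f. is exponential with mean 2.
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


print(round(chi_square_p_value(5.99, 2), 3))
print(round(chi_square_p_value(3.84, 1), 3))
```

Both printed values are close to 0.05, in line with the .05 column of the table.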
MODULE FEEDBACK - Each year we work to improve the modules in the active
learning "discussion" sections. Please answer the following question with regard
to this module on this sheet and turn in your answer to the TA. You can do this
anonymously if you like by turning in this sheet separately from your module
answers.
How helpful was this module in helping you understand a fundamental
concept in population genetics?
A = Extremely helpful
B = Very helpful
C = Moderately helpful
D = A little bit helpful
E = Not helpful at all
Module Rating ____________
Thank you!
Pre-module Exercise: Population Genetics
Parts I and II of this assignment should be completed and turned in at the
beginning of class on the day the module is implemented.
The following questions are a review of basic probability. Most elementary texts
in mathematics will contain this material.
Part I. Probability Review
Here, we provide a brief review of how to calculate a probability in a simple
system, such as a gumball machine. Consider a bubblegum machine with four
colors of bubblegum balls: red, green, blue, and yellow.
We begin with some definitions:
Sample space: all possible outcomes of (in this case) choosing a gumball from
the gumball machine
Experiment: action whose possible outcomes can be recorded, such as
choosing a gumball from a gumball machine
Outcome: one possible output of an experiment, such as choosing a red
gumball from the gumball machine
Event: a subset of the sample space
With these definitions in mind, try the following problems
1. What is the sample space of choosing a gumball from the gumball machine if
we know that there are 4 red gumballs, 3 green gumballs, 2 blue gumballs and 3
yellow gumballs?
2. What is the sample space of choosing two gumballs at a time in the above
scenario?
3. An example of an event of the experiment is choosing a red gumball. The
probability of choosing a red gumball is the number of red gumballs divided by
the total number of gumballs. For the sample space described in problem 1:
a. What is the probability of randomly choosing a red gumball (give your
answer to the nearest hundredth decimal place)?
b. What is the probability of randomly drawing a yellow
gumball?
4. The probability of drawing a gumball of a given color at random from this
population of gumballs is equal to the frequency of the ball color in the
population.
What is the probability of green colored balls in this group of balls?
5. Suppose the red and blue gumballs are removed, and there are only green
and yellow gumballs remaining in the gumball machine;
a. What is the new sample space?
b. What is the probability of randomly choosing a blue gumball?
c. What is the frequency of yellow gumballs in the new population of
gumballs?
6. Assume that 25% of the gumballs in a machine are red and that this
machine dispenses two gumballs for a quarter. What is the probability that the
two gumballs the machine gives you when you put your quarter in will both be
red?
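The ideas in Part I can be sketched in code using exact fractions. The gumball counts below are hypothetical and deliberately differ from those in the problems, and the function names `color_probability` and `both_same_color` are made up for this illustration. The `replace` flag is an assumption about the machine: `True` treats the two draws as independent, while `False` removes the first gumball before the second draw.

```python
from fractions import Fraction


def color_probability(counts, color):
    """Probability of drawing one gumball of the given color = its frequency."""
    total = sum(counts.values())
    return Fraction(counts.get(color, 0), total)


def both_same_color(counts, color, replace=True):
    """Probability that two draws both give the chosen color."""
    total = sum(counts.values())
    k = counts.get(color, 0)
    first = Fraction(k, total)
    if replace:
        return first * first          # independent draws
    if k < 2:
        return Fraction(0)            # cannot draw two of that color
    return first * Fraction(k - 1, total - 1)


counts = {"red": 5, "green": 1, "blue": 2, "yellow": 2}  # hypothetical machine
print(color_probability(counts, "red"))
print(both_same_color(counts, "red"))
print(both_same_color(counts, "red", replace=False))
```

In a very large machine the two versions give nearly the same answer, which is why the frequency of a color can be squared when only percentages are known.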
Part II: Allele Frequencies and Hardy Weinberg Equilibrium
The Hardy-Weinberg Principle was developed to describe the genetic
characteristics of populations with no evolutionary forces acting on them (mating
is random, there is no selection or migration, the population size is infinite and
no new mutations occur). As a result, for populations in Hardy-Weinberg
equilibrium the frequencies of alleles and genotypes do not change over time
(across generations). This also means that for any locus with two alleles, A and a,
if we know their frequencies in the population (freq(A) = p, freq(a) = q) the diploid
genotype frequencies can be predicted from the allele frequencies as follows:
freq(AA) = $p^2$ for the AA homozygotes in the population, freq(aa) = $q^2$ for the
aa homozygotes, and freq(Aa) = $2pq$ for the heterozygotes. So, consider a gene
with two alleles A and a, with frequency p = 0.7 and q = 0.3, respectively.
7. Using this information answer the following questions
a. Assume the population we are studying is a species of plant and we
wish to determine allele frequencies. If we randomly sampled 100
individuals from the population, how many would we expect to have the A
allele?
b. Fill in the following table of expected genotype frequencies in this
population:
Genotypes          | AA | Aa | aa
Expected Frequency |    |    |
8. Now assume that we actually obtain the diploid genotype from 100 members
of the population and find them in the following numbers:
AA: 20
Aa: 70
aa: 10
a. What are the observed diploid genotype frequencies?
AA:
Aa:
aa:
b. Do you think the observed genotype frequencies deviate significantly from
those expected under Hardy-Weinberg? If so, what might be the cause? Explain
your answer.
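When only genotype counts are available, allele frequencies can be recovered by counting alleles: each homozygote carries two copies of its allele and each heterozygote carries one of each. The sketch below uses a made-up sample (not the counts in question 8), and the function name `allele_frequencies` is hypothetical.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q): frequencies of alleles A and a in a sample of diploids."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)  # two alleles per individual
    if total_alleles == 0:
        raise ValueError("the sample is empty")
    p = (2 * n_AA + n_Aa) / total_alleles
    return p, 1.0 - p


p, q = allele_frequencies(36, 48, 16)  # hypothetical sample of 100 individuals
print(round(p, 2), round(q, 2))
```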
9. To determine if the observed genotype frequencies are statistically
significantly different, we can use a goodness of fit (chi-square) test. Chi-square
analyses are used to test for differences between expected and observed values
and are used when the observed data fall into categories or classes (e.g., counts
of events in categories like the number of males wearing blue vs. red shirts, not
continuous measurements like weight or height; such measurements are
analyzed using different statistical procedures like t-tests).
The formula for the Chi-square test statistic is:
$\chi^2 = \sum_i \frac{(\text{Observed}_i - \text{Expected}_i)^2}{\text{Expected}_i}$
where $\text{Observed}_i$ is the count of individuals in a particular category i
(e.g., the number of boys wearing red shirts) and $\text{Expected}_i$ is the expected
count of individuals in category i. The numerator is squared so that negative and
positive values contribute equally to the test statistic. To get the chi-square
value, add up all of the resulting values for each category i.
To initiate the analysis of data using this test, enter the number of AA, Aa, and aa
individuals observed in your sample from the population in the table below. Next
calculate the number of expected AA, Aa, and aa individuals in a theoretical
sample of the same size (100 individuals) if the population were in
Hardy-Weinberg equilibrium (HWE). To do this, multiply 100 × the expected
frequency for each genotype under HWE using the values you entered in the
table for 7b.
A) Fill out the following table using data from questions 7 and 8 above and then
complete the equation below to calculate the chi-square statistic ($\chi^2$):
     | Observed (O) | Expected (E) | $(O-E)^2/E$
AA   |              |              |
Aa   |              |              |
aa   |              |              |
$\chi^2$ =
B) Next calculate the degrees of freedom for this data set. The equation for
calculating degrees of freedom for a chi-square goodness of fit test is:
d.f. = k - 1 - m
where k is the number of categories and m is the number of independent
parameters that we needed to use to calculate the expected number of
individuals in the different categories [1]. In this case our independent parameter
was one of the observed allele frequencies (only one allele frequency is
independent: p and q must add up to one, so once we know p we also know q).
So, for our data there are three categories (genotypes) minus 1 (because if we
know two of the expected genotype frequencies we already know the third one,
as all frequencies must sum to one) minus 1 (because we used one independent
allele frequency to calculate our expected genotype frequencies):
d.f. = 3 - 1 - 1
C) If the chi-square value we calculate from our data exceeds the critical
chi-square value for the appropriate degrees of freedom at a probability of
P = 0.05, then we can reject the null hypothesis. If you want to read more about
where these P values come from see any introductory statistics textbook or look
online at a site you trust (http://en.wikipedia.org/wiki/P-value). Look at the table
of critical chi-square values below and compare your calculated chi-square value
to the critical value given for P = 0.05 with 1 d.f. If the null hypothesis is that this
locus is in Hardy-Weinberg equilibrium, do you reject or fail to reject your null
hypothesis? Explain why you reached this conclusion, and compare it to the
answer you gave for 8b.
df | $\chi^2$, P = 0.05
1  | 3.84
2  | 5.99
3  | 7.82
4  | 9.49
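The decision rule in parts B and C can be sketched as a table lookup. The critical values are copied from the table above; the names `CRITICAL_05`, `degrees_of_freedom`, and `exceeds_critical_value`, and the statistic 4.0, are hypothetical and used only for illustration.

```python
# Critical chi-square values at P = 0.05, keyed by degrees of freedom.
CRITICAL_05 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49}


def degrees_of_freedom(k, m):
    """d.f. = k - 1 - m for k categories and m parameters used for the expectation."""
    return k - 1 - m


def exceeds_critical_value(stat, df):
    """True when the statistic is larger than the P = 0.05 critical value."""
    return stat > CRITICAL_05[df]


df = degrees_of_freedom(k=3, m=1)  # three genotypes, one allele frequency
print(df, exceeds_critical_value(4.0, df))
```

The same statistic can lead to different conclusions at different degrees of freedom, so it is worth settling the d.f. before reading the table.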
Assessment Question
In a recent study by Valenzano et al. (2009) [2], regions of the genome were mapped
to identify candidate genes that influenced male tail coloration in the African fish
Nothobranchius furzeri. Females of many species of fish are known to select
mates based on tail coloration. Assume that a future study used this information
to identify one of several genes that influenced one characteristic of the tail color
of males. There are two alleles found in the natural population, A and a.
Assume that the freq(A) = p, and freq(a) = q. Using this information, answer the
following questions.
1. Based on your knowledge of the Hardy-Weinberg theory, which outcome
below is consistent with the interpretation that this gene is not under sexual
selection by females?
a. p = 0.8, q = 0.2, freq(AA) = 0.1, freq(Aa) = 0.2, freq(aa) = 0.7
b. p = 0.7, q = 0.3, freq(AA) = 0.49, freq(Aa) = 0.19, freq(aa) = 0.32
c. p = 0.2, q = 0.8, freq(AA) = 0.7, freq(Aa) = 0.2, freq(aa) = 0.1
d. p = 0.90, q = 0.10, freq(AA) = 0.81, freq(Aa) = 0.18, freq(aa) = 0.01
e. p = 0.5, q = 0.5, freq(AA) = 0.90, freq(Aa) = 0.05, freq(aa) = 0.05
2. If the genotype frequencies in the population are the following, freq(AA) =
0.49, freq(Aa) = 0.42, freq(aa) = 0.09, what are the allele frequencies?
a. p = 0.52, q = 0.48
b. p = 0.48, q = 0.52
c. p = 0.32, q = 0.68
d. p = 0.90, q = 0.8
e. p = 0.70, q = 0.3
Guidelines for Implementation
Collect homework. Have students break up into groups, ideally of 3-4 students
each.
1. Provide each group with a bag of beans (e.g., light and dark kidney beans or
light and dark M&Ms - make sure there are only two colors) and plastic cups
(e.g., red solo cups) and make sure there are sharpies or other types of
markers available to label the cups. Each bag should contain at least 50 beans
in total, but each bag should have different starting allele frequencies.
2. Have students work through the module. As the students work, circulate and
assist them (but without giving them answers). When all the groups finish
question 4, work should stop to allow each group to present their results to the
class. It would be good to have all students prepare an overhead, make an Excel
graph for projection or construct some other visual aid so that each group can
present their results to the class. (The worksheet asks them to compare their
results to those of other groups to see what effect differences in starting allele
frequency have on the process of evolution by natural selection.)
3. If you are going to follow this module with the related module incorporating the
effects of migration, make sure that each student copies the data their group
obtained from the selection experiment into the table provided on the extra sheet.
They will need to bring this back to the laboratory next week to complete next
week's assignment.
Literature cited:
1. Whitlock, M.C. and Schluter, D. 2009. The Analysis of Biological Data.
Roberts and Co. Publishers.
2. Valenzano, D.R., Kirschner, J., Kamber, R.A., Zhang, E., Weber, D., et al.
2009. Mapping loci associated with tail color and sex determination in the
short-lived fish Nothobranchius furzeri. Genetics 183: 1385–1395.
Module Developers:
Please contact us if you have comments/suggestions/corrections
Kathleen Hoffman
Department of Mathematics and Statistics
University of Maryland Baltimore County
khoffman@math.umbc.edu
Jeff Leips
Department of Biological Sciences
University of Maryland Baltimore County
leips@umbc.edu
Sarah Leupen
Department of Biological Sciences
University of Maryland Baltimore County
leupen@umbc.edu
Acknowledgments:
This module was developed as part of the National Experiment in Undergraduate
Science Education (NEXUS) through Grant No. 52007126 to the University of
Maryland, Baltimore County (UMBC) from the Howard Hughes Medical Institute.