Uploaded by Imagine Rouse

Breeding Bunnies PreLab 2020 (1)

BIOL 102, CONCEPTS II
NAME: _____________________________________
Breeding Bunnies Pre-Lab: Allele Frequencies and Hardy-Weinberg Equilibrium
The Hardy-Weinberg Principle was developed to describe the genetic characteristics of a population
with no evolutionary forces acting on it (mating is random, there is no natural selection, no gene flow,
no mutations, population size is extremely large). As a result, for populations in Hardy-Weinberg
equilibrium, the frequency of alleles and genotypes does not change over time (across generations).
This also means that for any locus with two alleles, A (dominant allele) and a (recessive allele), if we
know their frequencies in the population (freq(A) = p, freq(a) = q), the diploid genotype frequencies can
be predicted from the allele frequencies as follows:
freq(AA) = $p^2$ (the AA homozygotes in the population)
freq(aa) = $q^2$ (the aa homozygotes in the population)
freq(Aa) = $2pq$ (the heterozygotes in the population)
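The three relationships above can be sketched in a few lines of Python. This is only an illustration: the function name `hardy_weinberg_frequencies` and the value p = 0.6 are assumptions chosen for the example, not part of the worksheet.

```python
def hardy_weinberg_frequencies(p):
    """Return expected genotype frequencies for a locus with two alleles.

    p is the frequency of allele A; q = 1 - p is the frequency of allele a.
    (Hypothetical helper written for illustration only.)
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p  # the two allele frequencies must add up to one
    return {
        "AA": p * p,      # p^2
        "Aa": 2 * p * q,  # 2pq
        "aa": q * q,      # q^2
    }


# Illustrative allele frequency only (assumed, not taken from the worksheet).
freqs = hardy_weinberg_frequencies(0.6)
for genotype, freq in freqs.items():
    print(genotype, round(freq, 4))
```

Because $p^2 + 2pq + q^2 = (p + q)^2 = 1$, the three genotype frequencies returned should always sum to one, which is a quick way to check a hand calculation.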
Consider a gene with two alleles, A and a, with frequencies p = 0.7 and q = 0.3, respectively.
1. Using this information answer the following questions:
a. Fill in the following table of expected genotype frequencies in this population:
Table 1
Genotype            | AA       | Aa       | aa
Expected frequency  | ________ | ________ | ________
b. Assume the population we are studying is a species of plant and we wish to determine allele
frequencies. If we randomly sampled 100 individuals from the population, how many would we
expect to have the A allele?
2. Now assume that we actually obtain the genotype for 100 members of the population and find the
following numbers:
AA: 20
Aa: 70
aa: 10
a. What are the observed genotype frequencies?
AA:
Aa:
aa:
b. Do you think the observed genotype numbers deviate significantly from those expected under Hardy-Weinberg? If so, what might be the cause? Explain your answer.
To determine if the observed genotype numbers we counted are statistically significantly different from
those expected we can use a goodness of fit (chi‐square) test. Chi-square analyses are used to test for
differences between expected and observed values when the observed data fall into categories or
classes (e.g., counts of events in categories, such as the number of males wearing blue vs. red shirts).
Continuous measurements like weight or height are analyzed using different statistical procedures,
such as t-tests.
The formula for the chi-square test statistic is:
$$\chi^2 = \sum_i \frac{(\text{Observed}_i - \text{Expected}_i)^2}{\text{Expected}_i}$$
where $\text{Observed}_i$ is the count of individuals in a particular category i (e.g., the number of boys wearing
red shirts) and $\text{Expected}_i$ is the expected count of individuals in category i. The numerator is squared so
that negative and positive values contribute equally to the test statistic. To get the chi-square value, add
up all of the resulting values for each category i.
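The summation described above can be sketched in Python as follows. The function name `chi_square_statistic` and the shirt counts are assumptions made up for this example; they are not data from the lab.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over all categories.

    observed and expected are sequences of counts (not frequencies),
    listed in the same category order. Hypothetical helper for illustration.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("every expected count must be greater than zero")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up example: 50 students, expecting equal numbers of red and blue shirts.
observed_counts = [30, 20]  # red, blue (assumed values)
expected_counts = [25, 25]
chi_square = chi_square_statistic(observed_counts, expected_counts)
print(chi_square)  # (30-25)^2/25 + (20-25)^2/25 = 1.0 + 1.0 = 2.0
```

Note that the function takes counts rather than frequencies, since the size of the statistic depends on the sample size.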
To begin the analysis, enter the number of AA, Aa, and aa individuals observed in
your sample from the population in the table below. Next, calculate the number of AA, Aa, and
aa individuals expected in a theoretical sample of the same size (100 individuals) if the population were in
Hardy-Weinberg equilibrium (HWE). To do this, multiply 100 by the expected frequency for each genotype
under HWE, using the values you entered in Table 1 for question 1a.
3. Fill out the following table using data from questions 1 and 2 above (use the counts, not the
frequencies), then calculate the chi-square statistic ($\chi^2$):
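Table 2
Genotype | Observed (O) | Expected (E) | $(O - E)^2 / E$
AA       | ____________ | ____________ | ____________
Aa       | ____________ | ____________ | ____________
aa       | ____________ | ____________ | ____________
$\chi^2$ = ____________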
Next calculate the degrees of freedom for this data set. The equation for calculating degrees of freedom
for a chi-square goodness of fit test is:
d.f. = k - 1 - m
where k is the number of categories and m is the number of independent parameters that we needed to
use to calculate the expected number of individuals in the different categories. In this case our
independent parameter was one of the observed allele frequencies (only one is independent, because
if we know p then we also know q, since the two must add up to one). So, for our data there are
three categories (genotypes), minus 1 (because if we know two of the expected genotype frequencies
we already know the third, as all frequencies must sum to one), minus 1 (because we used one
independent allele frequency to calculate our expected genotype frequencies):
d.f. = 3 - 1 - 1 = 1
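The degrees-of-freedom rule above is simple enough to check by hand, but a minimal Python sketch may help make the roles of k and m explicit. The function name `degrees_of_freedom` is an assumption made for this example.

```python
def degrees_of_freedom(k, m):
    """Degrees of freedom for a chi-square goodness-of-fit test.

    k: number of categories
    m: number of independent parameters used to calculate the expected counts
    (Hypothetical helper for illustration.)
    """
    df = k - 1 - m
    if df < 1:
        raise ValueError("too few categories for this many parameters")
    return df


# Three genotype categories and one independent allele frequency, as in the text.
df_genotypes = degrees_of_freedom(k=3, m=1)
print(df_genotypes)  # 3 - 1 - 1 = 1
```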
If the chi-square value we calculate from our data exceeds the critical chi-square value for our
degrees of freedom at P = 0.05 (meaning that P < 0.05), then we can reject the null hypothesis. If you want to
read more about where these P values come from, see any introductory statistics textbook or look online
at a site you trust (http://en.wikipedia.org/wiki/P-value). Look at the table of critical chi-square values
below and compare your calculated chi-square value ($\chi^2$ in Table 2 above) to the critical value given for P
= 0.05 with 1 d.f.:
d.f. | Critical $\chi^2$ (P = 0.05)
1    | 3.84
2    | 5.99
3    | 7.82
4    | 9.49
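The comparison against the table above can be sketched in Python. The critical values are copied from the table; the function name `exceeds_critical_value` and the two chi-square values tested are assumptions made up for illustration, not results from this lab.

```python
# Critical chi-square values at P = 0.05, copied from the table above.
CRITICAL_VALUES_P05 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49}


def exceeds_critical_value(chi_square, df):
    """Return True if chi_square is larger than the P = 0.05 critical value.

    Only the degrees of freedom listed in the table (1 to 4) are supported.
    (Hypothetical helper for illustration.)
    """
    if df not in CRITICAL_VALUES_P05:
        raise ValueError("no critical value tabulated for df = %r" % df)
    return chi_square > CRITICAL_VALUES_P05[df]


# Made-up chi-square values with 1 d.f. (not worksheet answers).
small_result = exceeds_critical_value(2.0, df=1)  # 2.0 is below 3.84
large_result = exceeds_critical_value(4.5, df=1)  # 4.5 is above 3.84
print(small_result, large_result)
```

A result of True corresponds to rejecting the null hypothesis at P < 0.05; a result of False means the data do not provide enough evidence to reject it.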
4. If the null hypothesis is that this locus is in Hardy-Weinberg equilibrium, do you accept or reject your
null hypothesis?
5. Explain why you reached this conclusion. Compare this to the answer you gave for 2b.