BIOL 102, CONCEPTS II                NAME: _____________________________________

Breeding Bunnies Pre-Lab: Allele Frequencies and Hardy-Weinberg Equilibrium

The Hardy-Weinberg Principle was developed to describe the genetic characteristics of a population with no evolutionary forces acting on it (mating is random, there is no natural selection, no gene flow, no mutation, and the population size is extremely large). As a result, for populations in Hardy-Weinberg equilibrium, the frequencies of alleles and genotypes do not change over time (across generations). This also means that for any locus with two alleles, A (dominant allele) and a (recessive allele), if we know their frequencies in the population (freq(A) = p, freq(a) = q), the diploid genotype frequencies can be predicted from the allele frequencies as follows:

    freq(AA) = $p^2$     the AA homozygotes in the population
    freq(aa) = $q^2$     the aa homozygotes in the population
    freq(Aa) = $2pq$     the heterozygotes in the population

Consider a gene with two alleles, A and a, with frequencies p = 0.7 and q = 0.3, respectively.

1. Using this information, answer the following questions:

   a. Fill in the following table of expected genotype frequencies in this population:

      Table 1
      Genotype              AA          Aa          aa
      Expected frequency    ______      ______      ______

   b. Assume the population we are studying is a species of plant and we wish to determine allele frequencies. If we randomly sampled 100 individuals from the population, how many would we expect to have the A allele?

2. Now assume that we actually obtain the genotypes of 100 members of the population and find the following numbers:

      AA: 20      Aa: 70      aa: 10

   a. What are the observed genotype frequencies?

      AA: ______      Aa: ______      aa: ______

   b. Do you think the observed genotype numbers deviate significantly from those expected under Hardy-Weinberg? If so, what might be the cause? Explain your answer.

To determine whether the observed genotype numbers we counted are statistically significantly different from those expected, we can use a goodness-of-fit (chi-square) test.
Chi-square analyses are used to test for differences between expected and observed values when the observed data fall into categories or classes (e.g., counts of events in categories, like the number of males wearing blue vs. red shirts, rather than continuous measurements like weight or height, which are analyzed using different statistical procedures such as t-tests). The formula for the chi-square test statistic is:

    $$\chi^2 = \sum_i \frac{(\mathrm{Observed}_i - \mathrm{Expected}_i)^2}{\mathrm{Expected}_i}$$

where $\mathrm{Observed}_i$ is the count of individuals in a particular category i (e.g., the number of boys wearing red shirts) and $\mathrm{Expected}_i$ is the expected count of individuals in category i. The numerator is squared so that negative and positive differences contribute equally to the test statistic. To get the chi-square value, add up the resulting values for all categories i.

To begin the analysis of data using this test, enter the number of AA, Aa, and aa individuals observed in your sample from the population in the table below. Next, calculate the number of expected AA, Aa, and aa individuals in a theoretical sample of the same size (100 individuals) if the population were in Hardy-Weinberg equilibrium (HWE). To do this, multiply 100 by the expected frequency for each genotype under HWE, using the values you entered in Table 1 for question 1a.

3. Fill out the following table using data from questions 1 and 2 above (use the count, not the frequency), then calculate the chi-square statistic ($\chi^2$):

      Table 2
      Genotype    Observed (O)    Expected (E)    $(O - E)^2 / E$
      AA          ______          ______          ______
      Aa          ______          ______          ______
      aa          ______          ______          ______
                                                  $\chi^2$ = ______

Next, calculate the degrees of freedom for this data set. The equation for calculating degrees of freedom for a chi-square goodness-of-fit test is:

    d.f. = k - 1 - m

where k is the number of categories and m is the number of independent parameters that we needed to use to calculate the expected number of individuals in the different categories.
In this case our independent parameter was one of the observed allele frequencies (only one of them is independent: if we know p then we also know q, because the two must add up to one). So, for our data there are three categories (genotypes), minus 1 (because if we know two of the expected genotype frequencies we already know the third one, as all frequencies must sum to one), minus 1 (because we used one independent allele frequency to calculate our expected genotype frequencies):

    d.f. = 3 - 1 - 1 = 1

If the chi-square value we calculate from our data exceeds the critical chi-square value for the appropriate degrees of freedom at P = 0.05, then the probability of seeing a deviation this large by chance is less than 0.05 and we can reject the null hypothesis. If you want to read more about where these P values come from, see any introductory statistics textbook or look online at a site you trust (http://en.wikipedia.org/wiki/P-value).

Look at the table of critical chi-square values below and compare your calculated chi-square value ($\chi^2$ in Table 2 above) to the critical value given for P = 0.05 with 1 d.f.:

      d.f.    Critical $\chi^2$ (P = 0.05)
      1       3.84
      2       5.99
      3       7.82
      4       9.49

4. If the null hypothesis is that this locus is in Hardy-Weinberg equilibrium, do you accept (fail to reject) or reject your null hypothesis?

5. Explain why you reached this conclusion. Compare this to the answer you gave for 2b.
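
For reference, the goodness-of-fit procedure described in this pre-lab can be sketched in Python. This is only a minimal sketch under stated assumptions: the allele frequency (p = 0.6) and the observed counts (40, 44, 16) are hypothetical values chosen for illustration and are not the data from questions 1 and 2, and the function names are made up here. The critical values are copied from the table in this handout.

```python
# Critical chi-square values at P = 0.05, copied from the table in this handout.
CRITICAL_05 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49}


def expected_counts(p, n):
    """Expected AA, Aa, aa counts under Hardy-Weinberg for a sample of n."""
    q = 1.0 - p  # only one allele frequency is independent
    return [n * p**2, n * 2 * p * q, n * q**2]


def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical example data (NOT the worksheet's numbers).
p = 0.6                   # assumed frequency of allele A
observed = [40, 44, 16]   # assumed counts of AA, Aa, aa

n = sum(observed)
expected = expected_counts(p, n)
chi2 = chi_square(observed, expected)

# d.f. = k - 1 - m, with k = 3 genotype classes and m = 1 allele frequency,
# following the reasoning given in this handout.
df = len(observed) - 1 - 1
reject = chi2 > CRITICAL_05[df]

print("expected counts:", expected)
print("chi-square:", round(chi2, 3), "d.f.:", df)
print("reject null hypothesis of HWE:", reject)
```

With these made-up numbers the statistic comes out to roughly 0.78, which is below the critical value of 3.84 for 1 d.f., so the null hypothesis would not be rejected. Substituting your own values from Tables 1 and 2 is one way to check your hand calculation.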