Section 3 Characterizing Genetic Diversity: Single Loci

Consider a gene with 2 alleles designated “A” and “a”. There are three genotypes: AA, Aa, aa.

Population of 100 individuals with the following genotypes: AA = 50, Aa = 30, aa = 20

Genotypic frequencies, general formula:

$f(AA) = N_{AA}/N = 50/100 = 0.5$
$f(Aa) = N_{Aa}/N = 30/100 = 0.3$
$f(aa) = N_{aa}/N = 20/100 = 0.2$

Allele frequencies: AA = 50, Aa = 30, aa = 20

Note: every individual carries two copies of the gene; thus, the total number of alleles is 2N. Let p = frequency of “A” and q = frequency of “a”.

The frequency of “A” is: p = (50 + 50 + 30)/200 = 0.65
The frequency of “a” is: q = (20 + 20 + 30)/200 = 0.35

Note: p + q = 1. An equivalent formula is:

$p = f(AA) + 0.5 f(Aa)$ and $q = 0.5 f(Aa) + f(aa)$

Hardy-Weinberg Equilibrium: under certain conditions, allele and genotypic frequencies will remain constant in a population from one generation to the next.

Assumptions of Hardy-Weinberg Equilibrium:
1. Organism in question is diploid
2. Reproduction is sexual
3. Generations are non-overlapping
4. Panmixia (random mating)
5. Population size is infinitely large, or at least large enough to avoid stochastic errors
6. Migration (immigration/emigration) is negligible
7. No mutation
8. Natural selection does NOT affect the gene under consideration

Hardy-Weinberg equilibrium is simple but provides the basis for detecting deviations from random mating, testing for selection, modeling the effects of inbreeding and selection, and estimating allele frequencies.

Consider a single autosomal locus in a diploid organism with discrete generations. Initially consider a locus with only two alleles “A” and “a” with initial frequencies “p” and “q”. Designate the frequencies of genotypes AA, Aa, and aa as P, H, and Q, respectively.

Random Union of Gametes: Many marine invertebrates release their gametes into the sea, and the gametes find one another and combine at random.
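The arithmetic above, and the genotype frequencies that random union of gametes would produce from those allele frequencies, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are made up for this example, and the counts are the 50/30/20 population used in these notes.

```python
def genotype_frequencies(n_AA, n_Aa, n_aa):
    """Return (f(AA), f(Aa), f(aa)) from genotype counts."""
    n = n_AA + n_Aa + n_aa
    return n_AA / n, n_Aa / n, n_aa / n


def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q). Each individual carries two gene copies, so there are 2N alleles."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    p = (2 * n_AA + n_Aa) / total_alleles   # every AA carries two A, every Aa carries one
    q = (2 * n_aa + n_Aa) / total_alleles   # every aa carries two a, every Aa carries one
    return p, q


def random_union(p, q):
    """Genotype frequencies (P, H, Q) expected when gametes combine at random."""
    return p * p, 2 * p * q, q * q


f_AA, f_Aa, f_aa = genotype_frequencies(50, 30, 20)
p, q = allele_frequencies(50, 30, 20)
P, H, Q = random_union(p, q)

print(f"f(AA)={f_AA:.2f}  f(Aa)={f_Aa:.2f}  f(aa)={f_aa:.2f}")
print(f"p={p:.2f}  q={q:.2f}")
print(f"P={P:.4f}  H={H:.4f}  Q={Q:.4f}")
```

Note that the observed genotype frequencies (0.5, 0.3, 0.2) differ from the random-union values (P, H, Q), which is the kind of discrepancy the Hardy-Weinberg test is designed to evaluate.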
Genotype frequencies from the random union of eggs and sperm:

                        Sperm
                        A (p)         a (q)
Eggs    A (p)           AA  p^2       Aa  pq
        a (q)           Aa  pq        aa  q^2

Note: $p^2 + 2pq + q^2 = (p + q)^2 = 1$

Testing for deviations from H.W.E.

H.W.E. serves as a null hypothesis and tells us what to expect if nothing interesting is happening. If we sample a population and find that the predictions of H.W.E. are not met, then we can conclude that one or more of the assumptions is violated.

Chi-square test of “Goodness of Fit”:

$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$

Example: You are studying a population of African elephants and assay the entire population (N = 260) for the ADH locus. You find that the population contains only two alleles (F and f) with the following genotypic counts: FF = 65, Ff = 125, ff = 70

Step 1: Determine allele frequencies:

p = freq(F) = (65 + 65 + 125)/520 = 0.4904
q = freq(f) = 1 - p = 1 - 0.4904 = 0.5096

Step 2: Calculate expected genotypic frequencies:

$P = p^2 = (0.4904)^2 = 0.2405$
$H = 2pq = 2(0.4904)(0.5096) = 0.4998$
$Q = q^2 = (0.5096)^2 = 0.2597$

Step 3: Calculate the chi-square statistic:

Genotype    Observed (O)    Expected (E)               (O - E)^2/E
FF (P)      65              0.2405 x 260 = 62.53       0.098
Ff (H)      125             0.4998 x 260 = 129.95      0.189
ff (Q)      70              0.2597 x 260 = 67.52       0.091
Total       260             260.00                     0.378

Step 4: Compare the calculated $\chi^2$ with the tabled $\chi^2$:

Degrees of freedom = 3 (# of genotypes) - 1 (constant) - 1 (# of parameters estimated) = 1

Look up critical values for the $\chi^2$ statistic:

            Level of significance
D.f.        0.05      0.01      0.001
1           3.84      6.64      10.83
2           5.99      9.21      13.82
3           7.82      11.34     16.27

The calculated $\chi^2$ (0.378) is less than the tabled value for 1 d.f. at the 0.05 level (3.84); therefore we fail to reject the null hypothesis.

Cautionary notes about testing for deviations from H.W.E.:

Caution 1: If we find that a population does not deviate from Hardy-Weinberg equilibrium, we cannot conclude that no evolutionary forces are operating.

Caution 2: The ability of the chi-square test to detect significant deviations from Hardy-Weinberg equilibrium is very weak.

Caution 3: Deviations from Hardy-Weinberg expectations give us no information about the kinds or directions of the evolutionary forces operating.
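The four steps of the elephant example can be reproduced with a short Python sketch. It uses only the standard library; the function name is hypothetical, and the critical value 3.84 (1 d.f., 0.05 level) is taken from the table in these notes rather than computed.

```python
def hwe_chi_square(n_FF, n_Ff, n_ff):
    """Chi-square goodness-of-fit statistic against Hardy-Weinberg expectations.

    Returns (p, q, expected_counts, chi2) for a two-allele locus.
    """
    n = n_FF + n_Ff + n_ff
    # Step 1: allele frequencies from genotype counts (2N alleles in total)
    p = (2 * n_FF + n_Ff) / (2 * n)
    q = 1.0 - p
    # Step 2: expected genotype counts under H.W.E.
    observed = (n_FF, n_Ff, n_ff)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    # Step 3: sum of (O - E)^2 / E over the three genotypes
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, q, expected, chi2


# Step 4: compare with the tabled critical value (1 d.f., 0.05 level)
CRITICAL_05_DF1 = 3.84

p, q, expected, chi2 = hwe_chi_square(65, 125, 70)
reject_null = chi2 > CRITICAL_05_DF1

print(f"p={p:.4f}  q={q:.4f}")
print("expected counts:", [round(e, 2) for e in expected])
print(f"chi-square={chi2:.3f}  reject H.W.E.? {reject_null}")
```

Because the code carries full precision through every step, its expected counts can differ from the hand calculation in the last decimal place; the conclusion (fail to reject) is unchanged.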
Deviations from H.W.E.

There are two types of non-random mating: those where mate choice is based on ancestry (inbreeding and crossbreeding) and those where choice is based upon genotypes at a particular locus (assortative and disassortative mating).

Inbreeding: Inbreeding is of major importance in conservation genetics as it leads to reduced reproductive fitness. When related individuals mate at a rate greater than expected under random mating, the frequency of heterozygotes is reduced relative to H.W.E. Avoidance of inbreeding, and crossbreeding, can lead to higher than expected heterozygosities.

Assortative and Disassortative Mating: The preferential mating of like-with-like genotypes is called “assortative” mating. The mating of unlike genotypes is referred to as “disassortative” mating. In general, assortative mating leads to increased homozygosity, while disassortative mating increases heterozygosity, relative to H.W. expectations.

Fragmented populations: Allele frequencies diverge in isolated populations due to chance and selection. This results in an overall deficiency of heterozygotes, even when the individual populations are themselves in H.W.E.

Linkage Disequilibrium: In large, randomly mating populations at equilibrium, alleles at different loci are expected to be randomly associated. Consider loci A and B with alleles A1, A2 and B1, B2, and frequencies pA, qA, pB, qB, respectively. These loci and alleles form gametes A1B1, A1B2, A2B1, and A2B2. Under random mating and independent assortment, these gametes will have frequencies that are the product of their allele frequencies, e.g., the frequency of A1B2 = pAqB.

Random association of alleles at different loci is referred to as “Linkage Equilibrium”. Non-random association of alleles among loci is referred to as “Linkage Disequilibrium”.

Chance events in small populations, population bottlenecks, recent mixing of different populations, and selection all may cause non-random associations among loci.
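As a small illustration of linkage equilibrium, the expected gamete frequencies are just products of the allele frequencies at the two loci. The sketch below assumes pA = pB = 0.7 purely as illustrative values; the function name is hypothetical.

```python
def equilibrium_gamete_frequencies(p_A, p_B):
    """Gamete frequencies expected when alleles at loci A and B associate at random.

    p_A is the frequency of allele A1, p_B the frequency of allele B1.
    """
    q_A, q_B = 1.0 - p_A, 1.0 - p_B
    return {
        "A1B1": p_A * p_B,
        "A1B2": p_A * q_B,
        "A2B1": q_A * p_B,
        "A2B2": q_A * q_B,
    }


expected = equilibrium_gamete_frequencies(0.7, 0.7)
for gamete, freq in expected.items():
    print(f"{gamete}: {freq:.2f}")
```

Any observed set of gamete frequencies that departs from these products indicates linkage disequilibrium.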
Loci that show deviations from linkage equilibrium in large randomly mating populations are often subject to strong forces of natural selection. In small populations, neutral alleles that have no selective differences between genotypes may behave as if they are under selection, due to non-random association with alleles at nearby loci that are being strongly selected.

Linkage disequilibrium is of importance in populations of conservation concern because:
- Linkage disequilibrium will be common in threatened species as their population sizes are small.
- Population bottlenecks frequently cause linkage disequilibrium.
- Evolutionary processes are altered when there is linkage disequilibrium.
- Functionally important gene clusters exhibiting linkage disequilibrium (such as MHC) are of major importance to the persistence of threatened species.
- Linkage disequilibrium is one of the signals that can be used to detect admixture of differentiated populations.
- Linkage disequilibrium can be used to estimate genetically effective population sizes.

Consider an example where two different monomorphic populations with genotypes A1A1B1B1 and A2A2B2B2 are combined and allowed to mate at random. Each autosomal locus is expected to attain individual H.W.E. in one generation. However, alleles at different loci do not attain linkage equilibrium frequencies in one generation; they only approach it asymptotically, at a rate dependent on the recombination frequency between the two loci.

In this example of the pooled population, assume:
- 70% of the pooled population is A1A1B1B1
- 30% of the pooled population is A2A2B2B2
- equal numbers of females and males of both genotypes

Only two gametic types are produced: A1B1, A2B2
Next generation: A1A1B1B1, A1A2B1B2, A2A2B2B2

These loci are clearly in linkage disequilibrium. In subsequent generations, the two other possible gametic types, A1B2 and A2B1, are generated by recombination in the multiply heterozygous genotype.
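The asymptotic approach to linkage equilibrium in this pooled population can be sketched numerically. The code below is a hedged illustration, not part of the original notes: it assumes the standard two-locus recursion under random mating, in which each generation the non-recombinant gametes A1B1 and A2B2 lose a frequency of cD and the recombinant gametes A1B2 and A2B1 gain cD, where D = (freq A1B1 x freq A2B2) - (freq A1B2 x freq A2B1) and c is the recombination rate. The value c = 0.5 (unlinked loci) is an assumption chosen for the example.

```python
def disequilibrium(r, s, t, u):
    """D for gamete frequencies r (A1B1), s (A1B2), t (A2B1), u (A2B2)."""
    return r * u - s * t


def next_generation(r, s, t, u, c):
    """Gamete frequencies after one generation of random mating.

    Assumed standard recursion: recombination at rate c moves a frequency
    of c*D from the A1B1/A2B2 gametes to the A1B2/A2B1 gametes.
    """
    d = disequilibrium(r, s, t, u)
    return r - c * d, s + c * d, t + c * d, u - c * d


def simulate(r, s, t, u, c, generations):
    """Return the list of gamete-frequency tuples for generations 0..generations."""
    history = [(r, s, t, u)]
    for _ in range(generations):
        r, s, t, u = next_generation(r, s, t, u, c)
        history.append((r, s, t, u))
    return history


# Pooled population from the example: gametes are 70% A1B1 and 30% A2B2.
history = simulate(0.7, 0.0, 0.0, 0.3, c=0.5, generations=10)
for gen, (r, s, t, u) in enumerate(history):
    d = disequilibrium(r, s, t, u)
    print(f"gen {gen:2d}: A1B1={r:.4f} A1B2={s:.4f} A2B1={t:.4f} A2B2={u:.4f} D={d:.4f}")
```

Under these assumptions the allele frequencies stay at 0.7 and 0.3 throughout, while the gamete frequencies move toward the products 0.49, 0.21, 0.21, and 0.09 without reaching them exactly.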
A1B1//A2B2 double heterozygotes produce the recombinant gametes A1B2 and A2B1 at a frequency of c/2 each, where c is the rate of recombination, and the non-recombinant gametes A1B1 and A2B2 at a frequency of (1 - c)/2 each. Eventually, all 9 possible genotypes will be formed and will attain their equilibrium frequencies. Until equilibrium is reached, genotypes will deviate from their expected frequencies.

Linkage disequilibrium is the deviation of gametic frequencies from their equilibrium frequencies. The measure of linkage disequilibrium, D, is the difference between the product of the frequencies of the A1B1 and A2B2 gametes (referred to as r and u) and the product of the frequencies of the A1B2 and A2B1 gametes (s and t):

$D = ru - st$

Gamete    Actual freq.    Equil. freq.
A1B1      r               pA pB
A1B2      s               pA qB
A2B1      t               qA pB
A2B2      u               qA qB
Total     1.0             1.0

Numerical Example: pA = 0.70, qA = 0.30, pB = 0.70, qB = 0.30

Gamete    Actual freq.    Equil. freq.
A1B1      0.70            0.7 x 0.7 = 0.49
A1B2      0.00            0.7 x 0.3 = 0.21
A2B1      0.00            0.3 x 0.7 = 0.21
A2B2      0.30            0.3 x 0.3 = 0.09

Disequilibrium: D = (0.7 x 0.3) - (0.0 x 0.0) = 0.21

Dmax = 0.25 and occurs when: r = 0.5, s = 0.0, t = 0.0, u = 0.5
Dmin = -0.25 and occurs when: r = 0.0, s = 0.5, t = 0.5, u = 0.0

Under equilibrium, ru = st and D = 0.

Many different measures of disequilibrium exist. Lewontin (1964) suggested D’, which is:

$D' = D / D_{max}$

where Dmax is the maximum D possible for a given set of allele frequencies at the two loci. Dmax is equal to the lesser of the equilibrium frequencies of A1B2 (pAqB) or A2B1 (qApB) if D is positive, or to the lesser of the equilibrium frequencies of A1B1 (pApB) or A2B2 (qAqB) if D is negative. The advantage of this measure is that it ranges from -1.0 to 1.0, regardless of the allele frequencies at the two loci.

                         Gamete freq.
Allele freq.             A1B1     A1B2     A2B1     A2B2     D         D’
A1 = B1 = 0.5            0.5      0.0      0.0      0.5      0.25      1.0
                         0.4      0.1      0.1      0.4      0.15      0.6*
                         0.25     0.25     0.25     0.25     0.0       0.0
                         0.1      0.4      0.4      0.1      -0.15     -0.6
                         0.0      0.5      0.5      0.0      -0.25     -1.0
A1 = B1 = 0.9            0.9      0.0      0.0      0.1      0.09      1.0
                         0.85     0.05     0.05     0.05     0.04      0.44
                         0.81     0.09     0.09     0.01     0.0       0.0
A1 = B2 = 0.9            0.0      0.9      0.1      0.0      -0.09     -1.0
                         0.05     0.85     0.05     0.05     -0.04     -0.44*
                         0.09     0.81     0.01     0.09     0.0       0.0
A1 = 0.1, B1 = 0.5       0.1      0.0      0.4      0.5      0.05      1.0
                         0.05     0.05     0.45     0.45     0.0       0.0

* Rows worked through in Examples 1 and 2.

Example 1: A1 = B1 = 0.5

                         A1B1     A1B2     A2B1     A2B2
Actual gametic freq.     0.4      0.1      0.1      0.4
Equilib. gametic freq.   0.25     0.25     0.25     0.25

D = (A1B1 x A2B2) - (A1B2 x A2B1)
D = (0.4 x 0.4) - (0.1 x 0.1) = 0.16 - 0.01 = 0.15
D’ = D/Dmax = 0.15/0.25 = 0.6 (D is positive, so Dmax is the lesser of 0.25 and 0.25)

Example 2: A1 = B2 = 0.9

                         A1B1     A1B2     A2B1     A2B2
Actual gametic freq.     0.05     0.85     0.05     0.05
Equilib. gametic freq.   0.09     0.81     0.01     0.09

D = (A1B1 x A2B2) - (A1B2 x A2B1)
D = (0.05 x 0.05) - (0.85 x 0.05) = 0.0025 - 0.0425 = -0.04
D’ = D/Dmax = -0.04/0.09 = -0.44 (D is negative, so Dmax is the lesser of 0.09 and 0.09)

Linkage disequilibrium decays as recombination produces the underrepresented gametes. The rate of decay depends upon the recombination frequency as follows:

$D_t = D_0 (1 - c)^t$

where $D_0$ is the initial disequilibrium, $D_t$ is the disequilibrium after t generations, and c is the recombination frequency. Linkage disequilibrium declines rapidly for unlinked loci (c = 0.5), with approximate linkage equilibrium reached in five generations. Conversely, decay of disequilibrium is slow for closely linked loci.

When linkage disequilibrium has been observed in a population, it has often been attributed to some type of multilocus selection. This assumption may not be valid because a number of other factors can affect linkage disequilibrium, including:
- recombination
- genetic drift
- mutation
- gene flow
- inbreeding

Expected heterozygosity (He) = Gene diversity:

For a single locus with two alleles, He = 2pq

When there are more than two alleles, it is simpler to calculate He as:

$H_e = 1 - \sum_{i=1}^{k} p_i^2$

where k = number of alleles and $p_i$ = frequency of the i-th allele.

If sample sizes are smaller than 50 individuals:

$H_e = \frac{2N}{2N - 1} \left( 1 - \sum_{i=1}^{k} p_i^2 \right)$

where N is the number of individuals sampled.
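The main calculations in this part of the notes (D, D’, and gene diversity) can be checked with a short Python sketch. It is illustrative only. The function names are hypothetical, and Dmax is computed from the equilibrium gamete frequencies (products of allele frequencies), which is the reading that reproduces the worked D’ values of 0.6 and -0.44. The three-allele frequencies (0.5, 0.3, 0.2) and the sample size of 20 are made-up inputs.

```python
def ld_d(r, s, t, u):
    """D = ru - st for gamete frequencies r (A1B1), s (A1B2), t (A2B1), u (A2B2)."""
    return r * u - s * t


def ld_d_prime(r, s, t, u):
    """Lewontin's D' = D / Dmax, with Dmax taken from equilibrium gamete frequencies."""
    d = ld_d(r, s, t, u)
    p_a, p_b = r + s, r + t            # frequencies of alleles A1 and B1
    q_a, q_b = 1.0 - p_a, 1.0 - p_b    # frequencies of alleles A2 and B2
    if d >= 0:
        d_max = min(p_a * q_b, q_a * p_b)
    else:
        d_max = min(p_a * p_b, q_a * q_b)
    return d / d_max if d_max > 0 else 0.0


def gene_diversity(allele_freqs, n_individuals=None):
    """He = 1 - sum(p_i^2); pass n_individuals to apply the 2N/(2N - 1) correction."""
    he = 1.0 - sum(p * p for p in allele_freqs)
    if n_individuals is not None:
        he *= 2 * n_individuals / (2 * n_individuals - 1)
    return he


# Example 1 and Example 2 from the notes
print(f"Example 1: D={ld_d(0.4, 0.1, 0.1, 0.4):.2f}  D'={ld_d_prime(0.4, 0.1, 0.1, 0.4):.2f}")
print(f"Example 2: D={ld_d(0.05, 0.85, 0.05, 0.05):.2f}  D'={ld_d_prime(0.05, 0.85, 0.05, 0.05):.2f}")

# Two alleles: He reduces to 2pq. Three alleles and N = 20 are hypothetical inputs.
print(f"He (p=0.65, q=0.35) = {gene_diversity([0.65, 0.35]):.3f}")
print(f"He (three alleles)  = {gene_diversity([0.5, 0.3, 0.2]):.2f}")
print(f"He (three alleles, N=20 corrected) = {gene_diversity([0.5, 0.3, 0.2], 20):.4f}")
```

The allele frequencies needed for Dmax are recovered from the gamete frequencies themselves (A1 = r + s, B1 = r + t), so the function needs only the four gamete frequencies as input.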
Gene diversity (He) is usually reported in preference to observed heterozygosity as it is less affected by sampling.

Conservation biologists are often concerned with changes in levels of genetic diversity over time, as loss of genetic diversity is one indication that the population is undergoing inbreeding and losing its evolutionary potential.

Heterozygosity is often expressed as the proportion of heterozygosity retained over time, $H_t/H_0$, where $H_t$ is the level of heterozygosity at generation t and $H_0$ is the level at some earlier time, referred to as time 0.

For example, $H_0$ may be the heterozygosity before a population crash and $H_t$ the heterozygosity after the crash. Then $1 - (H_t/H_0)$ reflects the proportion of heterozygosity lost as a result of the crash.
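A minimal sketch of the retained and lost proportions follows. The heterozygosity values 0.40 (before the crash) and 0.30 (after) are hypothetical numbers chosen only to show the arithmetic, and the function names are made up for this example.

```python
def heterozygosity_retained(h_t, h_0):
    """Proportion of heterozygosity retained, Ht/H0."""
    if h_0 <= 0:
        raise ValueError("H0 must be positive")
    return h_t / h_0


def heterozygosity_lost(h_t, h_0):
    """Proportion of heterozygosity lost, 1 - (Ht/H0)."""
    return 1.0 - heterozygosity_retained(h_t, h_0)


# Hypothetical values: H0 before a population crash, Ht after it
h_0, h_t = 0.40, 0.30
retained = heterozygosity_retained(h_t, h_0)
lost = heterozygosity_lost(h_t, h_0)
print(f"retained = {retained:.2f}, lost = {lost:.2f}")
```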