Quantitative Genetics
Hardy-Weinberg Equilibrium
Dr. S.A. Mohammadi
Department of Plant Breeding & Biotechnology, Faculty of Agriculture, University of Tabriz, Tabriz, Iran
mohammadi@tabrizu.ac.ir

The Hardy-Weinberg Principle (1908)
Five H-W equilibrium assumptions. If:
1. The population size is very large
2. Random mating is occurring
3. No mutation occurs
4. No selection occurs
5. No alleles transfer in or out of the population (no migration occurs)
➢ Then allele frequencies in the population will remain constant through future generations
➢ Genotype frequencies in progeny can be predicted from gene frequencies of the parents
➢ Equilibrium is attained after one generation of random mating

Simplifying Assumptions for Hardy-Weinberg Principle
1) diploid organisms
2) sexually reproducing organisms
3) generations are non-overlapping
4) all genotypes are equally viable
➢ If these simplifying assumptions are not met, the mathematics of the analysis becomes more complicated
➢ Whether or not these assumptions are all met, biologists can use Mendelian ratios and Hardy-Weinberg analysis to measure rates of evolution

H-W Equilibrium: Example
✓ 49 red-flowered RR
✓ 42 pink-flowered Rr
✓ 9 white-flowered rr
✓ [100 diploid individuals carry 200 alleles]
49 RR and 42 Rr offspring have (49 + 49 + 42 =) 140 R alleles
42 Rr and 9 rr offspring have (42 + 9 + 9 =) 60 r alleles
Freq. of R = 140/200 = 0.7 and freq. of r = 60/200 = 0.3
No allele frequency change in the F1

The Hardy-Weinberg Principle (1908)
Requirement of HW          Violation (evolution)
Large population size      Genetic drift
Random mating              Inbreeding & other
No mutations               Mutations
No natural selection       Natural selection
No migration               Migration
An evolving population is one that violates the Hardy-Weinberg assumptions

Steps of deduction in the proof of the H-W law, and the conditions that must hold
• Based on stable allele frequencies and random mating
• $p^2 + 2pq + q^2 = 1$

Hardy-Weinberg Equilibrium (1908)
               Genes in parents      Genotypes in progeny
               A1       A2           A1A1        A1A2        A2A2
Frequencies    p        q            P = $p^2$   H = $2pq$   Q = $q^2$
Example        0.4      0.6          0.16        0.48        0.36
➢ Note: for multiple alleles, expected genotype frequencies can be found by expanding the square of the sum of the allele frequencies, just as the binomial $(p_1 + p_2)^2$ is expanded for two alleles
➢ For example, for three alleles: $(p_1 + p_2 + p_3)^2 = p_1^2 + 2p_1p_2 + 2p_1p_3 + p_2^2 + 2p_2p_3 + p_3^2$

Hardy-Weinberg Equilibrium (1908)
Relationship between gene and genotype frequencies
[Figure: genotype frequencies of A1A1, A1A2 and A2A2 (y-axis, 0 to 1) plotted against the frequency of A2 (x-axis, 0 to 1)]
➢ f(A1A2) has a maximum of 0.5, which occurs when p = q = 0.5
➢ Most rare alleles occur in heterozygotes
➢ Implications for
   ➢ F1?
   ➢ F2?
   ➢ Any BC?

Applications of the Hardy-Weinberg Law
➢ Use the frequency of recessive genotypes to estimate the frequency of a recessive allele in a population
✓ Example: assume that the incidence of individuals homozygous for a recessive allele is about 1/11,000.
$q^2 = 1/11{,}000$, so $q \approx 0.0095$
➢ Estimate the frequency of individuals that are carriers of a recessive allele
$p = 1 - 0.0095 = 0.9905$
$2pq = 0.0188 \approx 2\%$
Number of carriers per 11,000 individuals = 0.0188 × 11,000 = 206.8 ≈ 207

Applications of the Hardy-Weinberg Law
✓ Frequency of the recessive genotype: $q^2 = 1/20{,}000 = 0.00005$
✓ Calculate the q value: $q = \sqrt{0.00005} = 0.007$
✓ Use the second equation, p + q = 1, to solve for p: p + 0.007 = 1, p = 0.993
✓ Use the p value to solve for $p^2$ and $2pq$: $p^2 = (0.993)^2 = 0.986$ and $2pq = 2(0.993)(0.007) = 0.014$
✓ Number of carriers per 20,000 individuals = 0.014 × 20,000 = 280

Testing for Hardy-Weinberg Equilibrium
All genotypes must be distinguishable
             Genotypes                        Gene frequencies
             A1A1       A1A2       A2A2       A1         A2
Observed     233        385        129        0.5696     0.4304
Expected     242.36     366.26     138.38
$N = N_{11} + N_{12} + N_{22} = 233 + 385 + 129 = 747$
$\hat{p}_1 = \frac{N_{11} + 0.5\,N_{12}}{N} = \frac{233 + 0.5 \times 385}{747} = 0.5696$
$E(N_{11}) = \hat{p}_1^2\,N = (0.5696)^2 \times 747 = 242.36$

Chi-square test for Hardy-Weinberg Equilibrium
$\chi^2 = \sum \frac{(Obs - Exp)^2}{Exp} = 1.96$, critical $\chi^2_{1\,df} = 3.84$
                 A1A1      A1A2      A2A2
O - E            -9.36     18.74     -9.38
(O - E)^2        87.61     351.19    87.98
(O - E)^2 / E    0.36      0.96      0.64
Σ((O - E)^2 / E) = 1.96 (calculated value of chi-square)
χ², 1 df = 3.84146 (critical value)
Prob > χ² = 0.16193
Only 1 df, because gene frequencies are estimated from the progeny data
➢ Accept H0: no reason to think that the assumptions for Hardy-Weinberg equilibrium have been violated
✓ does not tell you anything about the fertility of the parents
➢ When you reject H0, there is an indication that one or more of the assumptions is not valid
✓ does not tell you which assumption is not valid

Chi-square test for H-W Equilibrium: Example 1
In a population of 200 genotyped individuals, for a locus with an A/G SNP polymorphism, the numbers of individuals with AA, AG and GG are 40, 70 and 90, respectively. Is the population in H-W equilibrium for this locus?
✓ The proportional deficiency of heterozygotes (F) can be calculated from the observed heterozygosity (70/200 = 0.350) and the expected heterozygosity (2pq = 2(0.375)(0.625) = 0.469): F = (0.469 - 0.350) / 0.469 = 0.253.
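The calculation for Example 1 can be sketched in Python as a rough check. This is a minimal illustration, not the lecture's own worked solution: the function name `hwe_chi_square` and its return keys are hypothetical, and the 1-df p-value relies on the identity P(χ² > x) = erfc(√(x/2)).

```python
from math import erfc, sqrt

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for HWE at a two-allele locus.

    Hypothetical helper for illustration. Arguments are the observed
    counts of the first homozygote, the heterozygote and the second
    homozygote.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of the first allele
    q = 1 - p
    observed = [n_aa, n_ab, n_bb]
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = erfc(sqrt(chi2 / 2))    # upper tail of chi-square with 1 df
    h_obs = n_ab / n                  # observed heterozygosity
    h_exp = 2 * p * q                 # expected heterozygosity
    f = (h_exp - h_obs) / h_exp       # proportional deficiency of heterozygotes
    return {"p": p, "q": q, "expected": expected,
            "chi2": chi2, "p_value": p_value, "F": f}

example1 = hwe_chi_square(40, 70, 90)   # AA, AG, GG counts from Example 1
print(example1)
```

Under these assumptions the statistic for Example 1 comes out near 12.8, well above the 1-df critical value of 3.84, so H0 would be rejected. Applied to the 233/385/129 data from the earlier slide, the same function gives about 1.96.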
✓ A deficiency of heterozygotes is also called the inbreeding coefficient (F) if it is attributable to selective union of similar gametes and/or selective mating of similar genotypes.

Chi-square test for H-W Equilibrium: Example 2
For a locus with three alleles A, C and T, the numbers of individuals for the six possible genotypes are given in the table. Is the population in H-W equilibrium for this locus?
✓ The same principles used for a two-allele locus apply to the calculations for a three-allele locus.

Exact Test for Hardy-Weinberg Equilibrium
➢ Chi-square is only appropriate for large sample sizes
➢ If sample sizes are small or some alleles are rare, Fisher's exact test is a better alternative
$\Pr(N_{AA}, N_{Aa}, N_{aa} \mid n_A, n_a) = \frac{N!\; n_A!\; n_a!\; 2^{N_{Aa}}}{N_{AA}!\; N_{Aa}!\; N_{aa}!\; (2N)!}$
✓ Calculate the probability of all possible arrays of genotypes for the observed numbers of alleles
✓ Rank outcomes in order of increasing probability
✓ Reject those that constitute a cumulative probability of <5%

Exact Test for Hardy-Weinberg Equilibrium
Observed: 9 AA, 1 Aa, and 30 aa genotypes (N = 40, $n_A$ = 19, $n_a$ = 61)
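The enumeration for this example can be sketched in Python. This is a minimal illustration under the probability formula given above; the helper names `hwe_exact_table` and `hwe_exact_p_value` are hypothetical, and exact fractions are used only to avoid floating-point overflow in the factorials.

```python
from fractions import Fraction
from math import factorial

def hwe_exact_table(n_A, n_a):
    """Enumerate every genotype array compatible with the allele counts.

    Hypothetical helper: returns (N_AA, N_Aa, N_aa, probability) tuples
    sorted by increasing probability.
    """
    n = (n_A + n_a) // 2                      # number of diploid individuals
    rows = []
    # The heterozygote count must have the same parity as n_A
    for n_het in range(n_A % 2, min(n_A, n_a) + 1, 2):
        n_AA = (n_A - n_het) // 2
        n_aa = (n_a - n_het) // 2
        prob = Fraction(
            factorial(n) * factorial(n_A) * factorial(n_a) * 2 ** n_het,
            factorial(n_AA) * factorial(n_het) * factorial(n_aa) * factorial(2 * n),
        )
        rows.append((n_AA, n_het, n_aa, prob))
    return sorted(rows, key=lambda row: row[3])

def hwe_exact_p_value(n_AA, n_Aa, n_aa):
    """Sum the probabilities of all arrays no more probable than the observed one."""
    table = hwe_exact_table(2 * n_AA + n_Aa, 2 * n_aa + n_Aa)
    p_obs = next(row[3] for row in table if row[1] == n_Aa)
    return float(sum(row[3] for row in table if row[3] <= p_obs))

table = hwe_exact_table(19, 61)               # allele counts for 9 AA, 1 Aa, 30 aa
cumulative = Fraction(0)
for n_AA, n_het, n_aa, prob in table:
    cumulative += prob
    print(n_AA, n_het, n_aa, f"{float(prob):.4f}", f"{float(cumulative):.4f}")
print("exact p-value:", hwe_exact_p_value(9, 1, 30))
```

The printed rows are ordered by increasing probability, so the running total in the last column is the cumulative probability that is compared with the 5% threshold.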
N     N_AA   N_Aa   N_aa   n_A   n_a   Probability   Cumulative probability   Decision
40    9      1      30     19    61    0.0000        0.0000                   Reject H0
40    8      3      29     19    61    0.0000        0.0000                   Reject H0
40    7      5      28     19    61    0.0001        0.0001                   Reject H0
40    6      7      27     19    61    0.0023        0.0024                   Reject H0
40    5      9      26     19    61    0.0205        0.0229                   Reject H0
40    0      19     21     19    61    0.0594        0.0823                   Accept H0
40    4      11     25     19    61    0.0970        0.1793                   Accept H0
40    1      17     22     19    61    0.2308        0.4101                   Accept H0
40    3      13     24     19    61    0.2488        0.6589                   Accept H0
40    2      15     23     19    61    0.3411        1.0000                   Accept H0

Likelihood Ratio Test
$\Lambda = \frac{\ell(\hat{\Theta}_r \mid z)}{\ell(\hat{\Theta} \mid z)}$
• Numerator: maximum of the likelihood function given the data (z) when some parameters are assigned hypothesized values
• Denominator: maximum of the likelihood function given the data (z) when there are no restrictions
When the hypothesis is true:
$LR = -2 \ln \Lambda = -2\left[L(\hat{\Theta}_r \mid z) - L(\hat{\Theta} \mid z)\right] \sim \chi^2$, where $L = \ln \ell$ and df = number of parameters assigned values
Likelihood ratio tests for multinomial proportions are often called G-tests (for goodness of fit)
Lynch and Walsh, Appendix 4

Likelihood Ratio Test for HWE
$G = -2 \sum_{i=1}^{n} \sum_{j \ge i}^{n} N_{ij} \ln\left(\frac{\hat{N}_{ij}}{N_{ij}}\right)$
where $\hat{N}_{ij}$ is the expected number and $N_{ij}$ is the observed number of the ij-th genotype
Genotypes      A1A1      A1A2       A2A2
Observed       233       385        129
Expected       242.36    366.26     138.38
O*ln(E/O)      9.1768    -19.2115   9.0546
G = 1.96, Prob > χ² = 0.1615

Exercise
Consider a population of wildflowers in which flower color shows incomplete dominance:
✓ 320 red flowers (C^R C^R)
✓ 160 pink flowers (C^R C^W)
✓ 20 white flowers (C^W C^W)
➢ Examine the H-W equilibrium of this locus in the population under study

HW: Does it Matter to the Plant Breeder?
➢ All selection theory that underpins plant breeding is based on HW populations
➢ One classic example of an HW population is the Aztec farmer's field of open-pollinated (OP) maize
✓ Each plant has the opportunity to mate with any other plant
✓ OP varieties were commonly grown before hybrids took hold
✓ OP landraces are still common in developing countries
➢ The other classic example of an HW population is the F2 generation of a self-pollinated crop like wheat or soybean
✓ In the F2, the frequencies of the A and a alleles are both 0.5, and the frequencies of AA, Aa and aa are 0.25, 0.5 and 0.25, respectively
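The F2 frequencies quoted above follow directly from $p^2$, $2pq$ and $q^2$ with p = q = 0.5. A minimal Python sketch, assuming a single locus with two alleles (the function name `hw_genotype_freqs` is hypothetical):

```python
def hw_genotype_freqs(p):
    """Expected (AA, Aa, aa) frequencies after one generation of random mating.

    Hypothetical helper; p is the frequency of allele A and q = 1 - p
    is the frequency of allele a.
    """
    q = 1 - p
    return p * p, 2 * p * q, q * q

f2 = hw_genotype_freqs(0.5)        # F2 of a cross between two inbred parents
print(f2)                          # (0.25, 0.5, 0.25)

op_example = hw_genotype_freqs(0.4)  # the p = 0.4, q = 0.6 case from the earlier slide
print([round(x, 2) for x in op_example])
```

The second call covers the general case where the two alleles are not equally frequent, as in an open-pollinated population.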