BIO 140 Lecture 11
Introduction to Population Genetics: Genetic Variation

Genetics
• branch of biology concerned with heredity and variation

Population Genetics
• genetics at the population level
• branch of genetics which deals with the behavior of genes in populations
• the study of polymorphism and divergence

Important topics within population genetics
• Genetic structure of populations
• Hardy-Weinberg Equilibrium
• Variation in natural populations
• Forces that change gene frequencies
• Genetic variation in space and time
• Speciation and the role of genetics in conservation biology

Genetic Variation
• genetic heterogeneity in a population
• enables the species to adapt to future novel changes in the environment
• raw material for evolution

How is genetic variation measured?
• univariate and multivariate statistics
• use of isozymes
• use of other genetic markers such as RAPDs, RFLPs, AFLPs, minisatellites, microsatellites, SNPs, CNVs, and DNA sequences of mtDNA and nuclear markers

Isozymes
• functionally similar but separable forms of enzymes encoded by one or more loci

[Figures: Isozyme analysis of a monomeric enzyme, a dimeric enzyme, and a tetrameric enzyme. Each panel shows the banding pattern for genotypes FF, SS, and FS; F = fast allele, S = slow allele.]

How are isozymes used to describe population genetic structure?
▪ Allele frequency
▪ Average number of alleles per locus
▪ Percentage of polymorphic loci
▪ Individual heterozygosity
▪ Average heterozygosity over all loci
▪ Nei’s genetic distance coefficients

Measures of genetic variation using DNA markers
▪ Polymorphism = % of loci or nucleotide positions showing more than one allele or base pair.

mtDNA markers
▪ 2 rRNAs
▪ 22 tRNAs
▪ 13 proteins
▪ genome size: 16–17 kb
[Figure: mtDNA within the mitochondrion]

Measures of genetic variation using DNA markers (continued)
▪ Heterozygosity (H) = % of individuals that are heterozygotes.
▪ Allele/haplotype diversity = measure of the number and diversity of different alleles/haplotypes within a population.
▪ Nucleotide diversity = measure of the number and diversity of variable nucleotide positions within sequences of a population.
▪ Genetic distance = measure of the number of base pair differences between two homologous sequences.
▪ Synonymous/nonsynonymous substitutions = % of nucleotide substitutions that do not/do result in amino acid replacement.

Using sequence data: intrapopulation nucleotide diversity
• tells about the degree of nucleotide diversity among several sequences in a given region of the genome

Calculating intrapopulation nucleotide diversity
• $\pi_X$ measures the average weighted sequence divergence between haplotypes

Calculating haplotype diversity (H)
• a measure of the frequencies and number of haplotypes among individuals
• equivalent to the measure of allelic diversity within a locus
• ranges from 0 to 1

$$H = \frac{n}{n-1}\left(1 - \sum_{i} x_i^2\right)$$

$x_i$ = relative frequency of haplotype $i$
$n$ = sample size

Calculating intrapopulation haplotype diversity
In each example the sample size is n = 10.

Population 1: haplotype frequencies 0.5, 0.2, 0.1, 0.2
$$H_{\text{Population 1}} = \frac{10}{9}\left[1 - \left(0.5^2 + 0.2^2 + 0.1^2 + 0.2^2\right)\right] = 0.73$$

Population 2: f(seq1) = 0.2, f(seq2) = 0.2, f(seq3) = 0.2, f(seq4) = 0.3 (as listed, these frequencies sum to 0.9; the calculation uses them as given)
$$H_{\text{Population 2}} = \frac{10}{9}\left[1 - \left(0.2^2 + 0.2^2 + 0.2^2 + 0.3^2\right)\right] = 0.88$$

Population 3: f(seq1) = 0.9, f(seq3) = 0.1
$$H_{\text{Population 3}} = \frac{10}{9}\left[1 - \left(0.9^2 + 0.1^2\right)\right] = 0.2$$

Interpretation: H ≥ 0.5 is classed as large, H < 0.5 as small.

Calculating allelic frequencies
A hypothetical population consists of 1000 individuals. The genotypic frequencies for MN blood typing in the population are as follows:

Genotype    Frequency    Rel. freq.
MM          300          0.3
MN          600          0.6
NN          100          0.1

Compute the allelic frequencies.
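As a rough check on this exercise, the following Python sketch computes the allelic frequencies for the MN genotype counts of this exercise (300 MM, 600 MN, 100 NN) in two ways: by counting alleles, and from relative genotypic frequencies. The function names `allele_frequencies` and `allele_frequencies_from_relative` are illustrative assumptions, not part of the lecture, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def allele_frequencies(n_mm, n_mn, n_nn):
    """Return (p, q), the frequencies of alleles M and N, from genotype counts."""
    total = n_mm + n_mn + n_nn
    if total == 0:
        raise ValueError("need at least one individual")
    # Each diploid individual carries two alleles: a homozygote contributes
    # two copies of its allele, a heterozygote one copy of each.
    p = Fraction(2 * n_mm + n_mn, 2 * total)
    q = Fraction(2 * n_nn + n_mn, 2 * total)
    return p, q


def allele_frequencies_from_relative(f_mm, f_mn, f_nn):
    """Return (p, q) from relative genotypic frequencies that sum to 1."""
    # Homozygote frequency plus half of the heterozygote frequency.
    p = f_mm + f_mn / 2
    q = f_nn + f_mn / 2
    return p, q


# Counts from the MN blood group exercise.
p, q = allele_frequencies(300, 600, 100)
print(float(p), float(q))  # 0.6 0.4

# Same data expressed as relative frequencies 0.3, 0.6, 0.1.
p_rel, q_rel = allele_frequencies_from_relative(
    Fraction(3, 10), Fraction(6, 10), Fraction(1, 10)
)
print(float(p_rel), float(q_rel))  # 0.6 0.4
```

Both routes give p = 0.6 and q = 0.4, and p + q = 1, which is a quick sanity check on any allele-frequency calculation at a two-allele locus.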
A. Counting the number of each allele

With 300 MM, 600 MN, and 100 NN individuals (1000 in total):

$$f(M) = p = \frac{2\,f(\text{homozygous for } M) + f(\text{heterozygotes})}{2\,(\text{total no. of individuals})} = \frac{2(300) + 600}{2(1000)} = 0.6$$

$$f(N) = q = \frac{2\,f(\text{homozygous for } N) + f(\text{heterozygotes})}{2\,(\text{total no. of individuals})} = \frac{2(100) + 600}{2(1000)} = 0.4$$

B. Using relative genotypic frequencies

Genotype    Frequency    Rel. freq.
MM          300          0.3
MN          600          0.6
NN          100          0.1

$$\begin{aligned}
f(M) = p &= \frac{2\,f(\text{homozygous for } M) + f(\text{heterozygotes})}{2\,(\text{total no. of individuals})} \\
&= \frac{2\,f(\text{homozygous for } M)}{2\,(\text{total no. of individuals})} + \frac{f(\text{heterozygotes})}{2\,(\text{total no. of individuals})} \\
&= \frac{f(\text{homozygous for } M)}{\text{total no. of individuals}} + \frac{1}{2}\,\frac{f(\text{heterozygotes})}{\text{total no. of individuals}} \\
&= \text{rel. freq. of homozygous for } M + \frac{1}{2}\,(\text{rel. freq. of heterozygotes}) \\
&= 0.3 + \frac{1}{2}(0.6) = 0.6
\end{aligned}$$

$$\begin{aligned}
f(N) = q &= \frac{2\,f(\text{homozygous for } N) + f(\text{heterozygotes})}{2\,(\text{total no. of individuals})} \\
&= \frac{2\,f(\text{homozygous for } N)}{2\,(\text{total no. of individuals})} + \frac{f(\text{heterozygotes})}{2\,(\text{total no. of individuals})} \\
&= \frac{f(\text{homozygous for } N)}{\text{total no. of individuals}} + \frac{1}{2}\,\frac{f(\text{heterozygotes})}{\text{total no. of individuals}} \\
&= \text{rel. freq. of homozygous for } N + \frac{1}{2}\,(\text{rel. freq. of heterozygotes}) \\
&= 0.1 + \frac{1}{2}(0.6) = 0.4
\end{aligned}$$

Interpreting haplotype and nucleotide diversities

Small H (< 0.5), small π (< 0.5%)
▪ Recent population bottleneck
▪ Founder event by a single or a few mtDNA lineages

Large H (≥ 0.5), small π (< 0.5%)
▪ Population bottleneck followed by rapid population growth and accumulation of mutations

Small H (< 0.5), large π (≥ 0.5%)
▪ Divergence between geographically subdivided populations

Large H (≥ 0.5), large π (≥ 0.5%)
▪ Large stable population with long evolutionary history
▪ Secondary contact between differentiated lineages
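The haplotype diversity formula from this lecture, H = n/(n − 1) · (1 − Σ x_i²), and the lecture's cut-off of 0.5 between small and large H can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the names `haplotype_diversity` and `classify_h` are assumptions introduced here, and the frequencies are the ones used in the three worked populations with n = 10.

```python
def haplotype_diversity(freqs, n):
    """Haplotype diversity H = n/(n-1) * (1 - sum of squared frequencies).

    freqs: relative frequency of each haplotype in the sample
    n:     sample size (number of sequences sampled)
    """
    if n < 2:
        raise ValueError("sample size n must be at least 2")
    return n / (n - 1) * (1 - sum(x * x for x in freqs))


def classify_h(h, threshold=0.5):
    """Label H as 'large' (>= threshold) or 'small', using the lecture's 0.5 cut-off."""
    return "large" if h >= threshold else "small"


# Frequencies as they appear in the lecture's worked examples.
populations = {
    "Population 1": [0.5, 0.2, 0.1, 0.2],
    "Population 2": [0.2, 0.2, 0.2, 0.3],  # as listed in the slides (sums to 0.9)
    "Population 3": [0.9, 0.1],
}

for name, freqs in populations.items():
    h = haplotype_diversity(freqs, n=10)
    print(f"{name}: H = {h:.2f} ({classify_h(h)})")
# Population 1: H = 0.73 (large)
# Population 2: H = 0.88 (large)
# Population 3: H = 0.20 (small)
```

The n/(n − 1) factor corrects for small sample size, so it matters most for small samples like these and approaches 1 as n grows.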