4/12/23, 2:11 PM BIOL 4160 03 Variation
BIOL 4160 Evolution
Phil Ganter
301 Harned Hall
963-5782
Pollen newly released from the maturing flowers on this spathe
Genetic (and some Phenotypic) Variation
Types of Gene Variation
Chromosomal and Genomic Variation
Mutation, Variation and Randomness
Recombination and Variation
Hardy-Weinberg
Variation Within Populations
Variation Between Populations
https://ww2.tnstate.edu/ganter/BIO416 03 Variation.html

Types of Gene Variation
Structure of Genetic Material
  Gene Structure
    Alleles are varieties of Genes found at a particular Locus in the Genome
    But the term gene is used loosely by many
    The concept of the gene has expanded and altered well beyond the one-gene, one-protein hypothesis
  Types of Genes
    Protein Coding Genes
      Coding Sequence and Exons
        Alternative Splicing - multiple exons can be combined in various ways to produce multiple proteins from one gene
        Exon Shuffling - constructing new genes by combining exons from different loci
      Introns (INTRagenic regiON - Walter Gilbert) - present in eukaryotic pre-mRNA (also called hnRNA - heterogeneous nuclear RNA)
        Four classes
          Self-Splicing Group I and Self-Splicing Group II Introns
            both are important in organelle genes, and Group I is important in rRNA (ribosomal RNA) processing
            both splice spontaneously; Group I requires a guanine nucleoside (guanosine) as a cofactor
          Spliceosomal Introns and Enzymatically-spliced Introns
            The Spliceosome is a complex of proteins and RNAs; its mechanism is related to Group II self-splicing
            The enzymatically spliced mechanism is not like the other splicing mechanisms
      Enhancer - region(s) where Transcription Factors (both activators and repressors) bind to regulate RNA polymerase binding
        can be a long way from the rest of the gene, as the intervening DNA can loop and bring the enhancer site close to the promoter
      Promoter - site where RNA
polymerase binds to DNA
      Poly-Adenylation Addition Site - binds the cleavage complex (that cleaves the RNA) and polyadenylate polymerase (that adds up to 250 or so A's)
      Mature mRNA is stabilized by the 5'-GTP Cap and the 3' poly-A tail
    Non-Coding RNA Genes (only a partial listing)
      Ribosomal RNA (rRNA) Genes for RNAs found in the ribosome
        Cleaved from a single RNA molecule, and the gene for this large molecule is Tandemly Repeated
      Transfer RNA (tRNA) Genes for binding to Amino Acids
      Telomerase RNA - part of the complex responsible for building telomeres
      RNA Genes for RNAs involved in gene regulation
        Antisense RNA (aRNA) Genes that bind to mRNAs
        MicroRNA (miRNA) Genes
          average only 19-25 bp, are found in eukaryotic cells only, and are post-transcriptional regulators that bind to complementary sequences on mRNA
          less than 100% complementarity, so they can bind to multiple transcripts (gene silencing)
          the human genome has about 1000 miRNAs
        Small Interfering RNA (siRNA) Genes that help regulate protein production in most eukaryotes through the RNA interference (RNAi) pathway (21-22 bp)
          100% complementarity, and so they target specific genes
          usually act by cleaving mRNA (gene silencing) and may be important immunity genes in organisms without cell-mediated immunity (plants, non-mammal animals)
        Long NC RNA Genes are regulatory but act in multiple ways
        Several RNA species that seem to bind to invasive nucleic acids (piRNA against transposon activity is an example)
      untranslated small RNAs
  Genome Structure
    Genomes are still being explored and much data is only preliminary
    Bioinformatics is a series of mathematical tools and programs used in the exploration
    Annotation is the process of identifying gene sequences
    Genome size varies greatly
      Prokaryote genomes are usually less than 10 Mbp (mega base pairs, or 1 million base pairs), but eukaryotic genome sizes can be over 100 billion base pairs (100 Gbp)
    Synteny is the
order of the loci on chromosomes (actually means that the same loci should be found on homologous chromosomes)
      Synteny is altered when translocations and duplications alter genes
    Single Copy sequences
      Single copy sequences include both Protein-Coding Genes and Non Protein-Coding Genes (some small amount of repetition may occur here)
    Repetitive DNA sequences
      rDNA Tandem Repeats (ITS = Internal Transcribed Spacer), multiple copies of tRNA genes
      Microsatellites - short repeated sequences (microsatellites = simple sequence repeats or SSRs) of 2-8 base pairs in tandem repeats (# of repeats variable), used to map an allele
    Autonomously Replicating Sequences (large diverse group, only some discussed here)
      Transposons
        lots of them, some are DNA and some are RNA
        Can cause mutations by inserting into a gene or a gene's control regions
        Retrotransposons are most common and code for several proteins (including Reverse Transcriptase or RNA-dependent DNA Polymerase)
      LINES (long interspersed nuclear elements) - repeated sequences 1000 to 5000 bp in length and with up to millions of copies
        Some have both Reverse Transcriptase and Integrase genes and can replicate like any other transposon, and some have lost that function and no longer replicate themselves
        Differ from transposons in that they do not move from place to place in the genome but make copies of themselves and, when the copies integrate, the genome size increases
        Most copies lose functionality, so they do not copy themselves
        Evidence for evolution of a balance between LINE expansion and the negative effect of so much unproductive DNA is scarce
        There is evidence that some genes resulted from integration of a LINE (or a SINE) into a pre-existing gene, so LINES and SINES may have a role in producing genes with novel functions
      SINES (shorter than LINES) are less than 500 bp long and can occur in millions of copies in some genomes
        SINES never code for RTase and so only spread in the genome when other transposons or LINES provide the means to do so
        ALU elements are the most common repeat in humans - they contain an AluI restriction enzyme site (hence the name) and were derived by mutation from an RNA that functions in the signal-recognition particle (part of the mechanism for targeting mRNAs to the endoplasmic reticulum)
        ALU insertions are implicated in some human diseases
      Latent Viruses
        It's a continuum from viruses that never integrate and always have protein coats to latent viruses and virus-like sequences that replicate
    Gene families (based on protein families)
      A set of related loci formed by local gene duplications
      Usually have similar functions (LDH family, Hemoglobin family)
      Pseudogenes are genes that have lost function through mutation
        may be important in gains of novel functions, as they can accumulate mutations while not functional and eventually mutate back to functionality
        many gene families have related pseudogenes (Hemoglobin family)
Mutations
  Point / Indel Mutations and Third Codon Degeneracy (Transversions, Transitions, Synonymous and Non-Synonymous, Frame Shift, Sequence Termination by creation of a Stop Codon)
    INDEL (insertion or deletion) mutations often mean loss or alteration of function
    Single-Nucleotide Polymorphisms (SNPs) are useful mapping markers and can label individuals
    Neutral Mutations have no effect on the fitness of the organism
      Synonymous nucleotide substitutions (usually 3rd position of the codon) are neutral because there is no alteration of the phenotype
      If a mutation alters a protein, it may do so in such a way as to not alter the protein's function, so this would also be a neutral mutation
      Neutrality is probably the most common outcome
    Deleterious Mutations decrease fitness
    Pleiotropic effects - a gene that affects more than one phenotypic trait (eye color mutations of Drosophila exhibit
pleiotropy)
    Beneficial mutations occur at a low rate, but this is expected in Darwin's gradualist view of evolution
    Epistasis is when two or more loci interact in their effects on a phenotype
  Recombination Mutations
    Gene conversion - using one allele to alter the other allele so that it is identical to the first (the second has been converted to the first)
      happens during crossing-over and is caused by mismatch repair of incorrectly paired DNA strands
    Intragenic Recombination - crossing-over of short sequences within a larger gene sequence
    Unequal Crossing Over causes one DNA strand to lose a section and the other to gain a section
      the DNA strand that gained can then have a Gene Duplication
      Most common where there already is a tandem repeat (misalignment of repeats)
      Probably responsible for much of the amplification of LINES and SINES
    Transposons and other repeated sequences can cause Rearrangements if paralogous copies align during synapsis
  Forward and Back Mutations
    These terms are "pre-sequencing," from when all gene mutations were detected by changes in phenotype
    Forward mutations are more common than back mutations (many things can change an allele, but only a very few specific changes will restore it)
  Variation in gene structure vs variation in gene expression
    variation in both leads to phenotypic variation
  Mutation rate (usually point mutations)
    Estimated in lab experiments or by comparison of orthologs in different lineages (usually different species)
      Need to reduce the chance that natural selection has altered the rate and measure just the changes caused by random error
      Concentrate on pseudogenes, other untranscribed sequences, and four-fold degenerate sites
    Key to this is to understand that:
      Each substitution produces two alleles at that site
      One is very common (wild type) and the other is represented by one copy in the population or species
      The chance that mutation will become
so common that it replaces the original allele is, if only random chance is involved, equal to the rate at which mutations arise (more on this when we cover genetic drift)
      So, if we count the number of replacements, we can estimate the mutation rate
      There is a need to correct for the chance that, at any site that mutated, a second mutation occurred that changed it to another base or restored the original base
    Empirical measurements indicate that mutation rates vary among lineages, loci, and even within loci
      Remember that time here is difficult to compare between different organisms
        Different generation times mean that, for a given number of years, some lineages will have more opportunities to mutate (which happens when the genomes are replicated)
        Only want to consider germ line cells, not somatic cells
      That said, the variation in rate (per base pair per replication) is not so large (probably due to the similarity of the replication process in all organisms)
        the chance of mutating is about 0.3 x 10^-10 to 6 x 10^-10 per base pair per replication
        If we assume a rate of 0.3 x 10^-10 (the low end of that range) for our genome, then each time we replicate it (7 x 10^9 base pairs in our diploid genome) we expect 0.3 x 10^-10 x 7 x 10^9 = about 0.2 mutations, so there is only about a 1 in 5 chance of getting any changes at all
        But, from the zygote to the egg or sperm, there are at least 100 replications, and so we can expect at least 20 mutations in each gamete
    Mutagens are chemicals that alter the chance of mutations (always increase the chance), which is why we try to restrict releasing some chemicals into our environment

Chromosomal Variation
  Ploidy Changes
    Aneuploidy, Polyploidy (Allopolyploidy, Autopolyploidy)
  Inversions
  Translocations
  Fission and Fusion
  Karyotype Variation
Figure 1 - Karyotypes of 36 strains of an asexual yeast, Candida sonorensis, showing the sorts of extreme karyotype variation found with asexual "species."
It is often hard to be sure that a yeast is truly asexual, but it is hard to see how synapsis during meiosis could be achieved between some of the strains of this yeast. There are no known phenotypic differences between the strains listed below. This raises two related questions. First, are we missing important parts of the phenotype? If not, where did the extra DNA in the larger genomes come from and what does it do?

Mutation, Variation, and Randomness
  Variation is ultimately the outcome of mutation
  Mutation is a random process
    Thus, mutations do not happen when they will help an organism - there is no "directed mutation"
    However, development (and other constraints) may mean that not all conceivable phenotypes are possible
    Recent challenges to this assertion have been shown to be wrong and have reinforced the assumption of randomness in mutations
  Mutation rate is not a random effect
    Different lineages have different rates of synonymous mutation
    Different regions of the genome have different mutation rates
    Environmental conditions can alter mutation rate
  Although mutations are random, variation is the outcome of many mutations and is predictable!
    Natural selection (in cases where a single allele or combination of alleles is favored), genetic drift, and inbreeding will work to reduce variation in a population, which can only be replenished by migration of new alleles or by mutation
  Problem: When lab populations of animals are subject to artificial selection, the most common response is that the character being selected changes in the direction encouraged by selection. So, one can reduce the number of facets (individual visual cells) in the eye of a Drosophila by selecting for this, allowing only the flies with the fewest facets to contribute eggs to the next generation.
Wim Scharloo did a laboratory selection experiment with a population of 1,000 flies. After several generations of selection during which the character selected changed, change stopped. Prof. Scharloo then made one change in the experiment. He increased the population size from 1,000 to 10,000. Selection almost immediately became effective again and the character continued to change. Why did the expansion of the population restart the evolutionary process?
  Mutation and Fitness
    Neutral mutations are those that do not affect fitness (no matter why)
    Pleiotropy
      when the output from a particular locus affects more than one character, the gene's effects are said to be pleiotropic
      a mutation's effect must be assessed for all characters affected by that locus, as the mutation's effects may differ by character affected
    For those mutations that increase or decrease fitness, remember that fitness is not the property of an allele, but the outcome of an allele in a particular environment (both physical and biological environment)
      A mutation that alters coloration of prey is only as important as the risk of being eaten
    It is assumed that it is easier to harm a complex machine by randomly changing its parts than to improve it by random change (mutation), so harmful mutations are expected to be more common than beneficial mutations
      Text example of bacterial evolution in which 1 in 150 mutations were beneficial (and the average fitness increase was 3%) is surprising for how many beneficial mutations occurred and for how beneficial they were
  Phenotypic Variation
    This is an extensive subject and all we will do here is to point out some basic relationships between genetic variation and phenotypic variation
    Phenotypic variation is the degree of difference between the physical characteristics of related organisms
    Sources of Phenotypic Variation
      Genetic Variation - discussed previously
      Environmental Variation - differences among individuals due to the influence of their environment (including their biotic environment)
        Usually measured by measuring phenotypic differences when the genotype is held constant
      Developmental Noise - the differences in individuals of the same genotype raised under identical environmental conditions
      Maternal Effects - these are differences caused by non-genetic influence of the mother on her offspring
        Variation among ova (not DNA, but differences in the stocking of the egg with energy and food resources, specific proteins and RNAs)
        Variation in mother's condition when producing eggs or carrying offspring
        Variation in maternal care (this can be due to the father as well)
      Epigenetic Inheritance - differences in genetic expression of a locus not based on sequence differences among alleles
        Liver cells, in culture, undergo mitosis but produce only liver cells - they do not revert to zygote or stem cell status
        Genetic imprinting - dealt with in Evo-Devo chapter
    Describing Phenotypic Variation - the measure used is Variance (from statistics) in a character, which is a measure of the deviation of individuals from the mean character value (assumes one can use numbers to measure the character)
      At the simplest level, the variance in a trait within a population or species can be divided into two additive portions:
        Phenotypic Variation (VP) = Genetic Variation (VG) + Environmental Variation (VE)
      Phenotypic plasticity - a single genotype often can produce more than one phenotype if the environment in which the organism develops changes - this is the Reaction Norm of that genotype (all possible phenotypes from a single genotype)
      In this case (which may be the usual case), we must alter the partitioning of phenotypic variance:
        Phenotypic Variation (VP) = (VG) + (VE) + Genotype x Environment Interaction (VGxE)

Recombination and Variation
  Parasexual recombination (conjugation, transduction, transformation)
    Horizontal versus Vertical Transmission
  Recombination at the molecular level
    Homologous and Non-homologous
  Sexual recombination
    combinations of genes are not preserved unless the genes are closely linked (no linkage is ever tight enough to completely prevent recombination)
    Recombination can be intergenic
    Recombination produces new combinations of genes each generation
    To preserve favorable combinations of genes, some other process must operate (positive assortative mating is one possibility)
  Linkage
    Physical linkage means that the loci are close enough on a chromosome that they are likely to be inherited together
    If two loci each have two alleles in a population and the proportion of each allele is 0.50, then unlinked genes should be in Linkage Equilibrium
      in this case, 25% AB, 25% Ab, 25% aB, and 25% ab
    Linkage Disequilibrium is a significant departure from the proportions expected from linkage equilibrium
      In the case above, if Ab is one chromosomal type in the population and aB is the other (and no recombination occurs because the linkage is so tight), you get 50% Ab and 50% aB (no recombinant allele pairings [AB or ab] are formed)
      Linkage disequilibrium, then, is a measure of the inhibition of recombination and indicates that some evolutionary process may be affecting the outcome of recombination (assortative mating, selection, etc.) in addition to simple physical linkage
    Hitchhiking - when one allele is changing frequencies due to selection (for or against), neighboring alleles may also change if closely linked

Hardy-Weinberg
  Variation is a population-level phenomenon (emergent property of populations) and a necessary condition for evolution
  What should we expect to happen over time when variation exists in a population?
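One way to explore that question numerically is the minimal sketch below. It assumes a single locus with two alleles (A and a), random mating, and no mutation, migration, drift, or selection. The function names `hardy_weinberg` and `next_generation_p` are illustrative only and do not come from the course text.

```python
def hardy_weinberg(p):
    """Expected genotype proportions (AA, Aa, aa) when allele A has frequency p."""
    q = 1.0 - p                      # frequency of the other allele, a
    return p * p, 2.0 * p * q, q * q


def next_generation_p(genotypes):
    """Frequency of allele A among the gametes made by these genotype proportions."""
    freq_AA, freq_Aa, _freq_aa = genotypes
    # AA parents pass on only A; Aa parents pass on A half of the time.
    return freq_AA + freq_Aa / 2.0


if __name__ == "__main__":
    p = 0.7                          # assumed starting frequency, chosen for illustration
    for generation in range(5):
        genotypes = hardy_weinberg(p)
        p = next_generation_p(genotypes)
        print(generation, [round(g, 4) for g in genotypes], round(p, 4))
```

Under these assumptions the allele frequency fed in comes back out each generation, so the genotype proportions do not change. That is the baseline against which the effects of mutation, migration, non-random mating, small population size, and selection are judged.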
  Hardy-Weinberg expectations are predictions of future population variation when that variation is not altered by ecological or statistical processes
    p^2 + 2pq + q^2 = 1
  H-W Assumptions - Hardy-Weinberg predicts no change but is only accurate if its 5 assumptions are met. Below we list the assumptions and discuss what happens when each assumption is violated.
    No mutation
      Mutations generate differences between generations and upset the H-W prediction
    No migration
      If populations differ in their genetic composition (maybe A is 90% of the genes at a locus in one population and only 10% in another population), migration between the populations can change their genetic composition
    Random Mating
      Assortative mating (also called Non-Random Mating)
        Positive Assortative Mating - if like mates with like (due to choice or to small population size not allowing much choice), then intermediates and heterozygotes are lost - a decrease in genetic variation
        Negative Assortative Mating - like mating with unlike will increase the proportion of heterozygous individuals and preserve genetic variation
      Inbreeding has the same effect as positive assortative mating - loss of genetic variation
        two related individuals are more likely to both have a rare allele, given that one of them does, than two individuals chosen at random from the entire population; thus rare recessive alleles are more likely to become homozygous in inbred offspring
        can (not must, but can) lead to lower viability of inbred individuals or to lower fecundity
        Heterosis - condition where the heterozygous individuals show greater fitness (viability, fecundity) than do individuals homozygous for either of the alleles
        Inbreeding Depression - loss of fitness due to inbreeding as more and more recessive, less fit alleles are expressed due to inbreeding
        more likely in small populations
        often there are physical or behavioral barriers to inbreeding
    Large Populations
      Genetic Drift - loss of genetic variation due to chance events
        more likely in small populations than in large
      Neighborhoods can enhance the effect of drift
        if populations are subdivided into small neighborhoods, then drift will be more important for the entire population
      Bottleneck - a sudden low point in population numbers, followed by expansion of the population
        Bottlenecks can reduce genetic variation in a generation through genetic drift, even though population numbers are generally high
        If you come along when the population has recovered its large size, you would think that genetic drift was not important in that population, but a recent bottleneck event might have greatly reduced genetic variation in your study population.
      Founder Effect
        if new populations are formed by the migration of just a very few individuals, the population can be said to have gone through a bottleneck at its founding
        founder effect can mean that new populations are different from parent populations through chance alone
    No Selection
      Natural selection is the outcome of fitness differences between individuals
      Natural selection requires that there is heritable genetic variation in a population
        if some of those genetic variants are more fit (better able to survive and reproduce) than others, the fit genetic variants will leave more offspring that also have their "fit" genotypes
        as time goes on, more of the population is descended from the more fit individuals
      An example - Peppered Moth
        melanic forms are favored when trees are darkened, the light form when trees are lighter
        the selective factor is mortality due to bird predation
        the melanic gene has other effects, but none are strong enough to explain the population changes seen in England
        in the US, the melanic form has declined even though trees are not becoming lichen covered, so NS by bird predation may not work for all cases of
Industrial Melanism
      The prevalence of resistance to herbicides, insecticides, rat poisons, and antibiotics is also an example of natural selection
      Natural Selection can enhance, reduce, or maintain variability
        Natural selection can, under the right conditions, favor polymorphism (two or more alleles or phenotypes in a population)
        can result in a Balanced Polymorphism if each phenotype has an environment in which it is the most fit form
          Shell banding in Cepaea (a large land snail) varies with the background and can hide the snail from bird predation
          populations are made up of different forms, each form with an environment in which it is the fittest
          natural selection favors more than one phenotype within a single population here
      Natural selection can have different effects on a population, which we have divided into three "modes of selection."
        Disruptive (Diversifying) - when the extremes are fittest and intermediates are less fit
          Can split a population into two phenotypes with few intermediate forms
        Stabilizing - when the fittest individuals are the average, those with more extreme (larger and smaller) phenotypes are less fit and NS will act to reduce the number of individuals with extreme phenotypes
        Directional - when a new, fitter type originates, the population will move from the older type to the newer type over time
      Natural Selection produces Adaptations
        Adaptations are those characteristics of organisms that allow one organism to be more fit than another
        Populations adapt to environments as natural selection increases the proportion of individuals that have the most fit adaptation
        All three modes of selection (disruptive, directional, and stabilizing) will produce adaptation (in the case of disruptive, more than one adaptation).

Variation Within Populations
  Are all differences among individuals in a population heritable genetic variation?
    Phenotypic Variation (VP) = Environmental (VE) + Genetic (VG)
    Genes may have different effects when in different environments
      many genes are expressed differently when temperature differs
      expression of many genes depends on the genetic environment - what alleles are present at other loci - dominance is a good example of this effect
    Therefore, we must add a term for gene-by-environment interactions (VGxE)
    Phenotypic Variation (VP) = Environmental (VE) + Genetic (VG) + Interaction (VGxE)
  Heritability
    Proportion of phenotypic variation that is due to genetic variation
      h^2 = VG / (VG + VE)
    Note that the interaction term is not used and, if significant, makes heritability harder to measure and discuss
    Often estimated through the slope of the line describing the relationship between the measure of a character in offspring versus the mean of the parents (Midparent Mean)
  Reaction Norms
    A reaction norm is the change in phenotype produced by a single genotype in different environments
    This is a way to quantify Gene x Environment interaction
      often a scattergram with the phenotypic measure as the y-axis and the different environments (or a range if the differences are continuous, like temperature differences) on the x-axis
      each genotype gets its own line, and interactions are revealed when lines are not parallel

Variation Between Populations
  We have already discussed the geographic relationship among populations (allopatry, peripatry, parapatry - no sympatry for populations of the same species!!) when discussing speciation
  Subspecies = Geographic Race
  Clines form between extremes of populations or between parapatric populations
  Adaptive Geographic Variation and Gene Flow
    1. AGV adapts a local population to its specific, local habitat
    2.
Gene Flow counteracts AGV by homogenizing gene frequencies in a population or between local populations
  Countergradient Variation
    a plant found in both harsh and benign environments grows slowly in the harsh environment and quickly in the benign environment
    experiment - grow seeds from both populations in the benign environment
      seeds from the population in the harsh environment grow faster than seeds from the population in the benign environment
    Environmental variation causes a gradient in growth rate
    Genetic variation produces a counter-gradient in growth rates due to natural selection for faster growing plants in the harsh environment
    But, since the environmental effect is larger, one observes that plants grow more slowly in the harsh environment (the difference would be even greater without the genetic countergradient)
  Character Displacement
    Variation among populations of a species as a result of some populations being sympatric with a related competitor species (or within populations in which gene flow is limited by distance and part of the population is sympatric and part is not)
    Character is displaced (= altered) by the effect of competition with the related species for a resource, not by a change in overall resource availability (see book for examples)
  F-statistics
    Variation among individuals in a species can be subdivided into within-population and between-population components
    FST is a measure of the proportion of variation among individuals at a locus due to differences between populations, and it ranges from 0 (no difference in allele frequencies) to 1 (different alleles fixed in each population)
    There are several ways to calculate and/or estimate this and we will examine one here based on a locus with two alleles only (in all populations)
    To calculate this, it is necessary to know the frequency of the alleles in each population, from which you can calculate the mean (q-bar) and variance
in q (VAR):
      FST = VAR / [q-bar x (1 - q-bar)]
    This equation will be 0 if there is no variation among populations (numerator = 0) and 1 when that variation is as large as the product of the average frequencies of the two alleles (1 - q-bar is the average frequency of the other allele when only two are present)

Last updated March 1, 2011
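The F-statistics description above can be tried out with a short sketch. It assumes the equation is the standard two-allele form FST = VAR / [q-bar x (1 - q-bar)], which matches the description of the numerator and denominator in the notes, and it assumes VAR is the population variance (dividing by the number of populations), since that is what lets FST reach exactly 1. The function name `fst_two_alleles` and the example frequencies are hypothetical.

```python
from statistics import fmean, pvariance


def fst_two_alleles(q_values):
    """FST at a two-allele locus from one allele's frequency q in each population."""
    q_bar = fmean(q_values)              # mean allele frequency across populations
    denominator = q_bar * (1.0 - q_bar)  # product of the two average allele frequencies
    if denominator == 0.0:
        # The same allele is fixed in every population: no variation to partition.
        return 0.0
    # Population variance of q among populations (divides by the number of populations).
    return pvariance(q_values) / denominator


if __name__ == "__main__":
    print(fst_two_alleles([0.5, 0.5, 0.5]))  # identical populations
    print(fst_two_alleles([0.2, 0.8]))       # partly differentiated populations
    print(fst_two_alleles([0.0, 1.0]))       # different alleles fixed in each population
```

With identical frequencies the numerator is 0, and with alternative alleles fixed in two populations the variance equals q-bar x (1 - q-bar), so the sketch reproduces the two limits stated in the notes.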