Nancy butterfly

Nancy Schoeppner
Genetics 303
11/4/11
The Application of Genetic Techniques in Wild Butterfly Populations inspired by
Functional Genomics of Life History Variation in a Butterfly Metapopulation
The Glanville fritillary butterfly, Melitaea cinxia (Nymphalidae), has been the subject of an
extensive long term study to understand how population dynamics function on the landscape scale. The
Glanville fritillary butterfly is found on the Aland Islands, which are off the coast of Finland. These
islands contain approximately 4000 suitable habitats that can support butterfly populations. The
Glanville fritillary lays its eggs on two specific host plants, Plantago lanceolata and Veronica spicata,
which the larval caterpillars eat to grow and metamorphose into the adult butterflies. These host plants
are limited to thin soil habitats bordering granite outcrops, which in turn limit the distribution of the
butterflies (Klepsatel and Flatt 2011). Ilkka Hanski and his colleagues have been surveying the 4000
habitat sites since 1993, recording which sites had a butterfly population and the size of that population.
The results of this long-term survey have shown that only a fraction of the 4000 suitable habitats
contain butterflies at any time, making a mosaic of populations across the landscape (Metapopulation
Research website). As sites become overpopulated and competition increases, butterflies migrate from
inhabited sites to colonize new populations in uninhabited sites. This process of stochastic extinction
and recolonization on the landscape scale is referred to as metapopulation dynamics. Given the process
of founding populations followed by population growth, metapopulation dynamics provides a source of
stabilizing selection where butterflies that are good dispersers (high metabolism, high endurance, early
egg production, high contribution to egg production) are successful in founding “new” populations and
butterflies that are good competitors (high efficiency of resource use, later reproduction, longer
reproduction) are more successful in existing “old” populations (Wheat et al. 2011). While this
interpretation of how selection is operating in a natural population makes good logical sense, it has not
been verified using studies of DNA sequence variation or gene expression in both “new” (recently
colonized) and “old” (existing) populations. The major stumbling block that has prevented this type of
research in most natural populations is the lack of a sequenced genome and microarrays to look for
patterns of variation in gene expression within and among populations. Until recently, genome
sequences have not been available for many non-model organisms. Genome sequences have been
available for Drosophila, C. elegans, and Arabidopsis but these model organisms have been used more
often for basic research questions about the role of genes, and less so for studies of evolution in natural
populations. The goal of this paper is to describe how a group of researchers has developed an
extensive understanding of the interplay between genetics, evolution, and ecology using the Glanville
fritillary butterfly and a related butterfly species. I begin by describing how variation in the sequence
and structure of phosphoglucose isomerase (Pgi) was linked to differences in the fitness of Colias
eurytheme, a butterfly in the family Pieridae, which is a sister taxon to the Nymphalidae. Next I
discuss how a description of the transcriptome for M. cinxia was developed using a new sequencing
technique (454 pyrosequencing), and then how that transcriptome was used to detect balancing
selection acting on Pgi in the M. cinxia populations living on the Aland Islands.
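The extinction and recolonization process described above can be illustrated with a minimal simulation sketch. It is not taken from any of the cited studies. The function name, the colonization and extinction probabilities, and the assumption that every empty habitat is equally reachable are all hypothetical simplifications, in the spirit of a simple patch occupancy model.

```python
import random


def simulate_patch_occupancy(n_patches=4000, n_years=20, colonization=0.3,
                             extinction=0.2, initial_fraction=0.1, seed=1):
    """Return the number of occupied habitat patches in each year.

    All parameter values are illustrative assumptions, not estimates
    from the Aland Islands survey.
    """
    rng = random.Random(seed)
    occupied = [rng.random() < initial_fraction for _ in range(n_patches)]
    history = [sum(occupied)]
    for _ in range(n_years):
        fraction = sum(occupied) / n_patches
        next_state = []
        for is_occupied in occupied:
            if is_occupied:
                # An existing ("old") population survives unless it goes extinct.
                next_state.append(rng.random() >= extinction)
            else:
                # An empty patch is colonized more often when more patches
                # are occupied, because there are more potential migrants.
                next_state.append(rng.random() < colonization * fraction)
        occupied = next_state
        history.append(sum(occupied))
    return history


if __name__ == "__main__":
    for year, count in enumerate(simulate_patch_occupancy()):
        print(year, count)
```

In this toy model only a fraction of the patches is occupied at any time, and which patches are occupied keeps changing, which is the mosaic pattern that the survey describes.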
Phosphoglucose isomerase (Pgi) is a gene that codes for the enzyme (PGI) that catalyzes the
second reaction in glycolysis where glucose 6-phosphate is converted into fructose 6-phosphate.
Glycolysis is needed to convert glucose into pyruvate, which is then used by the mitochondria to
produce ATP. PGI is a protein dimer that is formed through the interaction of two polypeptide chains.
The polypeptide chains interact to form the active site where glucose 6-phosphate attaches to the
enzyme to be converted into fructose 6-phosphate. One particular amino acid residue, His 392, has
been identified as being particularly important to the enzyme’s function. The histidine at position 392
on each polypeptide chain is positioned such that it is involved in forming the catalytic site of the other
polypeptide chain. This interaction between the two polypeptide chains in forming the catalytic site is
important because multiple alleles for Pgi have been identified in butterfly populations (Wheat et al.
2006). This means that the performance of the enzyme in individuals with different genotypes is likely
to be different because of the way that the two polypeptide chains fit together. Some combinations of
alleles will produce polypeptide chains that fit together better than others, which will in turn make some
enzymes more effective than others. The organisms that contain the more effective enzymes should
also have more efficient ATP production which should translate into fitness effects.
The sequence of PGI is highly conserved among other taxa. For example, butterfly PGI shows
69% sequence identity to both rabbit and human PGI sequences (Wheat et al. 2006). Because of this
similarity, the shape of the PGI, which has been determined for rabbit and human PGI, can be used to
predict the 3-D shape of the butterfly PGI. The shape of the proteins produced by these different alleles
can then be analyzed to determine why some versions of the enzyme are more efficient than others.
This analysis will provide information about which mutations in PGI produce important differences in
performance among different butterfly genotypes.
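A sequence identity figure such as the 69% quoted above is the percentage of aligned positions at which two sequences carry the same residue. The sketch below shows one common way to compute it, assuming the two sequences have already been aligned and that gaps are written as "-". The function name and the short peptides in the usage note are hypothetical and are not real PGI sequences.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of aligned, non-gap columns at which both sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(seq_a, seq_b):
        # Columns with a gap in either sequence are skipped (an assumption;
        # some tools count them as mismatches instead).
        if residue_a == gap or residue_b == gap:
            continue
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

For example, the made-up peptides "MKTAY" and "MKSAY" agree at four of five positions, so `percent_identity("MKTAY", "MKSAY")` returns 80.0.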
Wheat et al. (2006) used DNA sequencing to identify 4 different Pgi alleles in a population of C.
eurytheme that was collected in Tracy, California. The average numbers of synonymous differences
(nucleotide changes that do not change the amino acid sequence) and nonsynonymous differences
(nucleotide changes that do alter the amino acid sequence) were calculated for the alleles. The average
number of synonymous differences found between pairs of the four alleles ranged from 29.2 to 41.4,
while the average number of nonsynonymous differences ranged from 2.8 to 5.0. The amount of variation
that was observed in this single population
of butterflies was much higher than the variation that has been previously observed in Pgi in Drosophila
melanogaster and in other genes in C. eurytheme (Wheat et al. 2006). The authors contend that this
high level of polymorphism in Pgi is maintained by balancing selection via heterozygote advantage.
Because PGI is a dimer, two polypeptide chains must interact to form the functional enzyme. In a
heterozygote the two polypeptide chains are not identical, while in a homozygote the two polypeptide
chains are the same. In previous studies using another butterfly species, the enzymes produced by
heterozygotes were found to perform better than the enzymes produced by homozygotes (Watt 1983,
Watt et al. 1996). The authors connect this observation to their findings by suggesting that the dimers
formed by the heterozygotes suffer from fewer conformational constraints, which allows them to form
more effective active sites. They suggest that the homozygotes would be limited in the way that the
two chains could fit together because of the symmetry of the chains. The authors support this
contention using the results from the analysis of how the nucleotide substitutions that they found
altered the 3-D shape of the protein produced by the different alleles. The authors found that the
nonsynonymous nucleotide substitutions only occurred in regions of the gene that coded for amino
acids found on the exterior region of the enzyme. They also found that a few substitutions in particular
altered the shape of the active site of the enzyme. These substitutions typically involved a change in
charge of the amino acid, which changed the way that the amino acid interacted with its environment
and the other amino acids in the area.
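The distinction between synonymous and nonsynonymous differences used in this study can be made concrete with a short sketch. It compares two aligned coding sequences codon by codon and uses the standard genetic code. This is a deliberately simplified count, not the procedure used by Wheat et al. (2006): it scores each differing codon once and ignores the order in which multiple changes within one codon might have occurred. The function name and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_differences(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons."""
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be in-frame and of equal length")
    synonymous = 0
    nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1      # same amino acid, different nucleotides
        else:
            nonsynonymous += 1   # the encoded amino acid changes
    return synonymous, nonsynonymous
```

As a usage example, comparing the made-up sequences "ATGCATGAA" and "ATGCACAAA" gives one synonymous difference (CAT and CAC both encode histidine) and one nonsynonymous difference (GAA encodes glutamate, AAA encodes lysine). The second change also switches the charge of the residue, which is the kind of substitution the authors highlight.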
The synonymous nucleotide substitutions were then analyzed to determine if selection was
maintaining the variation in the gene using a test called Tajima’s D. The Tajima’s D test is used to
determine if variation observed among a sample of DNA sequences is due to random (genetic drift) or
nonrandom (selection, population bottlenecks, population expansions) processes. The Tajima’s D test is
based on a model which uses the number of segregating sites and the average number of nucleotide
differences between pairs of sequences to compute a test value. If the DNA sequence of interest is
experiencing neutral drift then the test value is expected to be zero. A negative test value is interpreted
to indicate a recent population bottleneck followed by population growth. A positive test value is
interpreted to indicate a decrease in population size or balancing selection. In this study the authors
calculated overall Tajima’s D values using the synonymous nucleotide changes in the four alleles, and
then performed a sliding window analysis where they calculated Tajima’s D values for small sections of
the DNA sequence. In this analysis a Tajima’s D value is calculated for a short window of
nucleotides within Pgi, and the window is then shifted along the DNA sequence and
a new value is calculated. They performed this sliding window test for two exons of Pgi (exons 7 and 9) to
look for extreme changes in the Tajima’s D value. Overall, the Tajima’s D value for the Pgi gene was
found to be negative, which is not consistent with balancing selection acting to maintain variation in this
gene. However, areas of high positive values of Tajima’s D were found in exons 7 and 9 in the sliding
window analysis. This supports the idea that balancing selection is acting on those sections of the
gene. Taken together, these results were interpreted by the authors to indicate that the different alleles
observed did in fact produce versions of PGI that had different 3-D shapes that varied in function, with
the heterozygotes having a likely performance advantage that was responsible for maintaining the allelic
variation in the population observed in exons 7 and 9.
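The Tajima’s D calculation and the sliding window procedure described above can be sketched as follows. The code uses the standard constants from Tajima's 1989 formulation, with the number of segregating sites and the average number of pairwise differences as inputs. It is only an illustration. The function names, the window and step sizes, and the toy sequences are assumptions, and the code uses every site in the alignment, whereas the authors restricted their analysis to synonymous changes.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for aligned, equal-length sequences (None if undefined)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("need at least four sequences")
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("sequences must have equal length")
    # S: alignment columns showing more than one nucleotide state.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        return None
    # pi: average number of differences over all pairs of sequences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    if variance <= 0:
        return None
    return (pi - seg_sites / a1) / sqrt(variance)


def sliding_window_d(sequences, window, step):
    """Return (start, D) for each window moved along the alignment."""
    length = len(sequences[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [seq[start:start + window] for seq in sequences]
        results.append((start, tajimas_d(chunk)))
    return results


if __name__ == "__main__":
    # Toy data: two deeply diverged groups of sequences at equal frequency.
    balanced = ["AAAA", "AAAA", "TTTT", "TTTT"]
    # Toy data: every variant is carried by a single sequence.
    singletons = ["AAAA", "TAAA", "ATAA", "AATA"]
    print(tajimas_d(balanced), tajimas_d(singletons))
    print(sliding_window_d(balanced, window=2, step=2))
```

In the toy data, variants held at intermediate frequency give a positive D, the direction expected under balancing selection, while an excess of rare variants gives a negative D.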
In a 2010 study Wheat found a similar result looking at polymorphisms in Pgi in a different
butterfly species. The 2010 study used the Glanville fritillary, Melitaea cinxia, as the study organism.
The change in model organisms was likely driven by the fact that the previous study was completed as
part of the author’s Ph.D. research, while the studies that I describe in the remainder of the paper were
completed working in a different lab as a post-doc. The switch of species was likely not
just a matter of changing jobs and universities, but also a strategically planned choice. The ecology
of the Glanville fritillary has been studied intensively since 1993, making this species an ideal candidate
for studies that bridge the gap between ecological and evolutionary patterns and genetics. It was
already known that different allozymes (forms of the enzyme) were present in the population and that
individuals with the different allozymes exhibited differences in flight metabolic rate, female fecundity,
and lifespan (Wheat et al. 2010). Wheat et al. (2010) analyzed the coding region of Pgi in 22 adult
butterflies collected in 2004 from 15 different populations from across the Aland Islands. Coding
sequences were amplified from cDNA and haplotypes were identified. The authors constructed a
neighbor network of the Pgi haplotypes that were found to try to answer questions about the
appearance of the genetic variation in the butterfly populations. They found 16 unique haplotypes in
samples taken from 22 individuals. The different allozymes of PGI corresponded to distinct haplotype
clades with the majority of the nucleotide differences occurring between rather than within the clades
(Wheat et al. 2010). They also performed an analysis to determine how long ago the different haplotype
groups shared a common ancestor. To do this they used a computer program called BEAST. Overall, the
authors found a higher amount of polymorphism in Pgi than would be predicted by neutral theory. This
is consistent with the hypothesis that balancing selection is acting in this population to maintain the
variation. The authors also went through extensive tests to rule out other possible mechanisms related
to changes in population size that may also account for the high polymorphism. They performed
analyses designed to model different scenarios of population change and they found that some
scenarios could produce the observed patterns of polymorphisms, but that it was unlikely that changes
in population size could account for the observed results. In addition when the results from this study
were compared to the results obtained in the study in C. eurytheme, a similar pattern of variation was
detected, but there were enough differences to suggest that the balancing selection observed in both
populations was the result of convergent evolution. This result led the authors to conclude that the
polymorphism that they observed in PGI for both species of butterfly is due to long-term balancing
selection that occurs in many butterfly populations, and it is not strictly due to the population structure
of either butterfly population.
While the previous studies illustrated the power of using genetic techniques to help explain
patterns of evolution observed in natural populations, Pgi is not likely acting in isolation in the
population. To better understand the evolutionary and ecological implications of genetic differences in
populations, scientists would like to be able to look for differences in gene expression between
populations. One tool that is needed to do this is a microarray for the species of interest. Given the cost
and time needed to make microarrays, they have not generally been available for nonmodel species like the
Glanville fritillary. New sequencing techniques have recently been developed that are cheaper and
faster. Vera et al. (2008) describe how they used 454 pyrosequencing to produce a transcriptome for M.
cinxia. To do this, they started with RNA that was isolated from 80 larval individuals collected from a
diverse range of populations across the main Aland island. The authors then made a cDNA library from
the RNA pool and sequenced it using both the 454 pyrosequencing technique and traditional Sanger sequencing. The
authors found that the 454 pyrosequencing was able to produce results that could be used to construct
a microarray that showed both high repeatability and the ability to detect biological differences among
individuals (Vera et al. 2008). An unexpected additional result was that the authors also detected
non-butterfly cDNA in their analysis. This occurred because the RNA that was used to make the cDNA library
came from whole insects. This whole-insect RNA was screened for polyadenylation to eliminate RNA
from many microorganisms, but some sequences from microsporidia were sequenced as well. This was an
unexpected result because these butterflies were not known to carry microsporidia. The authors
suggest that this technique can also be used to detect parasites (xenobiotics) in host species.
The final paper that I will discuss builds on the previous work to apply genomics to questions
about ecology and evolution. The authors use a functional genomics approach to identify genes that are
thought to be involved in regulating differences in life history characteristics between butterflies that
are in established old populations and butterflies that are in newly colonized populations. This focus on
old and new populations is based on the idea that the selection pressures in these two populations are
different. Butterflies in the old populations likely face higher competition while butterflies that colonize
new habitats would need to fly long distances and reproduce before they die. These different selection
pressures should maintain both types of butterfly in the population. To test this idea using a genomic
approach, Wheat et al. (2011) used the microarray that they developed using the 454 pyrosequencing
technique to look at gene expression in old and new populations. They found that the females from
new populations showed a higher expression of genes involved in egg provisioning (including larval
serum proteins and amino acid transporters) and production. They also found that new population
females had a higher level of juvenile hormone, which is important in triggering sexual maturation, and
they had a larger number of mature eggs. Together these results indicate that females from new
populations are able to reproduce faster and at an earlier age. In addition to the differences in gene
expression related to egg production, the authors also found differences related to flight performance.
There was a higher expression of proteins related to increased protein turnover in response to damage
caused by intense flight. The authors also found a difference in the frequencies of two alleles that are
important to metabolism. Individuals in the new population had a higher frequency of both a Pgi allele
and an Sdhd (succinate dehydrogenase) allele that are linked to more efficient ATP production.
Butterflies that had certain Pgi and Sdhd alleles had better metabolic capabilities compared to other
butterflies and were better able to fly long distances. Overall, the importance of these results is that
gene expression analysis was able to help uncover some mechanistic explanation of ecological trade-offs
(i.e., between competitive ability and dispersal) and proposed evolutionary patterns (disruptive
selection) that are thought to be at work in natural populations.
Literature Cited
Klepsatel, P., & Flatt, T. The genomic and physiological basis of life history variation in a butterfly
metapopulation. Molecular Ecology. 20, 1795-1798 (2011).
Metapopulation Research Group website retrieved from
http://www.helsinki.fi/science/metapop/Research/Project_metapop.htm
Wheat, C.W., Fescemyer, H.W., Kvist, J., Tas, E., Vera, J.C., Frilander, M.J., Hanski, I. & Marden, J.H.
Functional genomics of life history variation in a butterfly metapopulation. Molecular Ecology. 20,
1813-1828 (2011).
Wheat, C.W., Haag, C.R., Marden, J.H., Hanski, I., & Frilander, M.J. Nucleotide polymorphism at a gene
(Pgi) under balancing selection in a butterfly metapopulation. Molecular Biology and Evolution. 27,
267-281 (2010).
Wheat, C.W., Watt, W.B., Pollock, D.D., & Schulte, P.M. From DNA to fitness differences: sequence and
structures of adaptive variants of Colias phosphoglucose isomerase (PGI). Molecular Biology and
Evolution. 23, 499-512 (2006).
Vera, J.C., Wheat, C.W., Fescemyer, H.W., Frilander, M.J., Crawford, D.L., Hanski, I. & Marden, J.H.
Rapid transcriptome characterization for a nonmodel organism using 454 pyrosequencing.
Molecular Ecology. 17, 1636-1647 (2008).
Watt, W.B. Adaptation at specific loci. II. Demographic and biochemical elements in the maintenance of
the Colias PGI polymorphism. Genetics. 103, 691-724 (1983).
Watt, W.B., Donohue, K., & Carter, P.A. Adaptation at specific loci. VI. Divergence vs. parallelism of
polymorphic allozymes in molecular function and fitness-component effects among Colias species.
Molecular Biology and Evolution. 13, 699-709 (1996).
Wikipedia. Tajima’s D. Retrieved from http://en.wikipedia.org/wiki/Tajima%27s_D