Convergent Evolution of PFKP Gene

Robert Dougherty
Analysis of the Evolution of the PFKP gene in the catarrhine and
platyrrhine primate lineages
17 April 2013
Introduction
Molecular convergent evolution is a fascinating biological phenomenon because it
suggests that evolution can proceed in a predictable fashion at the highest level of specificity.
Understanding how predictably organisms adapt to novel environments has implications for
ecology: how might certain organisms evolve to cope with global warming? It also has
implications for astrobiology: how might life evolve on a different planet?
Though it may not hold answers about extraterrestrial life, convergent evolution in
primates is interesting to study because of their close relationship to Homo sapiens.
Episodic Adaptive Evolution of Primate Lysozyme [1] analyzed the convergent evolution of
foregut lysozyme production in both colobine primates and ruminants. Convergent evolution
between two particular groups in the order Primates, new world monkeys and old world
monkeys, is intriguing because it raises the question of why hominids branched from the old
world monkey lineage while nothing close to hominids ever diverged from the new world
monkey lineage. The convergent evolution of certain characteristics between new world
monkeys and old world monkeys has already been studied, for example the major
histocompatibility complex in the paper Convergent Evolution of Major Histocompatibility
Complex Molecules in Humans and New World Monkeys [2]. In that paper, genetic distances
were calculated by the Kimura method and phylogenetic trees were drawn by the
neighbor-joining method in MEGA as part of the tests of convergence.
The exercise in molecular evolutionary analysis presented in this paper attempts to
assess the extent of convergent evolution in the PFKP gene between primates of the new world
monkey lineage and the old world monkey lineage. The PFKP gene codes for the PFKP enzyme
(phosphofructokinase, platelet type), which catalyzes the irreversible conversion of fructose
6-phosphate to fructose 1,6-bisphosphate. This is a major step in glycolysis, and therefore in the
metabolism of the sugars obtained from fruit. Some primates in both the new world and old
world lineages rely heavily on fruit in their diet, but not all primates in these lineages do.
The hypothesis of this exercise is that a phylogenetic tree of primates constructed based
on relatedness of homologous PFKP genes will not reflect the primates’ actual lineage, but
will rather group primates together based on their dependence on fruit in their diet. Convergent
evolution observed between new world and old world monkeys in this analysis would be
especially significant since it is believed that the two groups diverged from each other 40
million years ago when some monkeys from Africa made it to South America. Testing this
hypothesis should give a greater understanding of the strength of convergent evolution of
dietary adaptations in primates. The knowledge gained by answering this question may also
further the understanding of the divergence of hominids from the old world monkey lineage.
Picture 1: Primate Phylogeny [3]
Description of Data
The gene sequences were obtained from the National Center for Biotechnology
Information (NCBI) website. Sequences were obtained for eight primate species: six from the
old world lineage and two from the new world lineage. The six primates from the old world
lineage (catarrhine) were humans (Homo sapiens), common chimpanzees (Pan troglodytes),
bonobos (Pan paniscus), Sumatran orangutans (Pongo abelii), gibbons (Nomascus leucogenys)
and olive baboons (Papio anubis). The two primates from the new world lineage (platyrrhine)
were marmosets (Callithrix jacchus) and South American squirrel monkeys (Saimiri
boliviensis).
The first attempt to get data used the entire nucleotide sequences of the PFKP genes of
the eight primates, but they proved too large for the available alignment software to handle.
After trial and error with some introns and exons of the gene, it was finally decided that the
amino acid sequences of the encoded protein would be used to test the hypothesis, since these
data were equally usable for all primate species.
Table 1: Description of Gene Sequences
Species: Scientific Name (Common Name)                GenBank Accession Number
Homo sapiens (human)                                  NC_000010
Pan troglodytes (common chimpanzee)                   NC_006477
Pan paniscus (bonobo)                                 NW_003870569
Pongo abelii (Sumatran orangutan)                     NC_012601
Nomascus leucogenys (gibbon)                          NC_019824
Papio anubis (olive baboon)                           NC_018160
Callithrix jacchus (marmoset)                         NC_013902
Saimiri boliviensis (South American squirrel monkey)  NW_003943713
Description of Data Analysis
The amino acid sequences of the eight homologous protein-coding genes were copied
from the NCBI website and pasted into a text document. The compiled text file was then pasted
into a ClustalW alignment generator. The resulting alignment file was converted into a
MEGA-supported file by the MEGA5 program, after which data analysis could be done. The first
analysis was the calculation of pairwise distances between the amino acid sequences using the
Poisson method, to get a general idea of the relatedness of the genes. Then a neighbor-joining
tree using the Poisson method was made with 1000 bootstrap replicates to see how these genes
were related to each other from an evolutionary point of view. Finally, a maximum parsimony
tree was generated with 1000 bootstrap replicates for the same reason as the neighbor-joining
tree and also to have another phylogenetic tree for comparison.
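As an illustration of the neighbor-joining step described above, here is a minimal sketch of the standard algorithm in Python. It is not MEGA5's implementation; the function name, the return format and the five-taxon distance matrix are all hypothetical and chosen only to show the mechanics (pick the pair minimizing Q, join it, recompute distances to the new node).

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Run neighbor joining and return (joins, last_edge).

    names: list of at least two taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    joins: list of (node_i, node_j, length_i, length_j) in the order joined.
    last_edge: (node_a, node_b, length) connecting the final two nodes.
    """
    nodes = list(names)
    d = dict(dist)  # work on a copy so the caller's matrix is untouched
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence r_i: summed distance from node i to every other node.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}

        def q(pair):
            # Q(i, j) = (n - 2) * d(i, j) - r_i - r_j; the smallest Q is joined.
            i, j = pair
            return (n - 2) * d[frozenset(pair)] - r[i] - r[j]

        i, j = min(combinations(nodes, 2), key=q)
        dij = d[frozenset((i, j))]
        # Branch lengths from i and j to the new internal node.
        length_i = 0.5 * dij + (r[i] - r[j]) / (2.0 * (n - 2))
        length_j = dij - length_i
        new = f"({i},{j})"  # label the internal node by what it joins
        joins.append((i, j, length_i, length_j))
        # Distance from the new node to each remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                )
        nodes = [k for k in nodes if k not in (i, j)] + [new]
    a, b = nodes
    return joins, (a, b, d[frozenset((a, b))])


# Hypothetical toy distances, not values from this study.
names = ["a", "b", "c", "d", "e"]
raw = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
dist = {frozenset(pair): float(value) for pair, value in raw.items()}
joins, last_edge = neighbor_joining(names, dist)
for i, j, li, lj in joins:
    print(f"join {i} (branch {li:.3f}) with {j} (branch {lj:.3f})")
print("final edge:", last_edge)
```

The same function could be fed a dictionary built from a pairwise distance table such as Table 2, though the resulting tree would carry no bootstrap support values.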
Results of Data Analysis
Table 2: Estimates of Evolutionary Divergence Between Sequences (Poisson)
                          (1)     (2)     (3)     (4)     (5)     (6)     (7)
HomoSapiens (1)
PongoAbelli (2)           0.005
PanPaniscus (3)           0.003   0.005
NamascusLeucogenys (4)    0.026   0.029   0.026
PapioAnubis (5)           0.009   0.012   0.009   0.033
CallithrixJacchus (6)     0.035   0.038   0.035   0.057   0.038
SaimiriBoliviensis (7)    0.023   0.026   0.023   0.048   0.026   0.025
PanTroglodyte (8)         0.096   0.097   0.093   0.119   0.102   0.125   0.113
From MEGA5: “The number of amino acid substitutions per site from between
sequences are shown. Analyses were conducted using the Poisson correction model. The
analysis involved 8 amino acid sequences. All positions containing gaps and missing data were
eliminated. There were a total of 776 positions in the final dataset.”
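The Poisson correction quoted above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not MEGA5's code: it assumes gaps and missing data are written as "-" or "?", removes every alignment column containing one (complete deletion), and then converts the proportion p of differing sites into d = -ln(1 - p). The function names and the toy sequences are hypothetical.

```python
import math


def complete_deletion(seqs, missing="-?"):
    """Drop every alignment column that has a gap or missing symbol in any sequence."""
    names = list(seqs)
    if len({len(seqs[n]) for n in names}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    kept = [
        col for col in zip(*(seqs[n] for n in names))
        if not any(ch in missing for ch in col)
    ]
    return {n: "".join(col[i] for col in kept) for i, n in enumerate(names)}


def poisson_distance(a, b):
    """Poisson-corrected number of amino acid substitutions per site."""
    if not a or len(a) != len(b):
        raise ValueError("need two non-empty sequences of equal length")
    p = sum(x != y for x, y in zip(a, b)) / len(a)  # proportion of differing sites
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p) if p > 0 else 0.0


# Hypothetical toy alignment, not sequences from this study.
toy = {
    "taxonA": "MKTAYIAKQR-",
    "taxonB": "MKTAYIAKQRL",
    "taxonC": "MKTAYLAKQRL",
}
clean = complete_deletion(toy)
d_ab = poisson_distance(clean["taxonA"], clean["taxonB"])
d_ac = poisson_distance(clean["taxonA"], clean["taxonC"])
print(f"A-B: {d_ab:.4f}  A-C: {d_ac:.4f}")
```

For small p the correction changes little, which is why distances of a few thousandths in a table like this are close to the raw proportion of differing sites.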
The pairwise distance table shows a close four-way relationship between Homo sapiens
(humans), Pongo abelii (Sumatran orangutans), Pan paniscus (bonobos) and Papio anubis (olive
baboons), whose distances from one another are all 0.012 or less.
Tree 1: Maximum Parsimony Tree [3]
The above tree is a maximum parsimony tree generated for the gene data. The only
node with strong support is the node connecting the new world monkey branches.
Statistically robust nodes are nodes that have a high bootstrap number next to them. The
number indicates the percentage of the bootstrap replicates generated that included that
specific node.
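To make the bootstrap percentage concrete, the sketch below resamples alignment columns with replacement and reports how often a grouping reappears. It is a simplified illustration, not the MEGA5 procedure: instead of rebuilding a full tree for each replicate, it uses a stand-in criterion (two taxa being each other's nearest neighbor by raw proportion of differences). All function names and the toy sequences are hypothetical.

```python
import random


def bootstrap_replicate(seqs, rng):
    """Resample alignment columns with replacement to build one pseudo-alignment."""
    length = len(next(iter(seqs.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(s[c] for c in cols) for name, s in seqs.items()}


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def are_mutual_nearest(seqs, a, b):
    """Stand-in for 'a and b form a node': each is the other's closest sequence."""
    def nearest(x):
        others = [n for n in seqs if n != x]
        return min(others, key=lambda n: p_distance(seqs[x], seqs[n]))
    return nearest(a) == b and nearest(b) == a


def bootstrap_support(seqs, a, b, n_reps=1000, seed=0):
    """Percentage of bootstrap replicates in which a and b group together."""
    rng = random.Random(seed)
    hits = sum(
        are_mutual_nearest(bootstrap_replicate(seqs, rng), a, b)
        for _ in range(n_reps)
    )
    return 100.0 * hits / n_reps


# Hypothetical toy alignment, not sequences from this study.
toy = {
    "taxonA": "MKTAYI",
    "taxonB": "MKTAYI",
    "taxonC": "LRSGWV",
    "taxonD": "LRSGWF",
}
support = bootstrap_support(toy, "taxonA", "taxonB", n_reps=200)
print(f"support for (taxonA, taxonB): {support:.1f}%")
```

A real analysis would replace `are_mutual_nearest` with a full tree reconstruction per replicate and then check whether the node of interest is present in each resulting tree.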
Tree 2: Neighbor-Joining Tree (Bootstrap Consensus)
Shown above is the bootstrap consensus neighbor-joining tree generated for the gene
data. This tree yields different connectivity than the maximum parsimony tree. Once again the
only robust node is the one connecting the new world monkey branches.
Tree 3: Neighbor Joining (Bootstrap Consensus) – No Chimp
The above tree is the bootstrap consensus neighbor-joining tree generated without
the chimp data. The neighbor-joining tree was run once more this way in the hope of getting
more robust data, since the gene data for the common chimp does not share a close similarity
with any of the other gene data. The nodes gain robustness, especially the one linking
humans, bonobos, baboons and gibbons.
Tree 4: Maximum Parsimony - No Chimp
This time the maximum parsimony tree was generated without including the gene data
for the common chimp. This tree is less robust than the maximum parsimony tree that
included the chimp. Once again the trees generated using the two phylogenetic methods show
different connectivity.
Table 3: Tajima’s Test Statistic, D
m    S      ps          Θ           π           D
8    119    0.153351    0.059143    0.043906    -1.404662
m = number of sequences, n = total number of sites (776), S = number of segregating sites, ps = S/n, Θ = ps/a1, π = nucleotide diversity, D = Tajima’s test statistic
The negative test statistic (D < 0) indicates an excess of low-frequency variants, which is
consistent with directional selection acting on the sequences.
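The reported D can be checked from the other quantities in Table 3 with Tajima's (1989) formula. The sketch below is a hedged reconstruction, not MEGA5's code: it assumes ps is the number of segregating sites divided by the 776 alignment positions (119/776 reproduces the reported 0.153351) and that the mean number of pairwise differences equals the per-site diversity multiplied by 776. The function and variable names are hypothetical.

```python
import math


def tajimas_d(num_seqs, seg_sites, mean_pairwise_diffs):
    """Tajima's D from sample size, segregating sites and mean pairwise differences."""
    m = num_seqs
    a1 = sum(1.0 / i for i in range(1, m))
    a2 = sum(1.0 / i ** 2 for i in range(1, m))
    b1 = (m + 1) / (3.0 * (m - 1))
    b2 = 2.0 * (m * m + m + 3) / (9.0 * m * (m - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (m + 2) / (a1 * m) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Estimated variance of the difference between the two diversity estimators.
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise_diffs - seg_sites / a1) / math.sqrt(variance)


# Values taken from Table 3 and the MEGA5 caption (776 positions).
m, S, n_sites, pi_per_site = 8, 119, 776, 0.043906
p_s = S / n_sites                        # segregating sites per site
a1 = sum(1.0 / i for i in range(1, m))
theta = p_s / a1                         # Watterson-style estimate per site
d_stat = tajimas_d(m, S, pi_per_site * n_sites)
print(f"ps = {p_s:.6f}, theta = {theta:.6f}, D = {d_stat:.4f}")
```

Working through the arithmetic this way also shows where the sign comes from: D is negative because the diversity based on pairwise differences is smaller than the diversity implied by the number of segregating sites.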
Discussion
It has already been mentioned that the table of pairwise distances showed a great deal
of similarity between humans, bonobos, Sumatran orangutans and olive baboons. It comes as a
surprise, then, that the gibbons are left out of this close group, since gibbons share a more
recent common ancestor with humans, bonobos and orangutans than baboons do. It comes as
an even greater surprise that the common chimp is excluded from this group, since chimps are
the closest relatives of humans and the other species in their genus, the bonobo, is included.
Other than evolutionary history, bonobos and orangutans are linked by their diet, which
consists mainly of fruit. Baboons don’t rely as heavily on fruit as bonobos or orangutans, but
then again neither do humans. Since the common chimp is more of a meat eater than bonobos
and orangutans, and its PFKP gene has diverged greatly from those of the aforementioned
primates, this may suggest that the direct ancestors of humans had a fruit-driven diet. It could
also suggest that the PFKP gene comes under different selective pressures when species start to
substitute meat for fruit in their diet. Tajima’s D statistic for the gene sequences is -1.404662,
which is consistent with directional selection. [4][5][6][7][8]
The two primate species included in this analysis that are from the new world monkey
lineage are the only two species that have the same linkage in all of the phylogenetic trees
generated. This is fitting from an evolutionary point of view since these two species are more
closely related to each other than to any of the other species included in this genetic analysis.
They also have a similar diet of insects and plant exudates. [9][10] One conclusion that can be
drawn from this is that the evolution of the PFKP gene has been proceeding at a constant rate
in the new world monkey lineage but not in the catarrhine lineage. This could suggest that the
PFKP gene is under greater selective pressure in the catarrhine lineage for a reason that is not
yet known.
Though some preliminary conclusions can be drawn from the data generated, there is
not enough congruent data to support or reject the hypothesis presented earlier that a
phylogenetic tree of primates constructed based on relatedness of homologous PFKP genes will
not reflect the primates’ actual lineage, but will rather group primates together based on their
dependence on fruit in their diet. Further studies of this hypothesis could examine different
genes that code for dietary enzymes or use molecular clock methods instead of phylogenetic
tree methods.
References
1) http://www.popdna.zi.ku.dk/evolbiology/courses/4/lysozym/Messier.pdf
2) Klein, Kriener, O’hUigin, Tichy (2000). “Convergent evolution of major histocompatibility
complex molecules in humans and New World monkeys”. Immunogenetics 51: 169-178.
3) http://www.ec.europa.eu
4) Ihobe H (1992). "Observations on the meat-eating behavior of wild bonobos (Pan
paniscus) at Wamba, Republic of Zaire". Primates 33 (2): 247–250.
5) http://web.archive.org/web/20080917132740/http://www.awf.org/content/wildlife/detail/baboon
6) http://www.theanimalspot.com/sumatranorangutan.htm
7) http://pin.primate.wisc.edu/factsheets/entry/bonobo
8) http://pin.primate.wisc.edu/factsheets/entry/chimpanzee
9) http://marmoset.mynumber.biz/index_files/Page741.htm
10) http://www.edu.pe.ca/southernkings/sqmonkey.htm